3 ways to write a function (unexpected performance difference and regression)
Consider the attached comparison.hs
, a small performance test for difference lists. I was somewhat surprised by the performance differences between flatListCons
and flatDList
, and I tried to reproduce the issue by writing 3 equivalent versions of flatListCons
.
flatListCons t = flat t []
where
flat (Leaf n) ns = n:ns
flat (Fork a b) ns = flat a (flat b ns)
flatListCons2 t = flat t []
where
flat (Leaf n) = \ns -> n:ns
flat (Fork a b) = \ns -> flat a (flat b ns)
flatListCons3 t = flat t []
where
flat (Leaf n) = (n:)
flat (Fork a b) = flat a . flat b
Not only does GHC not give equal performance for the 3 versions (factor of 2), but GHC-6.12.3 and GHC-7.1.20101022 differ in optimizations. With GHC-6.12.3, flatListCons
and flatListCons2
are fast, flatListCons3
is slow. With GHC-7.1.20101022, only flatListCons
is fast, flatListCons2
and flatListCons3
are slow:
$ time ./comparison-6.12.3.exe c 26
67108864
real 0m4.574s
user 0m0.015s
sys 0m0.358s
$ time ./comparison-6.12.3.exe c2 26
67108864
real 0m4.442s
user 0m0.046s
sys 0m0.312s
$ time ./comparison-6.12.3.exe c3 26
67108864
real 0m10.694s
user 0m0.046s
sys 0m0.328s
$ time ./comparison-7.1.20101022.exe c 26
67108864
real 0m4.473s
user 0m0.046s
sys 0m0.327s
$ time ./comparison-7.1.20101022.exe c2 26
67108864
real 0m10.791s
user 0m0.046s
sys 0m0.327s
$ time ./comparison-7.1.20101022.exe c3 26
67108864
real 0m10.729s
user 0m0.078s
sys 0m0.280s
Ideally, I'd like all three versions (as well as flatDList
) to be equally fast.
Trac metadata
Trac field | Value |
---|---|
Version | 7.1 |
Type | Bug |
TypeOfFailure | OtherFailure |
Priority | normal |
Resolution | Unresolved |
Component | Compiler |
Test case | |
Differential revisions | |
BlockedBy | |
Related | |
Blocking | |
CC | |
Operating system | |
Architecture |