3 ways to write a function (unexpected performance difference and regression)
Consider the attached comparison.hs, a small performance test for difference lists. I was somewhat surprised by the performance differences between flatListCons and flatDList, and I tried to reproduce the issue by writing 3 equivalent versions of flatListCons.
flatListCons t = flat t []
where
flat (Leaf n) ns = n:ns
flat (Fork a b) ns = flat a (flat b ns)
flatListCons2 t = flat t []
where
flat (Leaf n) = \ns -> n:ns
flat (Fork a b) = \ns -> flat a (flat b ns)
flatListCons3 t = flat t []
where
flat (Leaf n) = (n:)
flat (Fork a b) = flat a . flat b
Not only does GHC not give equal performance for the 3 versions (factor of 2), but GHC-6.12.3 and GHC-7.1.20101022 differ in optimizations. With GHC-6.12.3, flatListCons and flatListCons2 are fast, flatListCons3 is slow. With GHC-7.1.20101022, only flatListCons is fast, flatListCons2 and flatListCons3 are slow:
$ time ./comparison-6.12.3.exe c 26
67108864
real 0m4.574s
user 0m0.015s
sys 0m0.358s
$ time ./comparison-6.12.3.exe c2 26
67108864
real 0m4.442s
user 0m0.046s
sys 0m0.312s
$ time ./comparison-6.12.3.exe c3 26
67108864
real 0m10.694s
user 0m0.046s
sys 0m0.328s
$ time ./comparison-7.1.20101022.exe c 26
67108864
real 0m4.473s
user 0m0.046s
sys 0m0.327s
$ time ./comparison-7.1.20101022.exe c2 26
67108864
real 0m10.791s
user 0m0.046s
sys 0m0.327s
$ time ./comparison-7.1.20101022.exe c3 26
67108864
real 0m10.729s
user 0m0.078s
sys 0m0.280s
Ideally, I'd like all three versions (as well as flatDList) to be equally fast.
Trac metadata
| Trac field | Value |
|---|---|
| Version | 7.1 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Compiler |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture |