Skip to content

3 ways to write a function (unexpected performance difference and regression)

Consider the attached comparison.hs, a small performance test for difference lists. I was somewhat surprised by the performance differences between flatListCons and flatDList, and I tried to reproduce the issue by writing 3 equivalent versions of flatListCons.

flatListCons t = flat t []
  where
  flat (Leaf n)   ns = n:ns
  flat (Fork a b) ns = flat a (flat b ns) 

flatListCons2 t = flat t []
  where
  flat (Leaf n)   = \ns -> n:ns
  flat (Fork a b) = \ns -> flat a (flat b ns) 

flatListCons3 t = flat t []
  where
  flat (Leaf n)   = (n:)
  flat (Fork a b) = flat a . flat b

Not only does GHC not give equal performance for the 3 versions (factor of 2), but GHC-6.12.3 and GHC-7.1.20101022 differ in optimizations. With GHC-6.12.3, flatListCons and flatListCons2 are fast, flatListCons3 is slow. With GHC-7.1.20101022, only flatListCons is fast, flatListCons2 and flatListCons3 are slow:

$ time ./comparison-6.12.3.exe c 26
67108864

real    0m4.574s
user    0m0.015s
sys     0m0.358s

$ time ./comparison-6.12.3.exe c2 26
67108864

real    0m4.442s
user    0m0.046s
sys     0m0.312s

$ time ./comparison-6.12.3.exe c3 26
67108864

real    0m10.694s
user    0m0.046s
sys     0m0.328s

$ time ./comparison-7.1.20101022.exe c 26
67108864

real    0m4.473s
user    0m0.046s
sys     0m0.327s

$ time ./comparison-7.1.20101022.exe c2 26
67108864

real    0m10.791s
user    0m0.046s
sys     0m0.327s

$ time ./comparison-7.1.20101022.exe c3 26
67108864

real    0m10.729s
user    0m0.078s
sys     0m0.280s

Ideally, I'd like all three versions (as well as flatDList) to be equally fast.

Trac metadata
Trac field Value
Version 7.1
Type Bug
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information