
~100% performance regression in HEAD compared to ghc6.12, ~22% compared to 7.0.4

Testing the following simple program with GHC 7.3.20110927 reveals a fairly large performance regression compared to GHC 6.12, and a smaller one compared to 7.0.4:

lcs3 :: Eq a => [a] -> [a] -> [a]
lcs3 a b = fst $ aux (a, length a) (b, length b)
  where
    aux (_,0) _ = ([],0)
    aux _ (_,0) = ([],0)
    aux (a@(ha:as),la) (b@(hb:bs), lb)
      | ha == hb  = let (s,n) = aux (as,la-1) (bs,lb-1) in (ha : s, n+1)
      | otherwise =
        let (sa,na) = aux (as,la-1) (b,lb)
            (sb,nb) = aux (a,la) (bs,lb-1) in
        if na > nb then (sa,na) else (sb,nb)

main = do putStrLn . show $ lcs3 [1..20] [10..20]

On GHC 6.12:

tomaszw@frodo: ~ $ time ./LCS +RTS -s
./LCS +RTS -s 
[10,11,12,13,14,15,16,17,18,19,20]
     888,979,332 bytes allocated in the heap
         647,648 bytes copied during GC
          31,116 bytes maximum residency (1 sample(s))
          33,060 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:  1695 collections,     0 parallel,  0.01s,  0.01s elapsed
  Generation 1:     1 collections,     0 parallel,  0.00s,  0.00s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    1.12s  (  1.13s elapsed)
  GC    time    0.01s  (  0.01s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    1.13s  (  1.14s elapsed)

  %GC time       1.1%  (0.7% elapsed)

  Alloc rate    793,684,067 bytes per MUT second

  Productivity  98.9% of total user, 98.3% of total elapsed


real	0m1.141s
user	0m1.132s
sys	0m0.008s

On GHC 7.0.4:

tomaszw@frodo: ~ $ time ./LCS +RTS -s
./LCS +RTS -s 
[10,11,12,13,14,15,16,17,18,19,20]
   4,338,709,876 bytes allocated in the heap
       5,603,080 bytes copied during GC
          31,892 bytes maximum residency (1 sample(s))
         109,240 bytes maximum slop
               2 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:  8275 collections,     0 parallel,  0.05s,  0.05s elapsed
  Generation 1:     1 collections,     0 parallel,  0.00s,  0.00s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    1.86s  (  1.87s elapsed)
  GC    time    0.05s  (  0.05s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    1.91s  (  1.91s elapsed)

  %GC time       2.5%  (2.4% elapsed)

  Alloc rate    2,331,159,557 bytes per MUT second

  Productivity  97.5% of total user, 97.3% of total elapsed


real	0m1.913s
user	0m1.896s
sys	0m0.012s

On GHC 7.3.20110927:

tomaszw@frodo: ~ $ ghc --make -O2  -rtsopts LCS.hs

[1 of 1] Compiling Main             ( LCS.hs, LCS.o )
Linking LCS ...
tomaszw@frodo: ~ $ time ./LCS +RTS -s
[10,11,12,13,14,15,16,17,18,19,20]
   1,467,845,016 bytes allocated in the heap
       1,211,836 bytes copied during GC
          60,144 bytes maximum residency (1 sample(s))
          22,744 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      2810 colls,     0 par    0.01s    0.01s     0.0000s    0.0000s
  Gen  1         1 colls,     0 par    0.00s    0.00s     0.0002s    0.0002s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    2.37s  (  2.38s elapsed)
  GC      time    0.01s  (  0.01s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    2.39s  (  2.39s elapsed)

  %GC     time       0.6%  (0.6% elapsed)

  Alloc rate    619,142,210 bytes per MUT second

  Productivity  99.4% of total user, 99.2% of total elapsed


real	0m2.392s
user	0m2.384s
sys	0m0.008s

I also tried the LLVM backend on the newer GHC, but got equivalent numbers.
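For reference, the slowdown figures can be recomputed from the "Total time" lines of the three runs above (1.13s, 1.91s and 2.39s). The sketch below is only illustrative arithmetic; `slowdownPercent` is a helper name made up for this note and is not part of the test program. Computed this way, the regression comes out somewhat higher than the rounded figures in the title.

```haskell
-- Percentage slowdown of a newer run relative to a baseline run.
-- (Hypothetical helper, introduced here only for the arithmetic.)
slowdownPercent :: Double -> Double -> Double
slowdownPercent baseline new = (new - baseline) / baseline * 100

main :: IO ()
main = do
  -- "Total time" values in seconds, copied from the +RTS -s reports above.
  let ghc612  = 1.13
      ghc704  = 1.91
      ghcHead = 2.39
  putStrLn $ "HEAD vs 6.12:  " ++ show (slowdownPercent ghc612 ghcHead) ++ "%"
  putStrLn $ "HEAD vs 7.0.4: " ++ show (slowdownPercent ghc704 ghcHead) ++ "%"
```

Using the MUT times instead of the total times gives nearly the same ratios, since GC time is negligible in all three runs.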

Trac metadata
Trac field Value
Version 7.3
Type Bug
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture