~100% performance regression in HEAD compared to ghc6.12, ~22% compared to 7.0.4
Testing the following simple program with ghc 7.3.20110927 reveals a fairly large performance regression compared to ghc 6.12, and a smaller one compared to 7.0.4:

    lcs3 :: Eq a => [a] -> [a] -> [a]
    lcs3 a b = fst $ aux (a, length a) (b, length b)
      where
        aux (_,0) _ = ([],0)
        aux _ (_,0) = ([],0)
        aux (a@(ha:as),la) (b@(hb:bs), lb)
          | ha == hb = let (s,n) = aux (as,la-1) (bs,lb-1) in (ha : s, n+1)
          | otherwise =
            let (sa,na) = aux (as,la-1) (b,lb)
                (sb,nb) = aux (a,la) (bs,lb-1) in
            if na > nb then (sa,na) else (sb,nb)

    main = do putStrLn . show $ lcs3 [1..20] [10..20]
On ghc 6.12:

    tomaszw@frodo: ~ $ time ./LCS +RTS -s
    ./LCS +RTS -s
    [10,11,12,13,14,15,16,17,18,19,20]
         888,979,332 bytes allocated in the heap
             647,648 bytes copied during GC
              31,116 bytes maximum residency (1 sample(s))
              33,060 bytes maximum slop
                   1 MB total memory in use (0 MB lost due to fragmentation)
      Generation 0:  1695 collections,     0 parallel,  0.01s,  0.01s elapsed
      Generation 1:     1 collections,     0 parallel,  0.00s,  0.00s elapsed
      INIT  time    0.00s  (  0.00s elapsed)
      MUT   time    1.12s  (  1.13s elapsed)
      GC    time    0.01s  (  0.01s elapsed)
      EXIT  time    0.00s  (  0.00s elapsed)
      Total time    1.13s  (  1.14s elapsed)
      %GC time       1.1%  (0.7% elapsed)
      Alloc rate    793,684,067 bytes per MUT second
      Productivity  98.9% of total user, 98.3% of total elapsed
    real 0m1.141s
    user 0m1.132s
    sys  0m0.008s
On ghc 7.0.4:

    tomaszw@frodo: ~ $ time ./LCS +RTS -s
    ./LCS +RTS -s
    [10,11,12,13,14,15,16,17,18,19,20]
       4,338,709,876 bytes allocated in the heap
           5,603,080 bytes copied during GC
              31,892 bytes maximum residency (1 sample(s))
             109,240 bytes maximum slop
                   2 MB total memory in use (0 MB lost due to fragmentation)
      Generation 0:  8275 collections,     0 parallel,  0.05s,  0.05s elapsed
      Generation 1:     1 collections,     0 parallel,  0.00s,  0.00s elapsed
      INIT  time    0.00s  (  0.00s elapsed)
      MUT   time    1.86s  (  1.87s elapsed)
      GC    time    0.05s  (  0.05s elapsed)
      EXIT  time    0.00s  (  0.00s elapsed)
      Total time    1.91s  (  1.91s elapsed)
      %GC time       2.5%  (2.4% elapsed)
      Alloc rate    2,331,159,557 bytes per MUT second
      Productivity  97.5% of total user, 97.3% of total elapsed
    real 0m1.913s
    user 0m1.896s
    sys  0m0.012s
On ghc 7.3.20110927:

    tomaszw@frodo: ~ $ ghc --make -O2 -rtsopts LCS.hs
    [1 of 1] Compiling Main ( LCS.hs, LCS.o )
    Linking LCS ...
    tomaszw@frodo: ~ $ time ./LCS +RTS -s
    [10,11,12,13,14,15,16,17,18,19,20]
       1,467,845,016 bytes allocated in the heap
           1,211,836 bytes copied during GC
              60,144 bytes maximum residency (1 sample(s))
              22,744 bytes maximum slop
                   1 MB total memory in use (0 MB lost due to fragmentation)
                                        Tot time (elapsed)  Avg pause  Max pause
      Gen  0      2810 colls,     0 par    0.01s    0.01s     0.0000s    0.0000s
      Gen  1         1 colls,     0 par    0.00s    0.00s     0.0002s    0.0002s
      INIT  time    0.00s  (  0.00s elapsed)
      MUT   time    2.37s  (  2.38s elapsed)
      GC    time    0.01s  (  0.01s elapsed)
      EXIT  time    0.00s  (  0.00s elapsed)
      Total time    2.39s  (  2.39s elapsed)
      %GC time       0.6%  (0.6% elapsed)
      Alloc rate    619,142,210 bytes per MUT second
      Productivity  99.4% of total user, 99.2% of total elapsed
    real 0m2.392s
    user 0m2.384s
    sys  0m0.008s
I also tried the LLVM backend with the newer ghc, but got equivalent numbers.
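For reference, the relative slowdowns can be recomputed from the "Total time" lines of the three runs. The sketch below is only an illustration: the helper `slowdownPercent` and the constant names are made up for this purpose, and the inputs are the reported totals (1.13s, 1.91s, 2.39s). On those figures HEAD comes out at about 112% slower than 6.12 and about 25% slower than 7.0.4, so the percentages in the title should be read as rough estimates.

```haskell
import Text.Printf (printf)

-- | Percentage by which a run taking `new` seconds is slower than one
-- taking `old` seconds. (Hypothetical helper, not part of the test program.)
slowdownPercent :: Double -> Double -> Double
slowdownPercent old new = (new - old) / old * 100

-- "Total time" values in seconds, copied from the +RTS -s output of each run.
total612, total704, totalHead :: Double
total612  = 1.13  -- ghc 6.12
total704  = 1.91  -- ghc 7.0.4
totalHead = 2.39  -- ghc 7.3.20110927

main :: IO ()
main = do
  printf "HEAD  vs 6.12:  %.0f%% slower\n" (slowdownPercent total612 totalHead)
  printf "HEAD  vs 7.0.4: %.0f%% slower\n" (slowdownPercent total704 totalHead)
  printf "7.0.4 vs 6.12:  %.0f%% slower\n" (slowdownPercent total612 total704)
```

The same helper can be applied to the MUT times instead of the totals; GC time is small in all three runs, so the picture should not change much.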
Trac metadata
Trac field | Value |
---|---|
Version | 7.3 |
Type | Bug |
TypeOfFailure | OtherFailure |
Priority | normal |
Resolution | Unresolved |
Component | Compiler |
Test case | |
Differential revisions | |
BlockedBy | |
Related | |
Blocking | |
CC | |
Operating system | |
Architecture | |