ghc -O space leak: CSE between different CAFs
Consider the following program for generating the 1,000,000th prime:
module Main (main) where sieve0 (n:ns) (p:ps) | p*p == n = sieve0 (filter (\n -> n`mod`p /= 0) ns) ps | otherwise = n:sieve0 ns (p:ps) primes0 :: [Int] primes0 = 3:sieve0 [5,7..] primes0 primes :: [Int] primes = 2:3:sieve0 [5,7..] primes0 main = print $ primes !! 1000000
The intention of the separate primes0 function is to limit the number of primes that need to be held in memory. Unfortunately, ghc -O combines the common subexpressions in primes0 and primes so this effort is wasted. The resulting program needs noticably more memory than required, as can be seen by replacing the definition of
primes = 2:3:sieve0 (iterate (2+) 5) primes0
which prevents the CSE from happening.
+RTS -s output, original version:
12,099,221,160 bytes allocated in the heap 279,720,116 bytes copied during GC 15,834,912 bytes maximum residency (6 sample(s))
modified version that prevents CSE:
127,736,408 bytes copied during GC (scavenged) 233,388 bytes copied during GC (not scavenged) 30,624 bytes maximum residency (1 sample(s))
Tested with ghc 6.4.2 and a current (as of 2006-10-17) darcs ghc 6.5.