Skip to content

ghc -O space leak: CSE between different CAFs

Consider the following program for generating the 1,000,000th prime:

module Main (main) where

sieve0 (n:ns) (p:ps)
    | p*p == n  = sieve0 (filter (\n -> n`mod`p /= 0) ns) ps
    | otherwise = n:sieve0 ns (p:ps)

primes0 :: [Int]
primes0 = 3:sieve0 [5,7..] primes0

primes :: [Int]
primes = 2:3:sieve0 [5,7..] primes0

main = print $ primes !! 1000000

The intention of the separate primes0 function is to limit the number of primes that need to be held in memory. Unfortunately, ghc -O combines the common subexpressions in primes0 and primes so this effort is wasted. The resulting program needs noticably more memory than required, as can be seen by replacing the definition of primes by

primes = 2:3:sieve0 (iterate (2+) 5) primes0

which prevents the CSE from happening.

Excerpt of +RTS -s output, original version:

12,099,221,160 bytes allocated in the heap
279,720,116 bytes copied during GC
 15,834,912 bytes maximum residency (6 sample(s))

modified version that prevents CSE:

127,736,408 bytes copied during GC (scavenged)
    233,388 bytes copied during GC (not scavenged)
     30,624 bytes maximum residency (1 sample(s))

Tested with ghc 6.4.2 and a current (as of 2006-10-17) darcs ghc 6.5.

Trac metadata
Trac field Value
Version 6.5
Type Bug
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information