Skip to content

GC reports memory in use way below the actual

The following program encodes and decodes a long list of words. The memory in use reported by the GC seems to be off by multiple gigabytes when compared to the reports of the OS. Results shown below. ghc-7.10.2, binary-0.7.6.1.

import Control.Exception (evaluate)
import Control.Monad (void)
import Data.Binary (encode, decode)
import qualified Data.ByteString.Lazy as BSL
import Data.List (isPrefixOf, foldl')
import Data.Word (Word32)
import GHC.Stats
import System.Mem (performGC)

type T = (Word32,[Word32])

main :: IO ()
main = do
  let sz = 1024 * 1024 * 15
      xs = [ (i,[i]) :: T | i <- [0 .. sz] ]
      bs = encode xs

  void $ evaluate $ sum' $ map (\(x, vs) -> x + sum' vs) xs
  putStrLn "After building the value to encode:"
  printMem

  putStrLn $ "Size of the encoded value: " ++
    show (BSL.length bs `div` (1024 * 1024)) ++ " MB"
  putStrLn ""

  putStrLn "After encoding the value:"
  printMem

  let xs' = decode bs :: [T]
  void $ evaluate $ sum' $ map (\(x, vs) -> x + sum' vs) xs'
  putStrLn "After decoding the value:"
  printMem

  -- retain the original list so it is not GC'ed
  void $ evaluate $ last xs
  -- retain the decoded list so it is not GC'ed
  void $ evaluate $ last xs'

printMem :: IO ()
printMem = do
  performGC
  readFile "/proc/self/status" >>=
    putStr . unlines . filter (\x -> any (`isPrefixOf` x) ["VmHWM", "VmRSS"])
           . lines
  stats <- getGCStats
  putStrLn $ "In use according to GC stats: " ++
    show (currentBytesUsed stats `div` (1024 * 1024)) ++ " MB"
  putStrLn $ "HWM according the GC stats: " ++
    show (maxBytesUsed stats `div` (1024 * 1024)) ++ " MB"
  putStrLn ""

sum' :: Num a => [a] -> a
sum' = foldl' (+) 0

Here are the results:

# ghc --make -O -fno-cse -fforce-recomp -rtsopts test.hs
# time ./test +RTS -T
After building the value to encode:
VmHWM:	 2782700 kB
VmRSS:	 2782700 kB
In use according to GC stats: 1320 MB
HWM according the GC stats: 1320 MB

Size of the encoded value: 240 MB

After encoding the value:
VmHWM:	 3064976 kB
VmRSS:	 3064976 kB
In use according to GC stats: 1560 MB
HWM according the GC stats: 1560 MB

After decoding the value:
VmHWM:	 7426784 kB
VmRSS:	 7426784 kB
In use according to GC stats: 2880 MB
HWM according the GC stats: 2880 MB


real	0m24.348s
user	0m22.316s
sys	0m1.992s

At the end of the program the OS reports 7 GB while the GC reports less than 3G of memory in use.

Running the program with +RTS -M3G keeps VmHWM bounded at the expense of doubling the execution time.

Trac metadata
Trac field Value
Version 7.10.2
Type Bug
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system Unknown/Multiple
Architecture Unknown/Multiple
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information