Skip to content

Memory leak

With the attached write.hs (which generates a file called "data") and read.hs (which consumes that file), if I run

ghc -Wall -Werror -O write.hs -o write
ghc -Wall -Werror -O -prof -auto-all read.hs -o read1
ghc -Wall -Werror -O -prof -auto-all read.hs -o read2 -DLEAK
./write
./read1 +RTS -h
./read2 +RTS -h

then read2's heap profile shows that it is retaining a lot of extra data. Perhaps I am missing something, but I can't see why this needs to be retained. I would expect the two heap profiles to look the same.

Sources and heap profiles (using GHC 8.0.1) attached. I've copied the sources below for convenience:

write.hs:

module Main (main) where

import qualified Data.ByteString.Lazy.Char8 as L

main :: IO ()
main = L.writeFile "data" $ L.concat $ map mkByteString [1..100000]

mkByteString :: Int -> L.ByteString
mkByteString i = L.concat (L.pack ("#" ++ show i ++ "\n")
               : replicate 100 (L.pack "Something else\n"))

read.hs:

{-# LANGUAGE BangPatterns, CPP #-}

module Main (main) where

import Data.List
import Data.Set (Set)
import qualified Data.Set as Set

import qualified Data.ByteString.Char8 as S
import qualified Data.ByteString.Lazy.Char8 as L

main :: IO ()
main = do bs <- L.readFile "data"
          let stats = getMaybes bs
              s = mkSet stats
          print $ Set.size s

mkSet :: [Maybe S.ByteString] -> Set S.ByteString
mkSet ms = foldl' f Set.empty ms
    where f s (Just l) = Set.insert l s
          f s _ = s

getMaybes :: L.ByteString -> [Maybe S.ByteString]
getMaybes bs = if L.null bs then []
               else case getMaybe bs of
                    (stat, bs') ->
                        stat : getMaybes bs'

getMaybe :: L.ByteString -> (Maybe S.ByteString, L.ByteString)
getMaybe bs = case L.uncons bs of
              Just ('#', bs') ->
                  case L.break ('\n' ==) bs' of
                  (l, bs'') ->
                      let !l' = copy l
                      in (Just l', bs'')
              _ ->
                  case L.break ('\n' ==) bs of
                  (_x, bs') ->
#ifdef LEAK
                      copy _x `seq`
#endif
                      (Nothing, L.tail bs')

copy :: L.ByteString -> S.ByteString
copy bs = S.copy $ L.toStrict bs
Trac metadata
Trac field Value
Version 8.0.1
Type Bug
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information