GHC is nondeterministic wrt. bytes allocated
@mpickering and @osa1 brought my attention to the following nuisance. Save the following file as Simple.hs

    module Simple where
    foo = ()

and compile it, multiple times:

    $ # set GHC_ENVIRONMENT to stop GHC from messing with environment files, use -V0 to rule out GC nondeterminism
    $ GHC_ENVIRONMENT=- ghc Simple.hs -fforce-recomp +RTS -s -V0
    [1 of 1] Compiling Simple ( Simple.hs, Simple.o )
          38,612,600 bytes allocated in the heap
          23,839,128 bytes copied during GC
           5,780,144 bytes maximum residency (5 sample(s))
    ...
    $ GHC_ENVIRONMENT=- ghc Simple.hs -fforce-recomp +RTS -s -V0
    [1 of 1] Compiling Simple ( Simple.hs, Simple.o )
          37,505,496 bytes allocated in the heap
          23,928,368 bytes copied during GC
           5,774,512 bytes maximum residency (5 sample(s))
    ...
    $ GHC_ENVIRONMENT=- ghc Simple.hs -fforce-recomp +RTS -s -V0
    [1 of 1] Compiling Simple ( Simple.hs, Simple.o )
          37,513,808 bytes allocated in the heap
          23,929,832 bytes copied during GC
           5,774,512 bytes maximum residency (5 sample(s))
    ...

- The first compilation allocates 1.1MB (almost 3%!) more than the subsequent ones, although we pass -fforce-recomp. I can live with that, though. I suspect it's just that we don't respect -fforce-recomp everywhere we should.
- There is a spread of 8kB in total heap allocations between different runs. This is what's bugging me. I'd expect this number to be completely stable between successive runs. On the other hand, it's far less than one per mille, so it won't exactly influence our performance tests.
- Bytes copied during GC is similarly affected, but much less so. Maximum residency was completely stable across all of my measurements.
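
To put a number on the spread without reading the reports by eye, the "bytes allocated in the heap" lines can be pulled out of several +RTS -s reports and reduced to min, max and spread. The following is only a sketch: the function name alloc_spread and the run count in the usage comment are made up for illustration, and it assumes the report layout shown above (the figure is the first field on the line).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GHC): summarise the
# "bytes allocated in the heap" figures from one or more `+RTS -s`
# reports read on stdin.
alloc_spread() {
  awk '/bytes allocated in the heap/ {
         gsub(/,/, "", $1)              # 38,612,600 -> 38612600
         v = $1 + 0
         if (n == 0 || v < min) min = v
         if (n == 0 || v > max) max = v
         n++
       }
       END {
         if (n == 0) { print "no allocation figures found"; exit 1 }
         # %.0f avoids 32-bit truncation of %d in some awk implementations
         printf "runs=%d min=%.0f max=%.0f spread=%.0f\n", n, min, max, max - min
       }'
}

# Usage sketch (the RTS writes the -s report to stderr, hence 2>&1):
#   for i in 1 2 3 4 5; do
#     GHC_ENVIRONMENT=- ghc Simple.hs -fforce-recomp +RTS -s -V0 2>&1
#   done | alloc_spread
```

Fed the three reports quoted above, this prints runs=3 min=37505496 max=38612600 spread=1107104. That figure is dominated by the first, more expensive compilation, so dropping the first run is probably the better way to see the 8kB wibble in isolation.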
Some further observations:
- If you pass -fno-code, the flukes go away. It's completely deterministic.
- If you pass -fno-code -fwrite-interface, the flukes are there again, but only with a spread of +/- 32 bytes.
Going from that last observation, I was able to still reproduce it on a stage2 compiler with the following diff:

    diff --git a/compiler/main/HscMain.hs b/compiler/main/HscMain.hs
    index ffb9b3ced9..454b9db293 100644
    --- a/compiler/main/HscMain.hs
    +++ b/compiler/main/HscMain.hs
    @@ -169,7 +169,7 @@ import Control.Monad
     import Data.IORef
     import System.FilePath as FilePath
     import System.Directory
    -import System.IO (fixIO)
    +import System.IO
     import qualified Data.Map as M
     import qualified Data.Set as S
     import Data.Set (Set)
    @@ -869,8 +869,10 @@ hscMaybeWriteIface dflags iface old_iface location = do
               _ -> True
             no_change = old_iface == Just (mi_iface_hash (mi_final_exts iface))

    -    when (write_interface || force_write_interface) $
    -          hscWriteIface dflags iface no_change location
    +    when (write_interface || force_write_interface) $ do
    +      h <- openBinaryFile "Simple.hi" WriteMode
    +      -- hscWriteIface dflags iface no_change location
    +      return ()

     --------------------------------------------------------------
     -- NoRecomp handlers

So opening a file in write mode leads to these wibbles. If you open the file in ReadMode, the fluctuations go away.
I compiled a standalone program that simply opens "Simple.hi" in WriteMode:

    import System.IO

    main = do
      _ <- openBinaryFile "Simple.hi" WriteMode
      return ()

(compiled with ghc test.hs -threaded). But multiple runs of that program don't exhibit any wibbles in allocations.
At this point I gave up, because I suspect it's some race inside the event manager, but this ticket shall preserve my findings so that my work wasn't in vain.