GHC is nondeterministic wrt. bytes allocated
@mpickering and @osa1 brought the following nuisance to my attention. Save the following file as Simple.hs:
module Simple where
foo = ()
and compile it, multiple times:
$ # set GHC_ENVIRONMENT to stop GHC from messing with environment files, use -V0 to rule out GC nondeterminism
$ GHC_ENVIRONMENT=- ghc Simple.hs -fforce-recomp +RTS -s -V0
[1 of 1] Compiling Simple ( Simple.hs, Simple.o )
38,612,600 bytes allocated in the heap
23,839,128 bytes copied during GC
5,780,144 bytes maximum residency (5 sample(s))
...
$ GHC_ENVIRONMENT=- ghc Simple.hs -fforce-recomp +RTS -s -V0
[1 of 1] Compiling Simple ( Simple.hs, Simple.o )
37,505,496 bytes allocated in the heap
23,928,368 bytes copied during GC
5,774,512 bytes maximum residency (5 sample(s))
...
$ GHC_ENVIRONMENT=- ghc Simple.hs -fforce-recomp +RTS -s -V0
[1 of 1] Compiling Simple ( Simple.hs, Simple.o )
37,513,808 bytes allocated in the heap
23,929,832 bytes copied during GC
5,774,512 bytes maximum residency (5 sample(s))
...
Note that
- The first compilation allocates 1.1MB (almost 3%!) more than the subsequent ones, although we pass -fforce-recomp. I can live with that, though. I suspect it's just that we don't respect -fforce-recomp everywhere we should.
- There is a spread of 8kB in total heap allocations between different runs. This is what's bugging me. I'd expect this number to be completely stable between successive runs. On the other hand, it's far less than one per mille, so it won't exactly influence our performance tests.
- Bytes copied during GC is similarly affected, but much less so. Maximum residency was completely stable across all of my measurements.
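To put numbers on the spread without eyeballing the -s output, something like the following could be used. This is only a sketch: the function names alloc_spread and measure are made up for illustration, and it assumes the "bytes allocated in the heap" line looks exactly like in the transcripts above and that the -s summary goes to stderr.

```shell
# Read GHC `+RTS -s` output on stdin and print "min max spread" for the
# "bytes allocated in the heap" figures found in it.
alloc_spread() {
  awk '/bytes allocated in the heap/ {
         gsub(/,/, "", $1)              # 38,612,600 -> 38612600
         n = $1 + 0
         if (count == 0 || n < min) min = n
         if (count == 0 || n > max) max = n
         count++
       }
       END { if (count > 0) printf "%d %d %d\n", min, max, max - min }'
}

# Hypothetical driver: compile Simple.hs a number of times (default 5)
# with the same flags as above and summarise the allocation figures.
measure() {
  runs=${1:-5}
  i=0
  while [ "$i" -lt "$runs" ]; do
    # 2>&1 so the -s summary ends up in the pipe
    GHC_ENVIRONMENT=- ghc Simple.hs -fforce-recomp +RTS -s -V0 2>&1
    i=$((i + 1))
  done | alloc_spread
}
```

Fed with the three runs above, alloc_spread reports a spread of 1107104 bytes (the 1.1MB from the first run); with the first run left out, the spread is 8312 bytes.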
Some further observations:
- If you pass -fno-code, the flukes go away. It's completely deterministic.
- If you pass -fno-code -fwrite-interface, the flukes are there again, but only with a spread of +/- 32 bytes.
Going from that last observation, I was able to still reproduce it on a stage2 compiler with the following diff:
diff --git a/compiler/main/HscMain.hs b/compiler/main/HscMain.hs
index ffb9b3ced9..454b9db293 100644
--- a/compiler/main/HscMain.hs
+++ b/compiler/main/HscMain.hs
@@ -169,7 +169,7 @@ import Control.Monad
import Data.IORef
import System.FilePath as FilePath
import System.Directory
-import System.IO (fixIO)
+import System.IO
import qualified Data.Map as M
import qualified Data.Set as S
import Data.Set (Set)
@@ -869,8 +869,10 @@ hscMaybeWriteIface dflags iface old_iface location = do
_ -> True
no_change = old_iface == Just (mi_iface_hash (mi_final_exts iface))
- when (write_interface || force_write_interface) $
- hscWriteIface dflags iface no_change location
+ when (write_interface || force_write_interface) $ do
+ h <- openBinaryFile "Simple.hi" WriteMode
+ -- hscWriteIface dflags iface no_change location
+ return ()
--------------------------------------------------------------
-- NoRecomp handlers
So opening a file in write mode leads to these wibbles. If you open the file in ReadMode, the fluctuations go away.
I compiled a standalone program that simply opened "Simple.hi" in WriteMode:
import System.IO
main = do
_ <- openBinaryFile "Simple.hi" WriteMode
return ()
(invocation: ghc test.hs -threaded) But multiple runs of that program don't exhibit any wibbles in allocations.
At this point I gave up, because I suspect it's some race inside the event manager, but this ticket shall preserve my findings so that my work wasn't in vain.