GHC API allocates memory which is never GC'd
I've run into what looks like a very strange memory leak. GHC seems to allocate memory that I cannot get rid of no matter what I try, even after all references to it have disappeared. The memory doesn't show up in heap profiles but is still visibly allocated (as seen in ps or pmap).
I discovered this while investigating reports of large memory usage in ghc-mod, where this issue forces us to dump completion information in a separate process so that the allocated memory doesn't stick around for the rest of the user's session.
The attached test case demonstrates this problem by first calling getModuleInfo for all modules in all visible packages and then just looping forever in main. I would expect all memory GHC allocated to be GC'd at this point, but this does not happen.
When run as
$ ./Leaky -hide-all-packages
the test case consumes around 30M of memory on my system. If, however, we load some large package, say GHC itself,
$ ./Leaky -hide-all-packages -package ghc
the program will consume around 200-300M of memory, which is never deallocated. The behaviour is not specific to that package, though: I tried it with Cabal too and got the same result, albeit with smaller memory usage.
At first I thought this might be related to large CAFs with IORefs inside not being GC'd for some reason, so I tried calling resetCAFs from the RTS before entering the loop. This, however, makes no difference whatsoever. I also tried forcing a major GC just to be safe, to no avail.
Next I speculated that the allocated memory might be beyond the GC's control (malloc()ed or something), but looking at the memory map of the process with pmap $(pgrep Leaky) I can see that the majority of the memory usage comes from the mapping at address 0x0000000200000000, which is where the RTS allocates memory AFAIK.
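The gap between what the heap profile shows and what ps or pmap shows can also be observed from inside the process, by comparing the RTS's own accounting with the resident set size reported by the kernel. The following is only a rough sketch and not part of the attached test case. It assumes Linux (for /proc/self/status), a base library recent enough to provide GHC.Stats.getRTSStats (4.10 or later), and that RTS statistics are enabled with -T. The helper parseVmRSS is a hypothetical name introduced here for illustration.

```haskell
import GHC.Stats (GCDetails (..), RTSStats (..), getRTSStats, getRTSStatsEnabled)
import System.Mem (performMajorGC)

-- Hypothetical helper: extract the resident set size (in kB) from the
-- text of /proc/self/status. Returns Nothing if no VmRSS line is found.
parseVmRSS :: String -> Maybe Int
parseVmRSS contents =
  case [ws | l <- lines contents, ("VmRSS:" : ws) <- [words l]] of
    ((n : _) : _) | [(kb, "")] <- reads n -> Just kb
    _ -> Nothing

main :: IO ()
main = do
  -- Force a major collection so the figures below describe a freshly
  -- collected heap.
  performMajorGC
  enabled <- getRTSStatsEnabled
  if enabled
    then do
      stats <- getRTSStats
      let g = gc stats -- details of the most recent GC
      putStrLn ("live bytes after GC: " ++ show (gcdetails_live_bytes g))
      putStrLn ("bytes held by RTS:   " ++ show (gcdetails_mem_in_use_bytes g))
    else putStrLn "RTS stats unavailable; run with +RTS -T"
  -- Assumption: Linux procfs is available.
  status <- readFile "/proc/self/status"
  putStrLn ("resident set size kB: " ++ maybe "unknown" show (parseVmRSS status))
```

Compiling with -with-rtsopts=-T enables the statistics without extra command-line flags. If the live bytes figure is small while the memory held by the RTS and the resident set size stay large, that would be consistent with the memory sitting inside the RTS-managed region rather than in live heap objects.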