Heap census and shrunk array
Summary
In a specific fork of GHC I get one test failing (space_leak_001) with a segfault. It only fails when +RTS -h is passed. I think it is not related directly to my changes. It seems related to #9666 (closed).
Steps to reproduce
- clone https://gitlab.haskell.org/hsyl20/ghc/tree/hsyl20-bignum
- build with Hadrian (make is broken in this branch): ./hadrian/build.stack.sh -c -j --flavour=perf test --only=space_leak_001
Wrong exit code for space_leak_001(normal)(expected 0 , actual 139 )
*** unexpected failure for space_leak_001(normal)
Expected behavior
Don't crash.
Debugging
My current guess is that ByteArrays that have been shrunk are not correctly handled by heapCensusChain in rts/ProfHeap.c. At the beginning of this function we find:
// HACK: pretend a pinned block is just one big ARR_WORDS
// owned by CCS_PINNED. These blocks can be full of holes due
// to alignment constraints so we can't traverse the memory
// and do a proper census.
It seems to me that nothing comparable is done for shrunk arrays that are not pinned, so the census can end up traversing garbage data.
Debugging this function with GDB, the first closure in the block is an ARR_WORDS which (I think) has been shrunk by 8 bytes:
(gdb) p (StgArrBytes*)p
$6 = (StgArrBytes *) 0x42002ac000
(gdb) p *(StgArrBytes*)p
$7 = {header = {info = 0x77dd40 <stg_ARR_WORDS_info>}, bytes = 56976, payload = 0x42002ac010}
The next computed closure address is 0x42002b9ea0 while bd->free is 0x42002b9ea8 (i.e. only 8 bytes remain, smaller than StgClosure). The census reads those bytes as an info pointer, gets garbage, and crashes.
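The address arithmetic can be checked with a small sketch. It assumes a 64-bit, non-profiled layout in which the array header is two words (info pointer and byte count), which matches the payload address printed by GDB above. The struct and function names are hypothetical and are not RTS code.

```c
#include <stdint.h>

/* Hypothetical stand-in for the StgArrBytes header on a 64-bit,
 * non-profiled RTS: one word for the info pointer, one for the
 * byte count. This is an assumption, not the real definition. */
typedef struct {
    uint64_t info;
    uint64_t bytes;
} ToyArrBytesHeader;

/* Address at which a census walk would expect the next closure:
 * header size plus the payload rounded up to whole words. */
static uint64_t next_closure_addr(uint64_t closure, uint64_t bytes)
{
    const uint64_t word = sizeof(uint64_t);
    uint64_t payload_words = (bytes + word - 1) / word;
    return closure + sizeof(ToyArrBytesHeader) + payload_words * word;
}
```

With the values from the GDB session, next_closure_addr(0x42002ac000, 56976) gives 0x42002b9ea0, which is 8 bytes short of bd->free.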
A workaround is to fill shrunk cells with 0. It works because we have this code at the end of the loop:
/* skip over slop */
while (p < bd->free && !*p) p++; // skip slop
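To illustrate why zero-filling helps, here is a toy model of the walk, not the actual heapCensusChain. It assumes a made-up closure layout (a tag word, a payload size in words, then the payload), and the names TOY_TAG and toy_census are hypothetical. Only the slop-skipping loop mirrors the line quoted above.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_TAG 0x1u  /* hypothetical marker for a valid "info pointer" */

/* Walk a toy block made of closures laid out as [TOY_TAG, n, n payload words].
 * Returns the number of closures visited, or -1 when the word at the
 * expected closure position is not a valid header. */
static int toy_census(const uint64_t *p, const uint64_t *free_ptr)
{
    int count = 0;
    while (p < free_ptr) {
        size_t left = (size_t)(free_ptr - p);
        if (left < 2 || p[0] != TOY_TAG) return -1;  /* garbage header */
        if (p[1] > left - 2) return -1;              /* size overruns the block */
        p += 2 + p[1];                               /* step over the closure */
        count++;
        while (p < free_ptr && !*p) p++;             /* skip zeroed slop */
    }
    return count;
}
```

In this model, a block such as {TOY_TAG, 3, 7, 7, 7, 0, TOY_TAG, 1, 9} is walked as two closures because the leftover word after the shrunk array is zero. If that word holds stale data instead, the walk treats it as a header and fails, which corresponds to the crash described here.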