rts: LDV profiling fixes
rts: Do not traverse nursery for dead closures in LDV profile
It is important that heapCensus and LdvCensusForDead traverse the same areas.
heapCensus increases the not_used counter, which tracks how many closures are live but haven't been used yet. LdvCensusForDead increases the void_total counter, which tracks how many dead closures there are.
The LAG is then calculated by subtracting void_total from not_used, so it is essential that not_used >= void_total. This fact is checked by quite a few assertions.
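As an illustration of why the invariant matters, here is a minimal sketch of the lag calculation. The struct and function names are hypothetical and the counters are reduced to two plain fields; this is not the RTS's actual census record.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified census record (assumption: the real RTS
 * census structure carries more fields than these two counters). */
typedef struct {
    size_t not_used;   /* increased by heapCensus: live but not yet used */
    size_t void_total; /* increased by LdvCensusForDead: dead closures   */
} CensusSketch;

/* Lag is not_used minus void_total. In this sketch both counters are
 * unsigned, so a void_total larger than not_used would wrap around
 * instead of going negative; the assertion guards against that. */
static size_t lag_sketch(const CensusSketch *c)
{
    assert(c->void_total <= c->not_used);
    return c->not_used - c->void_total;
}
```

If the two census passes walk different areas, void_total can overtake not_used and the assertion fires, which is the failure mode described here.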
However, if a program has low maximum residency but allocates a lot in the nursery, these assertions failed (see #16753 and #15903) because LdvCensusForDead was observing dead closures from the nursery which totalled more than not_used. The same closures were not counted by heapCensus.
Therefore, it seems that the correct fix is to make LdvCensusForDead agree with heapCensus and not traverse the nursery for dead closures.
rts: Correct assertion in LDV_recordDead
It is possible that void_total is exactly equal to not_used, and the other assertions for this invariant check for <= rather than <.
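The boundary case can be sketched as follows. Both helper names are hypothetical and exist only to contrast the two comparisons; they are not functions from the RTS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: the strict comparison, which rejects equality. */
static bool strict_check_sketch(size_t void_total, size_t not_used)
{
    return void_total < not_used;
}

/* Hypothetical helper: the inclusive comparison, which accepts the
 * case where every not-yet-used closure has since died. */
static bool inclusive_check_sketch(size_t void_total, size_t not_used)
{
    return void_total <= not_used;
}
```

With void_total equal to not_used, the strict form fails while the inclusive form holds, and both still reject void_total greater than not_used.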
Correct handling of LARGE ARR_WORDS in LDV profiler
This implements the correct fix for #11627 by skipping over the slop (which is zeroed) rather than adding special-case logic for LARGE ARR_WORDS, which runs the risk of not performing a correct census by ignoring any subsequent blocks.
This approach implements similar logic to that in Sanity.c.
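The idea of skipping zeroed slop while walking a block can be sketched roughly as below. The layout is an assumption made for illustration only: each object begins with a nonzero header word holding its size in words (header included), and slop is a run of zero words. Real closures are laid out differently, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Walk the words in [start, end) and count objects, skipping zeroed slop.
 * Assumed layout (illustration only): the first word of an object is its
 * size in words and is never zero; payload words are not interpreted. */
static size_t count_objects_skipping_slop(const uintptr_t *start,
                                          const uintptr_t *end)
{
    size_t count = 0;
    const uintptr_t *p = start;

    while (p < end) {
        /* Skip over any zeroed slop words. */
        while (p < end && *p == 0) {
            p++;
        }
        if (p >= end) {
            break;
        }

        uintptr_t size = *p;
        /* Stop on a malformed size that would run past the block. */
        if (size > (uintptr_t)(end - p)) {
            break;
        }

        count++;
        p += size; /* advance to the word after this object */
    }
    return count;
}
```

Because the walk resumes after the slop instead of stopping at the first large object, anything that follows it is still counted.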