
nonmoving gc: Fix #18587

Ben Gamari requested to merge wip/nonmoving-fixes into master

Here we fix a soundness issue resulting from a bug in the non-moving collector's deadlock-detection logic. In general, when we encounter a reference from a young generation into the old generation, we push the old object to the mark queue so that it is not swept during the concurrent collection (which will not see the young reference). However, when we suspect a deadlock, we instead promote all live data into the non-moving generation so that the collector can determine liveness precisely, which we cannot safely do with concurrent collection when data is split across generations. In that case we do not push references into the old generation, since we know that all data will be visible to the collector.

However, the logic responsible for ensuring that data was promoted was buggy: it failed to promote in some circumstances due to an attempted optimisation (avoiding an unnecessary branch). This meant that some data would remain in the young generation and could hold un-traced references into the old generation. The solution was to refactor alloc_for_copy to eliminate the subtlety from the promotion decision.
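To illustrate the shape of the promotion decision, here is a minimal, self-contained sketch. It is not the code from this branch: the `gc_state` struct, the `dest_gen_for_copy` helper, and the integer generation numbers are hypothetical stand-ins, and the real alloc_for_copy does considerably more. The point is only that the deadlock-detection case is handled by one explicit, unconditional check, with the invariant asserted afterwards.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the collector state; not the RTS's actual types. */
typedef struct {
    bool deadlock_detect_gc; /* is this a deadlock-detection collection? */
    int  oldest_gen;         /* number of the non-moving (oldest) generation */
} gc_state;

/*
 * Decide which generation an object being evacuated should be copied into.
 * `requested_gen` is the generation the ordinary ageing policy would pick.
 */
static int dest_gen_for_copy(const gc_state *gc, int requested_gen)
{
    int dest = requested_gen;

    /* During deadlock detection, promote unconditionally. No fast path is
     * allowed to skip this check. */
    if (gc->deadlock_detect_gc) {
        dest = gc->oldest_gen;
    }

    /* Deadlock-promotion invariant: nothing stays young in such a GC. */
    assert(!gc->deadlock_detect_gc || dest == gc->oldest_gen);
    return dest;
}
```

In an ordinary collection the requested generation is returned unchanged; in a deadlock-detection collection every object lands in the oldest generation regardless of what was requested.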

However, fixing this bug then revealed another one: large objects would always be pushed to the mark queue, even during deadlock-detection GCs, hiding deadlocks. This is easily fixed by adding the missing predicate.
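The missing predicate can be pictured roughly as follows. This is a hedged sketch under stated assumptions, not the patch itself: `mark_queue`, `should_push_to_mark_queue`, and `note_old_gen_reference` are hypothetical names, and the queue is reduced to a counter. What it shows is that the same guard has to apply to every path that reaches the mark queue, including the large-object path.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the mark queue: we only count pushes. */
typedef struct {
    size_t pushed;
} mark_queue;

/*
 * Should an old-generation object reached from young data be pushed to the
 * mark queue? In a deadlock-detection GC all live data has been promoted, so
 * the collector will see every reference itself, and pushing would keep
 * otherwise-dead objects alive and hide the deadlock.
 */
static bool should_push_to_mark_queue(bool target_in_nonmoving_gen,
                                      bool deadlock_detect_gc)
{
    return target_in_nonmoving_gen && !deadlock_detect_gc;
}

/* The same predicate guards ordinary and large objects alike (assumption:
 * `is_large` exists here only to show that it does not affect the result). */
static void note_old_gen_reference(mark_queue *q, bool is_large,
                                   bool target_in_nonmoving_gen,
                                   bool deadlock_detect_gc)
{
    (void) is_large; /* large objects get no special exemption */
    if (should_push_to_mark_queue(target_in_nonmoving_gen, deadlock_detect_gc)) {
        q->pushed++;
    }
}
```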

Finally, we also add some comments and some assertions for the deadlock-promotion invariant, and fix a buglet in the spin implementation during stack marking.
