setNumCapabilities can cause threads to get stuck in gcWorkerThread
I have a patch with some instrumentation (Phab:D4339) that shows that threads sometimes do not leave gcWorkerThread until the following GC.
I suspect it's caused by idle_caps being mutated in scheduleDoGC after the call to requestSync. A thread enters yieldCapability, sees that it is not idle, and so enters gcWorkerThread, but then idle_caps is mutated so that the thread is marked idle, and its spin locks are not touched by the garbage collector.
Potential fixes:
- Don't look at idle_caps in the garbage collector when we're touching the spin locks; just do it for all capabilities. I don't think this does any harm.
- Don't mutate idle_caps after the call to requestSync; move that logic to before the call.
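To make the suspected interleaving concrete, here is a toy single-threaded model in C. It is not RTS code: in_gc_worker, worker_yield, release_workers and run_scenario are hypothetical names, the spin locks are reduced to a boolean flag, and the interleaving is fixed by hand rather than produced by real threads. It only sketches why a late write to idle_caps could leave a worker unreleased, and how either fix above might avoid that, assuming my reading of the code is right.

```c
#include <assert.h>
#include <stdbool.h>

#define N_CAPS 3

/* Hypothetical, simplified stand-ins for RTS state. */
static bool idle_caps[N_CAPS];    /* true: the GC treats this capability as idle */
static bool in_gc_worker[N_CAPS]; /* true: the thread is waiting in the worker loop */

/* Models a thread in yieldCapability: it joins the GC only if not marked idle. */
static void worker_yield(int cap) {
    assert(cap >= 0 && cap < N_CAPS);
    if (!idle_caps[cap]) {
        in_gc_worker[cap] = true;
    }
}

/* Models the GC releasing its workers. With skip_idle set, capabilities
 * marked idle are skipped (the behaviour I suspect we have today). */
static void release_workers(bool skip_idle) {
    for (int i = 0; i < N_CAPS; i++) {
        if (skip_idle && idle_caps[i]) {
            continue;
        }
        in_gc_worker[i] = false;
    }
}

/* Plays one hand-written interleaving for capability 1 and returns the number
 * of threads still waiting afterwards. mutate_early models the second fix:
 * idle_caps is written before the worker gets a chance to look at it. */
static int run_scenario(bool skip_idle, bool mutate_early) {
    for (int i = 0; i < N_CAPS; i++) {
        idle_caps[i] = false;
        in_gc_worker[i] = false;
    }

    if (mutate_early) {
        idle_caps[1] = true;  /* mutation happens before the sync request */
        worker_yield(1);      /* worker sees it is idle and stays out */
    } else {
        worker_yield(1);      /* worker sees it is not idle and enters */
        idle_caps[1] = true;  /* late mutation, after the sync request */
    }

    release_workers(skip_idle);

    int stuck = 0;
    for (int i = 0; i < N_CAPS; i++) {
        if (in_gc_worker[i]) {
            stuck++;
        }
    }
    return stuck;
}
```

In this model, run_scenario(true, false) leaves one thread waiting, while run_scenario(false, false) (release every capability) and run_scenario(true, true) (mutate before the sync request) both leave none.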
Of course, maybe I'm misunderstanding and this isn't a bug?