Potential memory ordering issue in checkBlockingQueues
We have seen a few concerning crashes ($3085) on the AArch64 Darwin runners. They appear to share a common thread: all crash in `checkBlockingQueues`, called by `stg_marked_upd_frame`. Looking at the code, I suspect we are missing a barrier in this area.
The problem appears to be due to `CurrentTSO->bq`, which is only modified in two places:
1. in the `messageBlackHole` logic, where we add ourselves to the blocking queue of the blackhole'd thunk. This appears to have the correct release barrier on the update to `tso->bq` but doesn't appear to synchronize on the read from `owner->bq`.
2. in `createThread` during thread creation (unlikely to be problematic)
Given that `Messages.c` shows up in a few of the backtraces, it seems likely that (1) is the culprit.
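
To make the suspected gap concrete, here is a minimal C11 sketch of the pairing I would expect. It is not the RTS code: `Thread`, `BlockingQueue`, `publish_bq` and `read_first_payload` are hypothetical stand-ins, and the sketch assumes writers are serialised by some other means. The point is only that a release store when publishing the queue needs a matching acquire on every read of the same field, including the read the writer does before linking the new node in.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real StgTSO / StgBlockingQueue layouts. */
typedef struct BlockingQueue {
    int payload;                 /* stands in for the queue's ordinary fields */
    struct BlockingQueue *link;  /* next queue owned by the same thread */
} BlockingQueue;

typedef struct Thread {
    _Atomic(BlockingQueue *) bq; /* head of the owner's blocking-queue list */
} Thread;

/* Writer side: initialise the node, link it in front of the current head,
 * then publish it. Assumes writers are serialised externally, so the
 * load/store pair does not need to be a compare-and-swap. */
void publish_bq(Thread *owner, BlockingQueue *node, int payload)
{
    assert(owner != NULL && node != NULL);
    node->payload = payload;
    /* Acquire: pairs with the release store of whoever published the old head. */
    node->link = atomic_load_explicit(&owner->bq, memory_order_acquire);
    /* Release: the writes to node->payload and node->link above become
     * visible to any thread whose acquire load observes this store. */
    atomic_store_explicit(&owner->bq, node, memory_order_release);
}

/* Reader side: the acquire load synchronizes with the release store above,
 * so dereferencing the head is guaranteed to see an initialised node.
 * Returns -1 when the list is empty. */
int read_first_payload(Thread *owner)
{
    assert(owner != NULL);
    BlockingQueue *head = atomic_load_explicit(&owner->bq, memory_order_acquire);
    return head != NULL ? head->payload : -1;
}
```

If either load were a plain or relaxed read, the C11 memory model would no longer guarantee that the reader sees the node's fields as initialised, which is the kind of failure that would tend to surface only on a weakly ordered target such as AArch64.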