Potential memory ordering issue in checkBlockingQueues

We have seen a few concerning crashes ($3085) on the AArch64 Darwin runners. They appear to share a common thread, all crashing in checkBlockingQueues, called by stg_marked_upd_frame. Looking at the code, I suspect we are missing a barrier in this area.

The problem appears to involve CurrentTSO->bq, which is only modified in two places:

  1. in the messageBlackHole logic, where we add ourselves to the blocking queue of the blackhole'd thunk. This appears to have the correct release barrier on the update to tso->bq, but the read from owner->bq doesn't appear to have a matching acquire.
  2. in createThread during thread creation (unlikely to be problematic)

Given that Messages.c shows up in a few of the backtraces, it seems likely that (1) is the culprit.
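
To make the suspected pattern concrete, here is a minimal sketch in C11 atomics of a release store paired with an acquire load on a published pointer. It is not the RTS code: the `TSO` and `BlockingQueue` structs, `publish_bq`, `read_bq`, and `demo` are all hypothetical stand-ins, and the real fix (if this diagnosis is right) would use the RTS's own barrier macros.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the RTS structures; not GHC's definitions. */
typedef struct BlockingQueue {
    int payload;                  /* plain field, written before publication */
} BlockingQueue;

typedef struct TSO {
    _Atomic(BlockingQueue *) bq;  /* pointer shared between threads */
} TSO;

/* Writer side: initialise the queue, then publish it with a release store
 * so the initialisation is ordered before the pointer becomes visible. */
static void publish_bq(TSO *owner, BlockingQueue *bq, int payload) {
    bq->payload = payload;
    atomic_store_explicit(&owner->bq, bq, memory_order_release);
}

/* Reader side: the acquire load pairs with the release store above.
 * With a relaxed or plain load, a weakly ordered CPU such as AArch64
 * may let the reader see the pointer but stale queue contents. */
static BlockingQueue *read_bq(TSO *owner) {
    return atomic_load_explicit(&owner->bq, memory_order_acquire);
}

struct writer_args {
    TSO *owner;
    BlockingQueue *bq;
};

static void *writer(void *arg) {
    struct writer_args *a = arg;
    publish_bq(a->owner, a->bq, 42);
    return NULL;
}

/* Spawns a writer thread, waits until the pointer is published, and
 * returns the payload seen through it (or -1 if the thread can't start). */
static int demo(void) {
    TSO owner;
    BlockingQueue bq = { 0 };
    struct writer_args args = { &owner, &bq };
    pthread_t t;
    BlockingQueue *seen;
    int value;

    atomic_init(&owner.bq, NULL);
    if (pthread_create(&t, NULL, writer, &args) != 0) {
        return -1;
    }
    while ((seen = read_bq(&owner)) == NULL) {
        /* spin until the writer publishes the queue */
    }
    value = seen->payload;
    assert(pthread_join(t, NULL) == 0);
    return value;
}
```

Because the load in `read_bq` is an acquire, a non-NULL result guarantees that the earlier write to `payload` is visible. The concern in (1) is that the read from owner->bq may lack this half of the pairing.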