copy_tag_nolock(): fix write ordering and add a write_barrier()
Fixes a rare crash in the parallel GC. If we copy a closure non-atomically during GC, as we do for all immutable values, then before writing the forwarding pointer we had better make sure that the copied closure itself is visible to other threads, since any of them might follow the forwarding pointer and read the copy. I imagine this doesn't happen very often, but I just found one case of it: in scavenge_stack, in the RET_FUN case, after evacuating ret_fun->fun we then follow it and look up the info pointer.
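The ordering can be sketched roughly as follows. This is a standalone illustration, not the RTS code: the closure layout, FWD_TAG, copy_and_forward() and follow_forwarding() are all hypothetical names, and a C11 release fence stands in for write_barrier().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical simplified closure: one header word plus a fixed payload.
 * The header holds either an info pointer or a tagged forwarding pointer. */
typedef struct {
    _Atomic uintptr_t header;
    uintptr_t payload[2];
} closure;

/* Assumed tag bit marking a header word as a forwarding pointer. */
#define FWD_TAG ((uintptr_t)1)

/* Copy `from` into `to` without locking, then publish the forwarding pointer. */
closure *copy_and_forward(closure *from, closure *to)
{
    uintptr_t info = atomic_load_explicit(&from->header, memory_order_relaxed);
    assert((info & FWD_TAG) == 0);  /* sketch assumes `from` is not yet forwarded */

    /* 1. Write the whole copy first: header, then payload. */
    atomic_store_explicit(&to->header, info, memory_order_relaxed);
    memcpy(to->payload, from->payload, sizeof to->payload);

    /* 2. Barrier: the writes above must become visible before the store below. */
    atomic_thread_fence(memory_order_release);

    /* 3. Only now overwrite the original header with the forwarding pointer. */
    atomic_store_explicit(&from->header, (uintptr_t)to | FWD_TAG,
                          memory_order_relaxed);
    return to;
}

/* Reader side: follow a forwarding pointer if one is present. The acquire
 * load pairs with the release fence, so the copy's fields are safe to read. */
closure *follow_forwarding(closure *p)
{
    uintptr_t h = atomic_load_explicit(&p->header, memory_order_acquire);
    if (h & FWD_TAG) {
        return (closure *)(h & ~FWD_TAG);
    }
    return p;
}
```

Without step 2, a thread that sees the forwarding pointer could follow it and read a header or payload that has not yet been written, which is the kind of failure described above.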
Showing with 6 additions and 2 deletions