Memory fence on writes to MutVar/Array missing on ARM

The memory model question has been debated now and again. This thread from ten years back (https://mail.haskell.org/pipermail/haskell-prime/2006-April/001237.html) lays out the basic situation with thunk update, writeIORef, and memory fences.

But we recently began experimenting with GHC on ARM platforms, and it seems to lack a memory fence that the participants in the cited thread expect it to have.

Here's an attempt to construct a program that writes the fields of a data structure and then writes the pointer to that structure into an IORef, without the proper fence in between:

import Data.IORef
import Control.Concurrent

data Foo = Foo Int deriving Show

{-# NOINLINE mkfoo #-}
mkfoo x = Foo x

{-# NOINLINE dowrite #-}
dowrite r n = writeIORef r $! mkfoo n

main = 
  do r <- newIORef (Foo 3)
     forkIO (dowrite r 4)
     x <- readIORef r
     print x

Here are the relevant bits of the Cmm that results when this is compiled on a 64-bit ARM machine:

mkfoo_rn1_entry() //  []
        { []
        }
    {offset
      c40i:
          P64[MainCapability+872] = P64[MainCapability+872] + 16;
          if (P64[MainCapability+872] > I64[MainCapability+880]) goto c40m; else goto c40l;
      c40m:
          I64[MainCapability+928] = 16;
          P64[MainCapability+24] = mkfoo_rn1_closure;
          call (I64[MainCapability+16])(R1) args: 16, res: 0, upd: 8;
      c40l:
          I64[P64[MainCapability+872] - 8] = Foo_con_info;
          P64[P64[MainCapability+872]] = P64[I64[MainCapability+856]];
          P64[MainCapability+24] = P64[MainCapability+872] - 7;
          I64[MainCapability+856] = I64[MainCapability+856] + 8;
          call (I64[P64[I64[MainCapability+856]]])(R1) args: 8, res: 0, upd: 8;
    }
}

dowrite_entry() //  []
        { []
        }
    {offset
      c44j:
          call a_r3Dy_entry() args: 24, res: 0, upd: 8;
    }
}

a_r3Dy_entry() //  [R1]
        { []
        }
    {offset
      c41D:
          if (I64[MainCapability+856] - 16 < I64[MainCapability+864]) goto c41H; else goto c41I;
      c41H:
          P64[MainCapability+24] = a_r3Dy_closure;
          call (I64[MainCapability+16])(R1) args: 24, res: 0, upd: 8;
      c41I:
          I64[I64[MainCapability+856] - 8] = block_c41B_info;
          P64[I64[MainCapability+856] - 16] = P64[I64[MainCapability+856] + 8];
          I64[MainCapability+856] = I64[MainCapability+856] - 16;
          call mkfoo_rn1_entry() args: 16, res: 8, upd: 8;
    }
}

block_c41B_entry() //  [R1]
        { []
        }
    {offset
      c41B:
          _s3Ep::P64 = P64[I64[MainCapability+856] + 8];
          I64[I64[MainCapability+856] + 8] = block_c41G_info;
          _s3Es::P64 = P64[MainCapability+24];
          P64[MainCapability+24] = _s3Ep::P64;
          P64[I64[MainCapability+856] + 16] = _s3Es::P64;
          I64[MainCapability+856] = I64[MainCapability+856] + 8;
          if (P64[MainCapability+24] & 7 != 0) goto u41S; else goto c41K;
      u41S:
          call block_c41G_entry(R1) args: 0, res: 0, upd: 0;
      c41K:
          call (I64[I64[P64[MainCapability+24]]])(R1) args: 8, res: 8, upd: 8;
    }
}

block_c41G_entry() //  [R1]
        { []
        }
    {offset
      c41G:
          _s3Ev::P64 = P64[P64[MainCapability+24] + 7];
          P64[_s3Ev::P64 + 8] = P64[I64[MainCapability+856] + 8];
          call "ccall" arg hints:  [PtrHint,
                                    PtrHint]  result hints:  [] dirty_MUT_VAR(MainCapability+24, _s3Ev::P64);
          P64[MainCapability+24] = ()_closure+1;
          I64[MainCapability+856] = I64[MainCapability+856] + 16;
          call (I64[P64[I64[MainCapability+856]]])(R1) args: 8, res: 0, upd: 8;
    }
}

The fence should happen before the write of the pointer into the IORef. I can't find the fence, and can't find a codepath in the compiler that would insert it (i.e. with MO_WriteBarrier).
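
To make the expected ordering concrete, here is a minimal sketch in C11 atomics of the publish pattern described above. It is not GHC RTS code: foo_t, mut_var_t, publish and read_payload are hypothetical names made up for illustration, and the acquire load on the reader side is my own simplification.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the Foo heap object and the MutVar.
   These are NOT the RTS types; they only illustrate the ordering. */
typedef struct { long payload; } foo_t;
typedef struct { _Atomic(foo_t *) var; } mut_var_t;

/* Writer side: initialise the object, fence, then publish the pointer.
   The object is never freed in this sketch (GHC's heap is garbage
   collected, so the real code has no equivalent of free here). */
static void publish(mut_var_t *mv, long n)
{
    foo_t *f = malloc(sizeof *f);
    if (f == NULL)
        abort();

    f->payload = n;                              /* 1. write the fields      */
    atomic_thread_fence(memory_order_release);   /* 2. fence before publish  */
    atomic_store_explicit(&mv->var, f,           /* 3. write the pointer     */
                          memory_order_relaxed);
}

/* Reader side: load the pointer, then read through it. An acquire load
   pairs with the release fence above, so a reader that sees the new
   pointer also sees the initialised payload. */
static long read_payload(mut_var_t *mv)
{
    foo_t *f = atomic_load_explicit(&mv->var, memory_order_acquire);
    return f->payload;
}
```

Without step 2, a weakly ordered machine such as ARM may make the pointer store visible to another core before the payload store, which is the situation the Cmm above appears to allow.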

The call to dirty_MUT_VAR comes after the pointer has already been stored, so it would be too late to perform the fence there, but it doesn't issue one in any case:

void
dirty_MUT_VAR(StgRegTable *reg, StgClosure *p)
{
    Capability *cap = regTableToCapability(reg);
    if (p->header.info == &stg_MUT_VAR_CLEAN_info) {
        p->header.info = &stg_MUT_VAR_DIRTY_info;
        recordClosureMutated(cap,p);
    }
}

(Neither does recordClosureMutated.)

Edited by rrnewton