Correct closure observation, construction, and mutation on weak memory machines.

Here the following changes are introduced:
    - A read barrier machine op is added to Cmm.
    - The order in which a closure's fields are read and written is changed.
    - Memory barriers are added to RTS code to ensure correctness on
      out-of-order machines with weak memory ordering.

Cmm has a new CallishMachOp called MO_ReadBarrier. On weak memory machines, this
is lowered to an instruction that ensures that reads after it in program order
are not performed before reads that precede it in program order. On machines
with strong memory ordering properties (e.g. X86, SPARC in TSO mode) no such
instruction is necessary, so MO_ReadBarrier is simply erased. Weakly ordered
machines, e.g. ARM and PowerPC, do require such an instruction.
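
As a rough C analogue of this lowering (a sketch only, not the code generator's
actual output): read_barrier and read_guarded are hypothetical names, and a C11
acquire fence is assumed as a stand-in for the read barrier. C11 has no pure
load-load fence, so the stand-in is slightly stronger than required.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical helper, not an RTS function: keeps loads that follow it in
 * program order from being performed before loads that precede it. */
static inline void read_barrier(void)
{
#if defined(__x86_64__) || defined(__i386__)
    /* Strongly ordered target: the hardware does not reorder loads with
     * other loads, so only the compiler needs to be restrained. */
    atomic_signal_fence(memory_order_acquire);
#else
    /* Weakly ordered target (e.g. ARM, PowerPC): emit a real fence. */
    atomic_thread_fence(memory_order_acquire);
#endif
}

/* Example use: read a flag, then read the data that the flag guards.
 * Returns -1 if the flag is not set. */
static int read_guarded(atomic_int *flag, const int *data)
{
    assert(flag != NULL && data != NULL);
    int ready = atomic_load_explicit(flag, memory_order_relaxed);
    read_barrier();
    return ready ? *data : -1;
}
```

The x86 branch still emits a compiler-level fence, since erasing the hardware
instruction does not license the C compiler to reorder the loads.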

Weak memory ordering has consequences for how closures are observed and mutated.
For example, consider a closure that needs to be updated to an indirection. In
order for the indirection to be safe for concurrent observers to enter, said
observers must read the indirection's info table before they read the
indirectee. Furthermore, the entering observer makes assumptions about the
closure based on its info table contents, e.g. an INFO_TYPE of IND implies the
closure has an indirectee pointer that is safe to follow.

When a closure is updated with an indirection, both its info table and its
indirectee must be written. With weak memory ordering, these two writes can be
arbitrarily reordered, and perhaps even interleaved with other threads' reads
and writes (in the absence of memory barrier instructions). Consider this
example of a bad reordering:

- An updater writes to a closure's info table (INFO_TYPE is now IND).
- A concurrent observer branches upon reading the closure's INFO_TYPE as IND.
- The observer reads the closure's indirectee and enters it. (!!!)
- The updater writes the closure's indirectee.

Here the update to the indirectee comes too late and the concurrent observer has
jumped off into the abyss. Speculative execution can also cause issues.
Consider:

- An observer is about to case on a value in a closure's info table.
- The observer speculatively reads one or more of the closure's fields.
- An updater writes to the closure's info table.
- The observer takes a branch based on the new info table value, but with the
  old closure fields!
- The updater writes to the closure's other fields, but it's too late.

Because of these effects, reads and writes to a closure's info table must be
ordered carefully with respect to reads and writes to the closure's other
fields, and memory barriers must be placed to ensure that reads and writes occur
in program order. Specifically, updates to a closure must follow this
pattern:

- Update the closure's (non-info table) fields.
- Write barrier.
- Update the closure's info table.
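
The update pattern above can be sketched in C11 as follows. This is an
illustration under assumptions, not RTS code: closure_t, INFO_THUNK, INFO_IND
and update_with_indirection are hypothetical stand-ins for the real closure
layout, and a release fence is assumed to play the role of the write barrier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real closure layout differs. */
enum { INFO_THUNK = 0, INFO_IND = 1 };

typedef struct closure {
    atomic_int info;              /* stands in for the info pointer */
    struct closure *indirectee;   /* a non-info-table field */
    int value;
} closure_t;

/* Overwrite `c` with an indirection to `target`. */
static void update_with_indirection(closure_t *c, closure_t *target)
{
    assert(c != NULL && target != NULL);

    /* 1. Update the closure's (non-info table) fields. */
    c->indirectee = target;

    /* 2. Write barrier: the store above must become visible before the
     *    store below. */
    atomic_thread_fence(memory_order_release);

    /* 3. Update the closure's info table. */
    atomic_store_explicit(&c->info, INFO_IND, memory_order_relaxed);
}
```

With this ordering, an observer that sees INFO_IND and then issues a read
barrier is guaranteed to see the new indirectee as well.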

Observing a closure's fields must follow this pattern:

- Read the closure's info pointer.
- Read barrier.
- Read the closure's (non-info table) fields.
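
The observer side can be sketched the same way. Again this is only an
illustration: closure_t, INFO_THUNK, INFO_IND and follow_indirections are
hypothetical names, and a C11 acquire fence is assumed to stand in for the
read barrier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real closure layout differs. */
enum { INFO_THUNK = 0, INFO_IND = 1 };

typedef struct closure {
    atomic_int info;              /* stands in for the info pointer */
    struct closure *indirectee;   /* a non-info-table field */
    int value;
} closure_t;

/* Follow indirections until a closure that is not an IND is reached. */
static closure_t *follow_indirections(closure_t *c)
{
    assert(c != NULL);
    for (;;) {
        /* 1. Read the closure's info pointer. */
        int info = atomic_load_explicit(&c->info, memory_order_relaxed);

        /* 2. Read barrier: the field read below must not be performed
         *    (or speculated) before the info read above. */
        atomic_thread_fence(memory_order_acquire);

        if (info != INFO_IND)
            return c;

        /* 3. Read the closure's (non-info table) fields. */
        c = c->indirectee;
    }
}
```

The barrier sits before the branch on purpose: it must separate the info read
from every later field read, including reads performed speculatively.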

This patch updates RTS code to obey this pattern. This should fix long-standing
SMP bugs on ARM (specifically newer aarch64 microarchitectures supporting
out-of-order execution) and PowerPC. This fixes issue #15449.