Skip to content

[Memory ordering] Preparation

Ben Gamari requested to merge wip/tsan/prep into master

This is a large-scale refactoring of our use of atomic memory operations (or, rather, fixing the lack therefore) in the runtime system.

Background

In short, prior to C11 the C abstract machine had no notion and threads and therefore guaranteed no particular semantics for shared-memory access. Platform-specific libraries such of pthreads offered various safe synchronization primitives, but there was no standard mechanism for achieving lock-free concurrent access. Lacking such a mechanism, programs like GHC were forced to use a variety of tricks (e.g. ubiquitous use of volatile) to try to ensure that the compiler would produce code with the desired operational behavior.

However, these tricks abuse language features unrelated to concurrency (e.g. volatile was intended for interaction with memory-mapped hardware) and guarantee no particular semantics; data races are undefined behavior, volatile or otherwise. See http://isvolatileusefulwiththreads.in/C/ and the references therein for details.

Thankfully, this changed with the appearance of C11, which introduced proper standard atomic operations with support for relaxed acquire/release memory ordering semantics. This allows us to explicitly tell the compiler the ordering that we expect, shifting the onus of proving correctness from us (a responsibility which we have historically shirked) to the compiler. Moreover, using standard atomic operations allows us to benefit from the many tools that have been written around this interface (e.g. ThreadSanitizer).

Branch structure

This MR (wip/tsan/prep) is the basis for all of the work that follows. All of the later branches are independent of one another.

  • wip/tsan/prep (this MR): Infrastructure and a few simple fixes
  • wip/tsan/ci (!2644 (closed)): Continuous integration infrastructure.
  • wip/tsan/sched (!2643 (closed)): Data race fixes in the scheduler, capability, and task infrastructure.
  • wip/tsan/storage (!2645 (closed)): Data race fixes in the storage manager and garbage collector.
  • wip/tsan/wsdeque (!2646 (closed)): A rewrite of the work-stealing deque.
  • wip/tsan/misc (!2647 (closed)): Miscellaneous data race fixes.
  • wip/tsan/stm (!2648 (closed)): Data race fixes in the STM implementation.
  • wip/tsan/event-mgr (!2649 (closed)): Data race fixes in the event manager.
  • wip/tsan/timer (!2650 (closed)): Data race fixes in the ITimer implementation.
  • wip/tsan/stats (!4342 (merged)): Introduce mutex to protect statistics

Changes

This patch makes a few related changes:

  • Use C11-style atomic operations instead of volatile and prayer to ensure correct ordering between shared-memory operations. Note that we do not use C11 atomic variables but rather only the atomic operations themselves. This makes it far easier to support older, non-C11 compilers at the expense of less type safety.
  • Refactor GHC's existing atomic primitives (e.g. cas, atomic_inc) to rather use the __atomic_* family of builtins as the former as deprecated.
  • Introduce support for running the ThreadSanitizer memory-concurrency validator which can identify data races and other concurrency issues.
  • Fix the issues that ThreadSanitizer identifies

To avoid directly depending upon the atomic builtins I define a set of macros exposing load and store operations with various orderings:

  • {SEQ_CST,ACQUIRE,RELEASE}_LOAD
  • {SEQ_CST,ACQUIRE,RELEASE}_STORE
  • SEQ_CST_ADD
  • {SEQ_CST,RELEASE}_FENCE

The __atomic_* builtins that these macros wrap are available in all GCC version since 4.7, released in 2012. To avoid having to emulate these operations on GCC 4.6 (the first GCC version we previously supported), I have bumped the configure check to mandate gcc >= 4.7. Previously we claimed compatibility with GCC >= 4.6, which was released in 2012 In addition, TSAN itself provides a few interfaces for annotating "benign" races, as well as the rare cases in which the checker cannot infer a "happens-before" relation.

ThreadSanitizer support

This branch adds a new Hadrian build flavour, thread-sanitizer, as well as a new nightly CI job, nightly-x86_64-linux-deb9-tsan, which validates in this flavour.

We also add a suppression file to ignore two races which I found were either benign or

All of this is summarised in Note [ThreadSanitizer] in includes/rts/TSANUtils.h.

Concurrency issues fixed

On the whole I was quite impressed by both the specificity and accuracy of the errors reported by ThreadSanitizer; this will be an invaluable tool moving forward. Among the bugs that it caught are:

  • a race in the STM implementation which I believe results in #17757 (closed)
  • a race in setNumCapabilities and itimer which could result in use-after-free (#17289)
  • several thread leaks in the hs_try_putmvar* tests
  • several data races in the tracking of runtime statistics (#18717 and others)
  • a data race in the resizing of the stable pointer table

Closes #17090 (closed).

Testing

For future reference, to test this patch-set I use their consolidated merge:

branches=( wip/tsan/prep wip/tsan/ci
      wip/tsan/sched wip/tsan/storage
      wip/tsan/wsdeque wip/tsan/misc
      wip/tsan/stm wip/tsan/event-mgr wip/tsan/timer )
git merge ${branches[@]/#/origin/}

The testsuite can then be run with:

export TSAN_OPTIONS="suppressions=$(pwd)/rts/.tsan-suppressions"
hadrian/build-cabal --flavour=thread-sanitizer test

Todo

  • Look into suppression of #17289; is it fixed?
Edited by Ben Gamari

Merge request reports