[Memory ordering] Preparation
This is a large-scale refactoring of our use of atomic memory operations (or, rather, fixing the lack therefore) in the runtime system.
Background
In short, prior to C11 the C abstract machine had no notion and threads and therefore guaranteed no particular semantics for shared-memory access. Platform-specific libraries such of pthreads
offered various safe synchronization primitives, but there was no standard mechanism for achieving lock-free concurrent access. Lacking such a mechanism, programs like GHC were forced to use a variety of tricks (e.g. ubiquitous use of volatile
) to try to ensure that the compiler would produce code with the desired operational behavior.
However, these tricks abuse language features unrelated to concurrency (e.g. volatile
was intended for interaction with memory-mapped hardware) and guarantee no particular semantics; data races are undefined behavior, volatile
or otherwise. See http://isvolatileusefulwiththreads.in/C/ and the references therein for details.
Thankfully, this changed with the appearance of C11, which introduced proper standard atomic operations with support for relaxed acquire/release memory ordering semantics. This allows us to explicitly tell the compiler the ordering that we expect, shifting the onus of proving correctness from us (a responsibility which we have historically shirked) to the compiler. Moreover, using standard atomic operations allows us to benefit from the many tools that have been written around this interface (e.g. ThreadSanitizer).
Branch structure
This MR (wip/tsan/prep
) is the basis for all of the work that follows. All of the later branches are independent of one another.
-
wip/tsan/prep
(this MR): Infrastructure and a few simple fixes -
wip/tsan/ci
(!2644 (closed)): Continuous integration infrastructure. -
wip/tsan/sched
(!2643 (closed)): Data race fixes in the scheduler, capability, and task infrastructure. -
wip/tsan/storage
(!2645 (closed)): Data race fixes in the storage manager and garbage collector. -
wip/tsan/wsdeque
(!2646 (closed)): A rewrite of the work-stealing deque. -
wip/tsan/misc
(!2647 (closed)): Miscellaneous data race fixes. -
wip/tsan/stm
(!2648 (closed)): Data race fixes in the STM implementation. -
wip/tsan/event-mgr
(!2649 (closed)): Data race fixes in the event manager. -
wip/tsan/timer
(!2650 (closed)): Data race fixes in the ITimer implementation. -
wip/tsan/stats
(!4342 (merged)): Introduce mutex to protect statistics
Changes
This patch makes a few related changes:
- Use C11-style atomic operations instead of
volatile
and prayer to ensure correct ordering between shared-memory operations. Note that we do not use C11 atomic variables but rather only the atomic operations themselves. This makes it far easier to support older, non-C11 compilers at the expense of less type safety. - Refactor GHC's existing atomic primitives (e.g.
cas
,atomic_inc
) to rather use the__atomic_*
family of builtins as the former as deprecated. - Introduce support for running the ThreadSanitizer memory-concurrency validator which can identify data races and other concurrency issues.
- Fix the issues that ThreadSanitizer identifies
To avoid directly depending upon the atomic builtins I define a set of macros exposing load and store operations with various orderings:
{SEQ_CST,ACQUIRE,RELEASE}_LOAD
{SEQ_CST,ACQUIRE,RELEASE}_STORE
SEQ_CST_ADD
{SEQ_CST,RELEASE}_FENCE
The __atomic_*
builtins that these macros wrap are available in all GCC version since 4.7, released in 2012. To avoid having to emulate these operations on GCC 4.6 (the first GCC version we previously supported), I have bumped the configure
check to mandate gcc >= 4.7. Previously we claimed compatibility with GCC >= 4.6, which was released in 2012
In addition, TSAN itself provides a few interfaces for annotating "benign" races, as well as the rare cases in which the checker cannot infer a "happens-before" relation.
ThreadSanitizer support
This branch adds a new Hadrian build flavour, thread-sanitizer
, as well as a new nightly CI job, nightly-x86_64-linux-deb9-tsan
, which validates in this flavour.
We also add a suppression file to ignore two races which I found were either benign or
All of this is summarised in Note [ThreadSanitizer] in includes/rts/TSANUtils.h
.
Concurrency issues fixed
On the whole I was quite impressed by both the specificity and accuracy of the errors reported by ThreadSanitizer; this will be an invaluable tool moving forward. Among the bugs that it caught are:
- a race in the STM implementation which I believe results in #17757 (closed)
- a race in
setNumCapabilities
anditimer
which could result in use-after-free (#17289) - several thread leaks in the
hs_try_putmvar*
tests - several data races in the tracking of runtime statistics (#18717 and others)
- a data race in the resizing of the stable pointer table
Closes #17090 (closed).
Testing
For future reference, to test this patch-set I use their consolidated merge:
branches=( wip/tsan/prep wip/tsan/ci
wip/tsan/sched wip/tsan/storage
wip/tsan/wsdeque wip/tsan/misc
wip/tsan/stm wip/tsan/event-mgr wip/tsan/timer )
git merge ${branches[@]/#/origin/}
The testsuite can then be run with:
export TSAN_OPTIONS="suppressions=$(pwd)/rts/.tsan-suppressions"
hadrian/build-cabal --flavour=thread-sanitizer test
Todo
-
Look into suppression of #17289; is it fixed?