genSym is not thread safe with respect to setNumCapabilities
In a large proprietary application using the GHC API, we observe really weird errors (e.g. overlapping instances for `Eq Foo` and `Eq Bar`, where `Foo` and `Bar` are completely unrelated and come from different modules). The pattern we follow is:
* Running with the threaded RTS, 1 initial thread.
* Create a new unique supply with `mkSplitUniqSupply` and put it in an `MVar`.
* Repeating many times:
  - Set the thread count higher (e.g. 8) using `setNumCapabilities`.
  - On many threads in parallel:
    * Obtain a new unique supply by calling `splitUniqSupply` on the original, protected by the `MVar`, and put the other half back in the `MVar`.
    * Use that unique supply to interact with the GHC API.
  - Set the thread count back to 1.

Our observations of the errors are best explained by the unique names not being as unique as one would expect. Reading the code for `genSym`:

    if (n_capabilities == 1)
    {
        GenSymCounter = (GenSymCounter + GenSymInc) & UNIQUE_MASK;
        checkUniqueRange(GenSymCounter);
        return GenSymCounter;
    }
    else
    {
        HsInt n = atomic_inc((StgWord *)&GenSymCounter, GenSymInc) & UNIQUE_MASK;
        checkUniqueRange(n);
        return n;
    }

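It only does an `atomic_inc` if `n_capabilities != 1`, and takes the plain read-modify-write path when `n_capabilities == 1`. But it doesn't read `n_capabilities` atomically, so is it suffering a race? If one thread were on the non-atomic path while another incremented the counter concurrently, an update could be lost and the same unique handed out twice.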
Our workaround was to set the thread count once, up front, before any interaction with the GHC API, which seems to solve the problem. Alas, we don't have a reproducible test case: we were unable to reproduce it anywhere but our Linux CI, and even there only non-deterministically. The problem does not currently impact us (the workaround is robust), but it seemed worth sharing.
Trac metadata
Trac field | Value |
---|---|
Version | 8.6.1 |
Type | Bug |
TypeOfFailure | OtherFailure |
Priority | normal |
Resolution | Unresolved |
Component | Compiler |
Test case | |
Differential revisions | |
BlockedBy | |
Related | |
Blocking | |
CC | |
Operating system | |
Architecture |