Does `withAtomicRename` sabotage GHC's ability to run concurrently?
I am seeing intermittent CI failures in Agda's test suite (see https://github.com/agda/agda/issues/6739) with the following error:
renameFile:renamePath:rename 'MAlonzo/RTE.o.tmp' to 'MAlonzo/RTE.o': does not exist
E.g. at https://github.com/agda/agda/actions/runs/5645321781/job/15291549031?pr=6738#step:10:42
Tests are run in parallel, and some invoke GHC to compile generated `.hs` files. These files refer to certain "builtin" modules, e.g. `MAlonzo/RTE.hs`. So it could be that GHC is invoked twice in parallel on compilation jobs that both involve `MAlonzo.RTE`, and that these two jobs interfere with each other.
I found a potential culprit, `withAtomicRename`, in GHC's code base:
https://gitlab.haskell.org/ghc/ghc/-/blob/73b5c7ce33929e1f7c9283ed7c2860aa40f6d0ec/compiler/GHC/Utils/Misc.hs#L1265
This could be the problematic invocation: the call to the C compiler, which generates the `.o` file via `.o.tmp`:
https://gitlab.haskell.org/ghc/ghc/-/blob/73b5c7ce33929e1f7c9283ed7c2860aa40f6d0ec/compiler/GHC/Driver/Pipeline/Execute.hs#L477-478
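To make the suspected interleaving concrete, here is a minimal sketch that replays it sequentially in a single process. It assumes that both compilation jobs write to the same temporary path (the target name plus `.tmp`) and then rename it into place. The helpers `writeTmp`, `commitTmp` and `replayRace` are hypothetical names for illustration only, not GHC's actual code.

```haskell
import Control.Exception (try)
import System.Directory
  ( createDirectoryIfMissing
  , getTemporaryDirectory
  , removePathForcibly
  , renameFile
  )
import System.FilePath ((</>))
import System.IO.Error (isDoesNotExistError)

-- Hypothetical stand-ins for "write the output to <target>.tmp, then
-- rename it to <target>". The two steps are split so that the suspected
-- interleaving can be replayed deterministically.
writeTmp :: FilePath -> String -> IO ()
writeTmp target contents = writeFile (target ++ ".tmp") contents

commitTmp :: FilePath -> IO ()
commitTmp target = renameFile (target ++ ".tmp") target

-- Replays the interleaving of two jobs A and B that share one target.
-- Returns True if B's rename fails with a "does not exist" error.
replayRace :: IO Bool
replayRace = do
  tmp <- getTemporaryDirectory
  let dir    = tmp </> "atomic-rename-sketch"
      target = dir </> "RTE.o"
  removePathForcibly dir
  createDirectoryIfMissing True dir
  writeTmp target "object code from job A"  -- A creates RTE.o.tmp
  writeTmp target "object code from job B"  -- B overwrites the same RTE.o.tmp
  commitTmp target                          -- A renames RTE.o.tmp to RTE.o
  r <- try (commitTmp target) :: IO (Either IOError ())  -- B: source is gone
  removePathForcibly dir
  pure (either isDoesNotExistError (const False) r)

main :: IO ()
main = replayRace >>= print
```

If the assumption about the shared temporary path holds, the second rename finds no `RTE.o.tmp` left, which would match the error reported by CI.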
At the moment I have no local reproducer, and the CI failures occur at random, which points to a race condition.
Maybe one of the GHC developers can better judge whether my speculation has any basis, and whether something needs to be done about the GHC pipeline to rule out such race conditions.
GHC version: 9.6.2 (see https://github.com/agda/agda/actions/runs/5645321781/job/15291549031?pr=6738#step:3:23)