Poor enforcement for importing implicit dependencies

There are several known-key things whose interfaces GHC may try to look up during compilation regardless of whether they have been imported. This is problematic while bootstrapping: if we compile a module that refers to GHC.CString.unpackCString# for string-literal desugaring before GHC.CString has been built, compilation will fail.

Two of these potential implicit imports are described in Note [Depend on GHC.Num.Integer] and Note [Depend on GHC.Tuple] in GHC.Internal.Base, along with our current strategy for managing this problem: Whenever a module X implicitly depends on some known-key thing defined in module Y, the transitive imports of X should include Y. But compliance with this requirement has been inconsistent in practice, leading to issues like #23942 (closed) and intermittent CI failures like this one or this one.

I do not put blame on the patch authors and reviewers who have recently introduced modules breaking this requirement and thus causing intermittent build errors for everyone: this requirement is obscure and easy to forget. We need an enforcement mechanism, so that mistakes in implicit-dependency imports can be consistently detected before they get merged.

Solution ideas

  1. We could try to explicitly record these implicit dependencies in GHC's down-sweep. This is arguably the correct thing to do. But it only becomes reliable once we have identified all potential sources of implicit dependencies, and that seems a bit tricky for now.

  2. We could add a flag to GHC that causes it to log whenever it actually tries to read an interface file, then check that the logs generated while building stage 2 are compatible with the dependency graph used to order the build.

  3. I am doing some related work in !12179 (closed), and the way I smoke-tested my changes to implicit-dependency imports in that patch was by:

    • Building a stage-2 compiler in _build,
    • Reading the various _build/stage1/**/.dependencies files to see what hadrian thinks the dependencies are,
    • For each output (*.o or *.o-boot) file X that does not transitively depend on GHC.Internal.Base:
      • Delete the various *.o* and *.hi* files in _build/stage1/.
      • Instruct hadrian to build only X. (If X has an un-tracked implicit dependency, this should fail.)
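
The selection step above can be sketched as follows. This is a minimal sketch under stated assumptions, not the script used for !12179: it assumes the .dependencies files have already been parsed into a mapping from each output to its direct dependencies (parsing is omitted), and the function names and the toy graph are hypothetical.

```python
def transitive_deps(graph, start):
    """Return every node reachable from `start` by following one or more edges.

    `graph` maps a node to an iterable of its direct dependencies.
    """
    seen = set()
    stack = list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


def smoke_test_candidates(graph, base="GHC.Internal.Base"):
    """List the nodes that do NOT transitively depend on `base`.

    These are the outputs that would each be rebuilt in isolation.
    """
    nodes = set(graph) | {dep for deps in graph.values() for dep in deps}
    return sorted(n for n in nodes if base not in transitive_deps(graph, n))


# Toy graph with made-up module names; it is NOT the real import graph.
toy_graph = {
    "GHC.Internal.Base": ["Low.A", "Low.B"],
    "Low.A": [],
    "Low.B": ["Low.A"],
    "High.C": ["GHC.Internal.Base"],
    "High.D": ["High.C"],
}

if __name__ == "__main__":
    for candidate in smoke_test_candidates(toy_graph):
        print(candidate)
```

In this sketch GHC.Internal.Base itself counts as a candidate, since it does not depend on itself; only modules above it in the graph are skipped.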

    Why limit the search to outputs that don't depend on GHC.Internal.Base? Because all known sources of implicit dependencies either

    • ...are in the transitive imports of GHC.Internal, or
    • ...involve DeriveLift and can be fixed by re-structuring the template-haskell library (#22229 (closed)), or
    • ...involve JS foreign imports/exports with the WASM back-end (perhaps this is specific enough to handle with approach 1?), or
    • ...are not likely to ever be problematic, like using arrow-syntax without transitively importing the Arrow typeclass. (It is very hard to solve the Arrow constraints this produces when the class and all of its instances are out of scope.)

    Since there are only a few dozen modules and boot-modules that do not depend on GHC.Internal.Base, I believe the cost of this validation approach should be affordable. And I think it would be relatively easy for one of the CI gurus to automate this approach. But it's definitely a bit of an ugly hack. Option 2 seems much better in the long run. (The smoke-test was still valuable; doing this caught a couple of issues I missed in !12179 (closed).)
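
For comparison, the checking half of option 2 could look roughly like the sketch below. Everything here is an assumption: the logging flag does not exist yet, so the log is modelled as already-parsed (module being compiled, module whose interface was read) pairs, and the function names and example modules are hypothetical.

```python
def transitive_deps(graph, start):
    """Return every node reachable from `start` by following one or more edges."""
    seen = set()
    stack = list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


def untracked_reads(graph, reads):
    """Return the logged interface reads that the dependency graph does not justify.

    `graph` maps a module to its direct dependencies.
    `reads` is an iterable of (compiling, loaded) pairs taken from the
    hypothetical interface-read log.
    """
    closure_cache = {}
    violations = []
    for compiling, loaded in reads:
        if compiling not in closure_cache:
            closure_cache[compiling] = transitive_deps(graph, compiling)
        # Assumption: a module reading its own interface is not an
        # ordering problem, so it is not reported.
        if loaded != compiling and loaded not in closure_cache[compiling]:
            violations.append((compiling, loaded))
    return violations


# Made-up example: Example.Bad reads GHC.CString without depending on it.
example_graph = {
    "Example.Good": ["GHC.CString"],
    "Example.Bad": [],
    "GHC.CString": [],
}
example_reads = [
    ("Example.Good", "GHC.CString"),
    ("Example.Bad", "GHC.CString"),
]

if __name__ == "__main__":
    for compiling, loaded in untracked_reads(example_graph, example_reads):
        print(f"{compiling} read the interface of {loaded} without depending on it")
```

Any pair reported this way would mark a module whose build order is not guaranteed by the dependency graph, which is the situation behind the intermittent failures described above.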

Edited by Matthew Craven