Making template-haskell refactorable

Background

The template-haskell package is a wired-in package that contains some known-key names. These are identifiers and data constructors that the compiler expects to be in specific locations in the template-haskell package when it desugars things like quotes or derived Lift instances. (Or indeed for recognizing the Lift type class as one that has bespoke deriving logic at all!)

At the moment, the ghc package and some of its dependencies (bytestring, containers, exceptions, filepath and text) depend on the template-haskell package. More specifically, the compiler itself (the ghc package) depends on its own in-tree version of template-haskell, while boot libraries like bytestring contain derived Lift instances and quotes.
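
As a rough illustration of the constructs in question, the sketch below shows a derived Lift instance and two quotes. The Version type and all binding names are hypothetical and do not come from any boot library; the point is only that each of these forms makes the compiler consult known-key names in template-haskell when it desugars them.

```haskell
{-# LANGUAGE DeriveLift #-}
{-# LANGUAGE TemplateHaskellQuotes #-}

import Language.Haskell.TH.Syntax (Exp, Lift, Name, Q, nameBase)

-- Hypothetical data type, used only for illustration.
-- Deriving Lift makes the compiler look up the Lift class at its
-- wired-in location and generate method bodies from known-key names.
data Version = Version Int Int
  deriving (Eq, Show, Lift)

-- A name quote: it refers to the type constructor by name.
versionTypeName :: Name
versionTypeName = ''Version

-- The unqualified part of the quoted name.
versionTypeBase :: String
versionTypeBase = nameBase versionTypeName

-- An expression quote: desugaring it produces calls to
-- template-haskell combinators that the compiler expects to find.
versionQuote :: Q Exp
versionQuote = [| Version 1 2 |]
```

Neither form needs a splice, which is why TemplateHaskellQuotes is enough here.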

Problem

Now consider what happens when we build a stage-1 ghc, with some bootstrap version of ghc:

  1. We must build the in-tree version of template-haskell (with the boot compiler) because the ghc package being built depends on that in-tree version.
  2. When we compile a derived Lift instance in one of the boot libraries like bytestring, the bootstrap compiler will only accept the derived instance if the class itself lives at the specific wired-in location template-haskell:Language.Haskell.TH.Syntax.Lift. Furthermore, the body of the derived instance will refer to various known-key names in template-haskell:Language.Haskell.TH.Lib.Internal and elsewhere.

This makes it difficult to make changes to Language.Haskell.TH.Syntax, which defines many wired-in names used by TH: if we move or change one, then in step 2 the boot compiler will still try to look it up at its old location and use it as if it had its old type. See for example #20828 and #22229 (closed), which have run into this issue.
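
The failure mode can be sketched with a deliberately simplified toy model. This is not how GHC represents known-key names internally; the table, the Library type, and the module name Language.Haskell.TH.Moved are all hypothetical. The model only captures the idea that the boot compiler carries a fixed table of expected locations, so a lookup fails once the in-tree library moves a definition.

```haskell
-- Toy model only: GHC's real known-key machinery is not shaped like this.
type ModuleName = String
type OccName = String

-- Locations that the boot compiler was built to expect (fixed at the
-- time the boot compiler itself was compiled).
bootCompilerExpects :: [(OccName, ModuleName)]
bootCompilerExpects =
  [ ("Lift", "Language.Haskell.TH.Syntax")
  , ("appE", "Language.Haskell.TH.Lib.Internal")
  ]

-- What a particular template-haskell source tree actually exports.
type Library = [(ModuleName, [OccName])]

-- The library as the boot compiler knows it.
bundledLibrary :: Library
bundledLibrary =
  [ ("Language.Haskell.TH.Syntax", ["Lift"])
  , ("Language.Haskell.TH.Lib.Internal", ["appE"])
  ]

-- An in-tree library after a hypothetical refactoring that moves Lift.
refactoredLibrary :: Library
refactoredLibrary =
  [ ("Language.Haskell.TH.Moved", ["Lift"])
  , ("Language.Haskell.TH.Lib.Internal", ["appE"])
  ]

-- Resolve a known-key name the way the boot compiler would: always at
-- the expected location, regardless of where the library now defines it.
resolve :: Library -> OccName -> Either String ModuleName
resolve lib occ =
  case lookup occ bootCompilerExpects of
    Nothing -> Left (occ ++ " is not a known-key name")
    Just expected
      | maybe False (occ `elem`) (lookup expected lib) -> Right expected
      | otherwise -> Left (occ ++ " not found in " ++ expected)
```

In this model, resolve bundledLibrary "Lift" succeeds while resolve refactoredLibrary "Lift" fails, even though the refactored library still defines Lift somewhere.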

Possible solutions

  1. One possible fix would be to make it possible to compile these packages without depending on template-haskell (e.g. using a Cabal flag and CPP). Then we could build stage-1 GHC without the boot compiler ever touching any template-haskell package at all, and hence it would be possible to move definitions around. The stage-2 GHC build would then depend on template-haskell as normal.
    • This would require changes to several boot libraries, and although their dependencies on template-haskell are fairly trivial, the relevant maintainers may not all be happy about expanding the configuration spaces for these packages to accommodate this approach.
  2. Another possibility is to build a stage-1 ghc against the version of template-haskell shipped with the bootstrap compiler instead of the in-tree template-haskell. Then in step 2 above, the lookups by the bootstrap compiler will succeed because the bootstrap compiler performs these lookups based on where things are in its own version of template-haskell.
    • This means that the compiler must be buildable with multiple versions of template-haskell, most likely through some awkward CPP.
  3. A third possibility is to build the various boot libraries for the stage-1 ghc against the template-haskell shipped with the bootstrap compiler, but make the stage-1 ghc itself directly depend on another package that provides the same API as the in-tree version of template-haskell. This package's implementation could consist of a .cabal file and a hadrian-created symlink into the in-tree template-haskell's sources.
    • This introduces a bit of complexity into hadrian, but requires no CPP anywhere.
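
For the first option, a boot library module might look roughly like the sketch below. The macro name WITH_TEMPLATE_HASKELL and the Chunk type are hypothetical; the assumption is that a Cabal flag would add the template-haskell dependency and define the macro (for instance through cpp-options) only when the flag is enabled.

```haskell
{-# LANGUAGE CPP #-}
{-# LANGUAGE DeriveLift #-}
{-# LANGUAGE StandaloneDeriving #-}

-- WITH_TEMPLATE_HASKELL is a hypothetical macro. It is assumed to be
-- defined only when the package is configured with its (equally
-- hypothetical) template-haskell flag switched on.
#ifdef WITH_TEMPLATE_HASKELL
import Language.Haskell.TH.Syntax (Lift)
#endif

-- Hypothetical data type standing in for a boot library type.
data Chunk = Chunk Int String
  deriving (Eq, Show)

-- The Lift instance exists only when the flag is on, so a stage-1
-- build with the flag off never touches template-haskell.
#ifdef WITH_TEMPLATE_HASKELL
deriving instance Lift Chunk
#endif
```

A standalone deriving declaration keeps the conditional part in one place rather than threading CPP through the deriving clause of the data declaration.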

Any of these would make it easier to refactor Language.Haskell.TH.Syntax, which is currently rather unwieldy and could do with API cleanup.

Edited by Matthew Craven