Making Template Haskell (more) stable across GHC releases
See also
- Concrete plan here
- This thread on ghc-devs
- #23647 (closed) Usage of Template Haskell quotes in GHC source tree vs. usage of GHC as a library
- !12306 (closed) Make template-haskell a stage1 library
- #23536 (closed) Make Template Haskell refactorable
- #20828 Introduce field selectors and simple smart constructors in template-haskell
- #24703 (closed) Making template-haskell reinstallable
- #24704 (closed) Split out a template-haskell pretty printing package
- Simon's very old blog post about typed TH, mainly of historical interest.
- #24766 Stop exposing TH.Lib.Internal from template-haskell
The problem
The AST for Template Haskell, in the template-haskell library, in Language.Haskell.TH.Syntax, is tightly coupled to GHC.
When GHC gets a syntactic extension, it is not long before the TH AST needs it too. The problem is that as a result:
- The template-haskell library needs a major bump with every GHC release.
- Using --allow-newer may allow some TH clients to squeeze by, but others will break because they pattern-match on the TH AST, and now have missing constructors, or constructors with the wrong fields.
As a result, churn in the TH AST is a major reason that it is hard to compile old programs unchanged with a new GHC.
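As an illustration of the "constructors with the wrong fields" failure mode: in template-haskell 2.16 (shipped with GHC 8.10) the argument of TupE changed from [Exp] to [Maybe Exp] to accommodate tuple sections. A client that pattern-matches on TupE therefore needs something like the CPP below to compile on both sides of that change. This is only a sketch; the function name tupleComponents is hypothetical.

```haskell
{-# LANGUAGE CPP #-}
import Language.Haskell.TH

-- Hypothetical client helper: return the components of a fully
-- applied tuple expression, or Nothing for anything else.
tupleComponents :: Exp -> Maybe [Exp]
#if __GLASGOW_HASKELL__ >= 810
-- template-haskell >= 2.16: TupE carries [Maybe Exp]; a Nothing
-- entry marks a missing component of a tuple section.
tupleComponents (TupE mes) = sequence mes
#else
-- Older template-haskell: TupE carries [Exp].
tupleComponents (TupE es) = Just es
#endif
tupleComponents _ = Nothing
```

Without the conditional, the same source would be rejected by one of the two GHC ranges, even though the client has no interest in tuple sections.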
A possible approach
One approach that has often been discussed is to define an API for Template Haskell that is pretty stable across GHC releases, and yet is enough for most clients. Examples of earlier discussion:
- !11200 (closed) (comment 524484)
- #20828
The idea is to have two distinct packages:
- One internal package th-internal, closely coupled to GHC, supporting exactly and only the AST provided by that GHC version. This is essentially what we have at the moment, but we suggest renaming it to stress its instability. Users would be discouraged from depending on it directly (much like ghc-internal).
- One user-facing package template-haskell, defining a (fairly) stable external API for Template Haskell, much like base. Crucially, the user-facing package would be a normal, reinstallable Hackage package, so its version would not necessarily be tied to the GHC version. As a normal package it would be easier to refactor and clean up the API over time without running into issues like #23536 (closed).
The million dollar question
The million dollar question is, of course: can we design an API that is
- Sufficient for most TH clients.
- Able to be implemented on top of a succession of th-internal ASTs, as they evolve.
This is a design challenge, and we would need someone to lead the process of designing it.
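To make the second requirement concrete, here is a minimal sketch of what one tiny corner of such an API might look like. It uses the real change to ConP in template-haskell 2.18 (GHC 9.2), which gained a [Type] field for type applications in patterns. The names conPat and viewConPat are hypothetical, and today's Language.Haskell.TH stands in for the proposed th-internal.

```haskell
{-# LANGUAGE CPP #-}
import Language.Haskell.TH

-- Hypothetical stable constructor function: build a constructor
-- pattern without mentioning the version-specific fields of ConP.
conPat :: Name -> [Pat] -> Pat
#if __GLASGOW_HASKELL__ >= 902
conPat n ps = ConP n [] ps
#else
conPat n ps = ConP n ps
#endif

-- Hypothetical stable view: expose only the fields that every
-- supported version of the internal AST can provide.
viewConPat :: Pat -> Maybe (Name, [Pat])
#if __GLASGOW_HASKELL__ >= 902
viewConPat (ConP n _tys ps) = Just (n, ps)
#else
viewConPat (ConP n ps) = Just (n, ps)
#endif
viewConPat _ = Nothing
```

A client written against conPat and viewConPat compiles unchanged on both sides of the ConP change; the cost is that the view silently drops the type arguments, which is exactly the kind of trade-off the API design would have to settle.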
It is worth classifying how clients use TH:
- TH quotations and quasi-quotes are not vulnerable to changes in the Template Haskell data type. Clients who use only quotations and splices need not even import the template-haskell library.
- Programs that construct syntax trees can do so using the little smart-constructor functions (e.g. varE). These smart constructors can readily form part of the stable API.
- Some TH clients, notably various forms of deriving, use reification to interrogate the program, and reification currently returns the TH AST; see "Reification" below.
Claim: these three use-cases satisfy most clients. If we could satisfy them with a stable API, that would represent a major step forwards.
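The three styles can be sketched side by side as follows. The function names are hypothetical; the library calls (infixE, varE, litE, integerL, reify) are the existing ones from Language.Haskell.TH. The import is needed here only for the type signatures and for styles 2 and 3.

```haskell
{-# LANGUAGE TemplateHaskellQuotes #-}
import Language.Haskell.TH

-- Style 1: a quotation. The client never names an AST constructor.
plusOneQuoted :: Integer -> Q Exp
plusOneQuoted n = [| n + 1 |]

-- Style 2: smart constructors. The client names functions such as
-- infixE and varE, but still no data constructors of Exp.
plusOneBuilt :: Integer -> Q Exp
plusOneBuilt n =
  infixE (Just (litE (integerL n))) (varE '(+)) (Just (litE (integerL 1)))

-- Style 3: reification. The answer comes back as Info wrapping a Dec,
-- so the client must pattern-match on the TH AST itself.
constructorCount :: Name -> Q Int
constructorCount tyName = do
  info <- reify tyName
  case info of
    TyConI (DataD _ _ _ _ cons _) -> pure (length cons)
    TyConI (NewtypeD {})          -> pure 1
    _                             -> fail "constructorCount: not a data type"
```

Only the third function mentions the shape of a Dec, and so only it is exposed to changes in the fields of DataD.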
Reification
Template Haskell offers
reify :: Name -> Q Info
reifyType :: Name -> Q Type
data Info
= ClassI Dec [InstanceDec]
| ClassOpI Name Type ParentName
| TyConI Dec
| FamilyI Dec [InstanceDec]
| PrimTyConI Name Arity Unlifted
| DataConI Name Type ParentName
| PatSynI Name PatSynType
| VarI Name Type (Maybe Dec)
| TyVarI Name Type
The data type Info is used exclusively for reification. But (crucially) Type, Dec etc are simply the TH AST data types.
That seems logical but it isn't. For example, the TH type Type looks like
data Type
= ParensT Type
| InfixT Type Name Type -- Infix
| UInfixT Type Name Type -- "Unresolved infix"
| ....
The idea is that Type can faithfully represent the rich concrete syntax of Haskell types.
But that is not necessary for reification. Indeed, it gets in the way. The deriving code just wants to know "is this type a type application, and if so, with what arguments?". It's a pain having to deal with the plethora of concrete syntax.
So it would be entirely possible for reify to present an API that was entirely decoupled from the TH AST, just as Info is.
The exact design isn't clear (A new data type? A collection of view functions?) but the design space opens up once it is decoupled from the TH AST itself.
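As one point in that design space, a view function might answer the "type application, and with what arguments?" question directly. The sketch below is written against today's Type; TyAppView and viewTyApp are hypothetical names, and treating the operator of InfixT as a type constructor is an assumption made for brevity.

```haskell
import Language.Haskell.TH

-- Hypothetical view: the head of a type application and its arguments.
data TyAppView = TyAppView
  { viewHead :: Type
  , viewArgs :: [Type]
  } deriving (Eq, Show)

-- Peel off applications, looking through parentheses and resolved
-- infix applications so that callers need not handle them.
viewTyApp :: Type -> TyAppView
viewTyApp = go []
  where
    go args (AppT f x)     = go (x : args) f
    go args (ParensT t)    = go args t
    -- Assumption: the infix operator names a type constructor.
    go args (InfixT l n r) = TyAppView (ConT n) (l : r : args)
    -- Anything else (including UInfixT) is returned as the head.
    go args t              = TyAppView t args
```

A deriving-style client would then match on the result of viewTyApp rather than on AppT, ParensT and InfixT separately. Note that the arguments themselves are returned unnormalised.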
Other (somewhat disconnected) thoughts
- With enough (no doubt ghastly) CPP hackery, we could potentially arrange for each version of the user-facing package to be compatible with the "internal AST" used by a range of GHC versions. I imagine this would be hard to do over many releases, but even being able to support 2-3 versions simultaneously would significantly help ease migration.
- Ideally we would release new minor versions of past major series of the library when a new GHC is released, allowing (partial) forward compatibility with future releases. That raises the question of how to deal with constructors in the "future AST" that aren't supported by the GHC version being used for compilation, but at least they will never show up in quotes, and perhaps simply failing at splice time is acceptable. An advantage of this over version-numbered data constructors (as suggested in !11200 (closed) (comment 525994)) would be that each package version presented a single consistent view of the AST (rather than accumulating pattern synonyms in a single package). I imagine it might be harder to maintain, though.
- An interesting possibility suggested by @bgamari on #20828 would be to have distinct modules in the package that define exactly the AST of a particular language version (e.g. Haskell2010 or GHC2021). That could potentially be maintained over a longer period. It runs into the issue that new language extensions might introduce syntax that is not representable, but still might be useful in some cases. (Again, there's a question of whether to provide pattern synonym views of the existing AST types, or specialised datatypes plus conversion functions that check whether e.g. some quoted syntax is representable in GHC2021 syntax.)
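The "specialised datatypes plus conversion functions" option from the last point could look roughly like this. The sketch uses a deliberately tiny, hypothetical frozen type AST (StableType) and a checked conversion (toStable) from today's Type; a real Haskell2010 or GHC2021 module would of course cover far more syntax.

```haskell
import Language.Haskell.TH

-- Hypothetical frozen AST for a small, fixed fragment of type syntax.
data StableType
  = STVar Name
  | STCon Name
  | STApp StableType StableType
  | STFun StableType StableType
  | STList StableType
  deriving (Eq, Show)

-- Checked conversion: Nothing means the type uses syntax that the
-- frozen fragment cannot represent.
toStable :: Type -> Maybe StableType
toStable (VarT n)                 = Just (STVar n)
toStable (ConT n)                 = Just (STCon n)
toStable (ParensT t)              = toStable t
toStable (AppT (AppT ArrowT a) b) = STFun <$> toStable a <*> toStable b
toStable (AppT ListT a)           = STList <$> toStable a
toStable (AppT f x)               = STApp <$> toStable f <*> toStable x
toStable _                        = Nothing
```

Because StableType never grows, client code that matches on it would keep compiling across GHC releases; the Nothing case is where syntax from newer extensions (type-level literals, for instance) surfaces as an explicit failure instead of a compile error.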