This wikipage describes a design for putting source locations inside a new extension point of TTG.
The motivation is to allow GHC to sprinkle source locations across many nodes of the syntax tree,
without forcing every client of HsSyn to do so. For example Template Haskell probably does not want to.
(NB: This wiki page was overhauled in Jan 2020. If you came here via a link, a very similar design to this was called 'Solution D' in the previous version: see the version history.)
Recall that the TTG paper
indexes each data type with a type index, and defines types with two extension points:
Each data type has an extension constructor.
Each data construcutor has an extension field.
data Expr x = Var (XVar x) (Var x) | App (XApp x) (Expr x) (Expr x) | Lam (XLam x) (Var x) (Expr x) | XExpr (XXExpr x)type family XVar xtype family XApp xtype family XLam xtype family XXExpr x
Here the type index is x; the extension constructor is XExpr; and
the extension fields are XVar x, XApp x, etc (i.e. the first field
of each constructor). The types XVar, XApp etc are type families,
that can be extended with new instances as you add new indexing types.
We add a third extension point:
Selected fields, often the recursive uses, have an extension wrapper, XRec.
data Expr x = Var (XVar x) (Var x) | App (XApp x) (XRec x (Expr x)) (XRec x (Expr x)) | Lam (XLam x) (XRec x (Var x)) (XRec x (Expr x)) | XExpr (XXExpr x)type family XRec x a
Again, XRec is typically a type family. For some indices (say Vanilla)
we can easily elide all these XRec wrappers:
type instance XRec Vanilla a = a
But for GHC, with type index GhcPass p, we can use it to add a source location for each XRec:
type instance XRec (GhcPass p) a = Located a
Note: There is nothing inherently recursive about XRec. It has that name
because it is often
used to wrap recursive fields, such as the arguments to App above.
But it is also used to wrap non-recursive fields, such as the XRec x (Var x) field of
For name bikeshedding, see #17587.)
Using XRec in GHC
Becoming specific to GHC, wherever we used Located in the AST previously, we now use XRec pass:
When we have an LHsDecl (GhcPass p) this yields the same AST as before, which makes refactoring much easier.
The idea of this refactoring is that XRec pass really just replaces the usage of Located across the AST everywhere,
this can be inside the LHs* type synonyms or record fields.
Remember: We have to replace every Located in the AST to achieve the goal of moving SrcLocs to an extension point.
Here are some other usecases of XRec across the AST:
(Located was sometimes wrapped around non-AST data like CType or simply Bool)
Some of the thinking
We want to have an AST that can be used for both TH and normal Hs, for example. This means we need to store source locations in an extension point of TTG (trees that grow).
In #15495 we agreed that the currently available extension points in TTG don't suffice to annotate the AST with SrcLocs in a satisfyingly type-safe manner: Something akin to 'Ping-pong' style is desireable.
Using a type family XRec instead of Located enables the extension points to carry source locations, or not, or even something else, everywhere that Located would be used today.
Applications for this include:
Using the GHC TTG AST for TemplateHaskell. TH doesn't have any SrcLocs attached to it, so it would use
GHC's functions' bodies will mostly not need to change (with some exceptions). This refactor pretty much only touches the types.
Some of GHC's functions used unLoc :: Located a -> a, but were polymorphic in the pass before:
And lastly, some instance declarations that used TypeSynonymInstances now need to be expanded, now that we're using type families inside those type synonyms.
Haddock doesn't use GhcPass p, but it uses source locations and GHC's AST heavily. Luckily we can just define an XRec instance for their pass datakind: DocNameI:
All functions that both GHC and Haddock depend on need to be polymorphic over the pass type variable however. Luckily again, there are not many of these functions, they mostly live in Ghc/Hs/Utils.hs. They will need another constraint: XRec pass (Match pass) ~ Located (Match pass), for example.
The design described here is not merged into GHC as of yet. Current status is:
An older variant of the design (pre #17587) is implemented and merged for Pat.hs.
The newer variant for Pat.hs is implemented in !2315
The design is applied throughout GHC in !2315, but that is still WIP.
For previous status/discussions, see the older version of this file or the related issues.
Related Issues / MRs
Initial discussion: #15495
-> Discussion settles in the design of !1970 (closed) (From now on referred to as the 'LPat experiment')
Follow up ticket: #17587 -> Simplifies extension point design and makes it more flexible
Follow up merge request: !2315 (Expanding the 'LPat experiment' across the compiler)
This way it is possible to distinguish an expression which has source locations attached and one which doesn't have source locations attached on the type level. LExpr doesn't unify with Expr.
If this were done with the constructor extension point of TTG, then one would lose some type safety: There would no longer be a guaruntee that there will always be a Located layer between the Expr layers in our huge expression sandwich.