Trees that Grow Guidance
The Trees that Grow (TTG) idiom can be used to provide different forms of extensions and variations on an AST. Since April 2018, the HsSyn AST inside GHC supports the TTG idiom. This page provides a set of guiding principles for GHC developers on how to understand and use the TTG idiom in HsSyn.
Context and Scope
The new HsSyn AST supporting the TTG idiom (from now on referred to as TTG HsSyn) is designed to subsume five different representations of Haskell syntax:
 AST GhcPs: the AST used in GHC's parsing phase
 AST GhcRn: the AST used in GHC's renaming phase
 AST GhcTc: the AST used in GHC's typechecking phase
 AST TH: the AST used in Template Haskell
 AST HSE: the AST used in an external tool such as HaskellSrcExts
By "subsume" we mean that it should be possible to instantiate TTG HsSyn to serve all five usecases.
The subsumption of above five ASTs is done by providing instances for the extension type families.
For instance, the AST for GHC's parsing, renaming, and typechecking are defined by providing instances of the extension type families using accordingly the indices GhcPs
, GhcRn
, and GhcTc
.
Here is the actual code providing such instances for the HsExpr
datatype of expressions in the TTG HsSyn.
General pattern for TTG
In general, a TTGidiom data type has
 A type parameter, called the phase descriptor, that indexes which particular instantiation is required
 One extension field in each data constructor, whose type is given by a type family. By giving phasespecific instances to this type family, we can add phasespecific information to the constructor.
 One unary extension constructor for each data type, whose argument type is given by a type family. By giving phasespecific instances to this type family, we can add extra phasespecific constructors to the type.
For example:
data Exp x
= Var (XVar x) (IdP x)
 Lam (XLam x) (IdP x) (Exp x)
 App (XApp x) (Exp x) (Exp x)
 New (XNew x)
type family XVar x
type family XLam x
type family XApp x
type family XNew x
Here the phase descriptor is x
. The first field of each constructor (of type XVar x
etc) are the extension fields. The data constructor XNew
is the extension constructor.
All fields of the data constructors except the first (extension) field are called payload fields. They are present in every instantiation of the data type.
Guiding Principles
The design of TTG HsSyn follows these principles:

The base TTG HsSyn should have all the constructors common across all five ASTs (the common data constructors). These constructors should have, as payload fields, all the fields common across all five ASTs.

Note, however, that the type of a payload field of a constructor may vary with phase. For example, in
Lam
above, the first payload field has typeId x
, and that may vary with phase:
type family IdP x
type instance IdP GhcPs = RdrName
type instance IdP GhcRn = Name
type instance IdP GhcTc = Id
But it is still a payload field, because every instantiation of Exp
has a lambda with a binder; albeit the type of that binder field varies. This happens in HsSyn: for example, the type of the common (payload) field of the common constructor HsVar
of HsExpr x
is IdP x
where IdP
is a type family and x
the phase descriptor.

The nonpayload (i.e. phasespecific) fields of a data constructor are grouped together and introduced via the extension field. Similarly the phasespecific data constructors are introduced using the extension constructor.

The instantiation of TTG HsSyn, for a particular phase, should result in a tree that has no redundant fields and constructors.
For example, the
HsExpr GhsPs
expressions of AST GhcPs should not have the constructorHsUnboundVar
of the postrenaming phases, or itsHsMultiIf
constructor should also not have an unused field (of the typeType
) to store the related type produced in the typechecking phase.
As a result, the instantiated TTG HsSyn should not depend on the code from the other phases. Hence, the base (uninstantiated) TTG HsSyn should not depend on any GHC/TH/HSEspecific code.
For example, if
HsExpr GhsPs
expressions of AST GhcPs had the constructorHsUnboundVar
then it had to depend on the code definingUnboundVar
(a field ofHsUnboundVar
) in the renaming phase, or if its constructorMultiIf
had a field of typeType
then it had to depend on the code definingType
in the typechecking phase.
Example
Consider the following three simple datatypes ExpPs
, ExpRn
, and ExpTc
representing correspondingly expressions in a parsing, renaming and typechecking phase:
module Parsing where
 the definition of RdrName
 ...
data ExpPs
= Var RdrName
 Lam RdrName ExpPs
 App ExpPs ExpPs
module Renaming where
 the definition of `Name` and `UnboundVar`
 ...
data ExpRn
= Var Name
 Lam Name ExpRn
 App ExpRn ExpRn
 UVar UnboundVar
module Typechecking where
 the definition of `Id`, `UnboundVar`, and `Type`
 ...
data ExpTc
= Var Id
 Lam Id ExpTc
 App Type ExpTc ExpTc
 UVar UnboundVar
Based on the TTG idiom, we will have a base declaration such as the following.
{# LANGUAGE TypeFamilies , EmptyCase #}
module AST where
data Exp x
= Var (XVar x) (XId x)
 Abs (XAbs x) (XId x) (Exp x)
 App (XApp x) (Exp x) (Exp x)
 New (XNew x)
type family XVar x
type family XAbs x
type family XApp x
type family XNew x
type family XId x
data NoExt = NoExt  no field extension
data NoNewCon  no constructor extension
noNewCon :: NoNewCon > a
noNewCon x = case x of {}
and the following three instantiations:
{# LANGUAGE TypeFamilies #}
module Parsing where
import AST
data RdrName  = ...
data Ps
type ExpPs = Exp Ps
type instance XVar Ps = NoExt
type instance XAbs Ps = NoExt
type instance XApp Ps = NoExt
type instance XNew Ps = NoNewCon
type instance XId Ps = RdrName
{# LANGUAGE TypeFamilies #}
module Renaming where
import AST
data Name  = ...
data UnboundVar  = ...
data Rn
type ExpRn = Exp Rn
type instance XVar Rn = NoExt
type instance XAbs Rn = NoExt
type instance XApp Rn = NoExt
type instance XNew Rn = UnboundVar
type instance XId Rn = Name
{# LANGUAGE TypeFamilies #}
module Typechecking where
import AST
data Id  = ...
data UnboundVar  = ...
data Type  = ...
data Tc
type ExpTc = Exp Tc
type instance XVar Tc = NoExt
type instance XAbs Tc = NoExt
type instance XApp Tc = Type
type instance XNew Tc = UnboundVar
type instance XId Tc = Id
Note that we define a specific pair of datatypes to mark and handle empty extension fields and constructors.