This page discusses the interaction of Trees That Grow with derived for
Data. (But it might apply to manual instances for, say,
This page is part of ImplementingTreesThatGrow.
Here's our example:
type family XOverLit p data HsOverLit p = OverLit (XOverLit p) (HsExpr p) data GhcPass (c :: Pass) deriving instance Typeable c => Data (GhcPass c) data Pass = Parsed | Renamed | Typechecked type instance XOverLit (GhcPass 'Parsed ) = PlaceHolder type instance XOverLit (GhcPass 'Renamed ) = Name type instance XOverLit (GhcPass 'Typechecked) = Type
We want a
Data instance for this type.
deriving instance (Typeable p, Data (XOverLit (GhcPass p))) => Data (HsOverLit (GhcPass p))
But that gives rise to big constraint sets; for each data constructor
we get another
X field, and another constraint in the
This is what is currently implemented in GHC master. It works, but every time the constraint set is enlarged as the next step of Trees that Grow goes in, the time taken to compile GHC increases.
Alan would like to see
deriving instance Data (HsOverLit (GhcPass p))
which should have all the information available to the derivation process, since
p can only have one of three values and each of them has a type family instance for
This "does not compile".
Alan has discovered that the three instances:
deriving instance Data (HsOverLit (GhcPass 'Parsed )) deriving instance Data (HsOverLit (GhcPass 'Renamed)) deriving instance Data (HsOverLit (GhcPass 'Typechecked))
will compile, but only with GHC 8.2.1, not with 8.0.2, due to a flaw in the standalone deriving process.
That is: instead of one
Data instance for the
[The spurious constraint problem was resolved by including
deriving instance Typeable c => Data (GhcPass c), as recommended by @RyanGlScott ]
Proceed as per Plan A, but move all the standalone deriving code to a new file, which will produce orphan instances.
Inside this, use CPP to detect a modern enough compiler (GHC 8.2.1) to generate via Plan B, otherwise fall back to Plan A.
Eventually improve the standalone deriving sufficiently that it is able to generate a single traversal, instead of the three.
So for day-to-day work ghc devs can use GHC 8.2.1, and we confirm is still works with GHC 8.0.2 less frequently.
(This is a bit like Plan C/D, but without the stupid error.)
Suppose we declared out extension fields like this
data HsOverLit p = OverLit (GhcExt (XOverLit p)) (HsExpr p) newtype GhcExt x = Ext x
Now we could execute on Plan C by saying
instance Data (GhcExt x) where gmapM f x = x ...
The downside is that every pattern match and construction would need to wrap and unwrap with
But actually we are going to need to do that anyway! We are planning to abolish the alternation of
Located t and
t by putting the SrcSpan for the construct in its extension field. Like this
data GhcExt x = Ext SrcSpan x
So we have to do this wrapping business anyway!
This approach would mean that every client of
HsSyn would have to have a
GhcExt in every node, with a
SrcSpan. If that's not OK (and it probably isn't ok) we could do this:
data HsOverLit p = OverLit (XOverLit p) (HsExpr p) type instance XOverLit (GhcPass p) = GhcExt (GhcXOverlit p)
The price is that there are two type-level functions for each constructor. The benefit is that we can do Plan C.
Actually, in the common case where GHC does not use any extension fields for a constructor
K we could dispense with the second function thus
type instance XK (GhcPass p) = GhcExt PlaceHolder
Let's focus on
Outputable for now. The underlying problem is that to have a single pretty printer that does a good job of printing every instantation of
HsExpr x is really hard:
There are many, many fields within
HsExpr, of type
X2 x, ...
X100 x, where the
Xiare all type functions. So a truly-generic pretty-printer for
HsExprwould have to take 100 pretty-printers as its arguments.
It'd be difficult to do layout that worked regardless of the way in which the type was instantiated.
Here's a way to get very small instance contexts, at the cost of writing a GHC-specific pretty-printer. (Leaving the challenge of a truly generic pretty printer for another day.)
data ThePass p where ThePassPs :: ThePass GhcPs ThePassRn :: ThePass GhcRn ThePassTc :: ThePass GhcTc class GhcPassC p where thePass :: ThePass p data HsExpr p = HsVar (XVar p) Int | ... type instance XVar (GhcPass p) = GhcXVar p type family GhcXVar p where GhcXVar GhcPs = RdrName GhcXVar GhcRn = Name GhcXVar GhcTc = Id instance (GhcPassC p => Outputable (HsExpr (GhcPass p)) where ppr (HsVar x i) = ppr i <+> case (thePass @p) of ThePassPs -> ppr x -- RdrName ThePassRn -> ppr x -- Name ThePassTc -> ppr x -- Id
Outputable instance takes
GhcPassC p as a context, which is just a simple data contructor. Then we dispatch
on that constructor when we want to pretty-print that piece.
This would be a huge step forward over the pretty printer GHC has had for ages, where the pass-specific structures are entirely un-printable. And it'd be simple and efficient.
I have no idea how this would affect
The performance issue of Plan A presumably comes from the fact that we have instance search running every time we are typechecking an instance with a large constraint set. That is:
type LargeConstraintSet p = ( Outputable (XIPBinds p) , Outputable (XViaStrategy p) , ... , ... , ... , ... -- and so on, for each extension field ) instance LargeConstraintSet p => Outputable (HsExpr p) instance LargeConstraintSet p => Outputable (HsPat p) instance LargeConstraintSet p => Outputable (HsType p) instance LargeConstraintSet p => Outputable (HsBinds p) instance (LargeConstraintSet idL, LargeConstraintSet idR, Outputable body) => Outputable (StmtLR idL idR body) ...
Every time we add something to the
LargeConstraintSet, typechecking each of these instances slows down.
Instead, we could create a single constraint,
OutputableX, and hide the large constraint set inside it:
class OutputableX p where withOutputableX :: (LargeConstraintSet p => r) -> r
instance OutputableX p => Outputable (HsExpr p) instance OutputableX p => Outputable (HsPat p) instance OutputableX p => Outputable (HsType p) instance OutputableX p => Outputable (HsBinds p) instance (OutputableX idL, OutputableX idR, Outputable body) => Outputable (StmtLR idL idR body)
becomes fast, as it only has one constraint:
OutputableX. Operationally, we have
LargeConstraintSet p stored inside
OutputableX, but for the typechecker, this is hidden information.
We will have only three instances of
OutputableX, defined as follows:
instance OutputableX (GhcPass 'Parsed) where withOutputableX r = r instance OutputableX (GhcPass 'Renamed) where withOutputableX r = r instance OutputableX (GhcPass 'Typechecked) where withOutputableX r = r
Therefore, we have to do instance resolution for
LargeConstraintSet only once per pass (to define the above instances).
Now, whenever a piece of code needs something from the large constraint set, it calls
- ppr (IPBinds ds bs) = pprDeeperList vcat (map ppr bs) + ppr (IPBinds ds bs) = withOutputableX @p $ + pprDeeperList vcat (map ppr bs)
withOutputableX @p, we have
LargeConstraintSet p in context.
Plan H relies on the assumption that calls to
withOutputableX will not lead to the same kind of performance degradation. This is an untested, but hopefully true assumption. In case it proves to be wrong, we can augment this plan by having separate methods for each constraint:
class OutputableX p where withOutputableXIPBinds :: (Outputable (XIPBinds p) => r) -> r withOutputableXViaStrategy :: (Outputable (XViaStrategy p) => r) -> r ... ... ... -- and so on, for each extension field
With Plan H*, the code will be asking only for the constraint that it actually needs:
- ppr (IPBinds ds bs) = pprDeeperList vcat (map ppr bs) + ppr (IPBinds ds bs) = withOutputableXIPBinds @p $ + pprDeeperList vcat (map ppr bs)
This means that we're not dealing with
LargeConstraintSet at all, we're working with individual parts of it.
PLAN C (Infeasible)
It is still painful generating three virtually identical chunks of traversal code. So suppose we went back to
deriving instance (...) => Data (OverLit (GhcPass p)) where…
We'd get unresolved constraints like
Data (XOverLit (GhcPass p)). Perhaps we
could resolve them like this
instance Data (XOverLIt (GhcPass p)) where gmapM f x = x ...etc...
That is: a no-op
Data instances. Traversals would not traverse extension fields.
That might not be so bad, for now!
PLAN D (Infeasible)
If there were cases when we really did want to look at those extension fields, we still could, by doing a runtime test, like this:
instance IsGhcPass p => Data (XOverLIt (GhcPass p)) where gmapM = case ghcPass @p of IsParsed -> gmapM IsRenamed -> gmapM IsTypechecked -> gmapM ...etc...
Here I'm positing a new class and GADT:
class IsGhcPass p where ghcPass :: IsGhcPassT p data IsGhcPassT p where IsParsed :: IsGhcPass Parsed IsRenamed :: IsGhcPass Renamed IsTypechecked :: IsGhcPass Typechecked instance IsGhcPass Parsed where ghcPass = IsParsed ...etc...
Now, instead of passing lots of functions, we just pass a single GADT value which we can dispatch on.
We can mix Plan C and D.
We are solving a compilation time problem for GHC stage1/stage2. The produced compiler has the same performance regardless of which derivation mechanism.
Ideally we would like to end up with
datainstances for an arbitrary
OverLit p, which is an outcome for Plan A. i.e. Plan A will play nice with external index types,for GHC API users.