Allow nested foralls and contexts in prefix GADT constructors
Currently, prefix GADT constructors are very particular about where forall
s and contexts are permitted in their types. In particular, the very beginning of a prefix GADT type can have an optional forall
, followed by an optional context, followed by the rest of the type. Any other forall
s or contexts that appear between argument types or the result type are rejected. For example, GHC does not permit this:
data T a where
MkT :: Int -> forall a. T a
This is rather surprising, even to users who aren't me. Fundamentally, there is no reason why this has to be the case. In fact, having nested forall
s is an implicit requirement an already accepted GHC proposal, A syntax for visible dependent quantification. Although !378 (merged) implemented most of this proposal, there is one part that it deliberately did not implement, namely:
A data constructor can use
forall ... ->
in its type (as given in GADT-syntax) or arguments, but any use of such a constructor in terms (as opposed to in a type) will be an error.
Without having the ability to use nested forall
s in prefix GADT constructors, however, this feature is of limited utility. This is because you would be able to write this:
data VDQ a where
MkVDQ :: forall a -> Int -> VDQ a
However, you would not be able to make a simple rearrangement of the arguments like this:
data VDQ' a where
MkVDQ' :: Int -> forall a -> VDQ a -- Nested forall! Rejected.
So why doesn't today's GHC permit nested forall
s/contexts in GADT constructors? Mainly because of engineering concerns. Various data structurs in the internals of GHC assume that all forall
'd type variables and constraint arguments all occur upfront. This assumption is baked into the design of ConDeclGADT
and DataCon
, among other things.
How should we go about relaxing this restriction? While I haven't worked out the details completely yet, at minimum the following will need to happen:
-
The representation of arguments in prefix
ConDeclGADT
s will need to change. Currently, thecon_args
field ofConDeclGADT
usesHsConDeclDetails
to represent the argument types of the constructor. This currently assumes that each argument is aBangType
, however. In order to make this accommodate nestedforall
s and contexts, however, we will likely need a richer data structure. Something like:data FunArg pass = AnonArg (HsScaled pass (LBangType pass)) | ForallArg (HsForAllTelescope pass) | ContextArg (LHsContext pass)
If we do this, then we can likely get rid of
con_forall
andcon_qvars
, which currently store the leadingforall
s and context. Well, except that isn't quite right—GADT constructor types obey theforall
-or-nothing rule, and as a result, leadingforall
s are special. Bah. I'll have to think carefully about how to do this right.Note that all of this only really applies to prefix GADT constructors. Record GADT constructors, on the other hand, can't really have nested
forall
s or contexts due to the syntax that they use. I suppose this means we might need a different AST form for record GADT constructors than we do for prefix GADT constructors. Hm. -
The way that user-written arguments are represented in
DataCon
s will need to change. Currently,DataCon
looks like this:data DataCon = MkData { ... , dcUnivTyVars :: [TyVar] , dcExTyCoVars :: [TyCoVar] , dcUserTyVarBinders :: [InvisTVBinder] , dcEqSpec :: [EqSpec] , dcOtherTheta :: ThetaType , dcOrigArgTys :: [Type] , ... }
For the most part, this corresponds exactly to how the data type worker looks. It must have the universally quantified type variables (
dcUnivTyVars
), followed by the existentially quantified type variables (dcExTyCoVars
), followed by the context arguments (dcEqSpec
+dcOtherTheta
), finally followed by the visible arguments (dcOrigArgTys
), in exactly that order.The exception to that rule is
dcUserTyVarBinders
, which remembers the order in which the user quantifies type variables in a GADT constructor. This information is not for the benefit of the worker, but for the wrapper. There is some amount of type variable swizzling performed inGHC.Types.Id.Make.mkDataConRep
to ensure that the user-written order indcUserTyVarBinders
is rearranged to the universals-then-existentials order that the worker expects.If we permit nested
forall
s and contexts, the first step is to figure out how to changeDataCon
to represent it. I imagine that we will need to replacedcUserTyVarBinders
with something more general:{ ... , dcUserTyCoBinders :: [TyCoBinder] , ... }
We will need
TyCoBinder
s because theforall
s and contexts can be intermixed with visible argument types, and moreover, these can happen in basically any order. What's interesting about this change isdcUserTyCoBinders
will store all of the arguments to aDataCon
, which will technically makedcUnivTyVars
,dcExTyCoVars
,dcEqSpec
,dcOtherTheta
, anddcOrigArgTys
redundant. Still, we might consider keeping some (or all) of these fields anyway, since it could be more efficient to cache them rather than repeatedly computing them from thedcUserTyCoBinders
. -
We will need to update Template Haskell's representation of prefix GADTs, which is insufficient to represent nested
forall
s and contexts:data Con = ... | ForallC [TyVarBndr] Cxt Con | GadtC [Name] [BangType] Type | ...
It is unlikely that we will be able to make this work without redesigning
Con
to some extend. @rae suggests the following design:data Dec = ... | DataD ... Cons ... | NewtypeD ... Cons ... -- Alternatively, ... (Either H98Con GadtCon) ... if we want to really -- enforce the invariant that a newtype has exactly one data constructor data Cons = H98Cons [H98Con] | GadtCons [GadtCon] data H98Con = H98Con (Maybe [TyVarBndr]) (Maybe Cxt) Name H98ConBody data H98ConBody = NormalC ... | InfixC ... | RecC ... data GadtCon = GadtCon [Name] GadtConBody data GadtConBody = GadtC [FunArg ()] Type | RecGadtC [FunArg Name] Type data FunArg name = FunArg name Multiplicity Bang Type | ForallArg [TyVarBndr] | CxtArg Cxt