GADT constructors and the forall-or-nothing rule
Types that adhere to the forall
-or-nothing rule must explicitly bind all of their type variables if there is an outermost forall
. The converse is that if the type lacks an outermost forall
, then any type variables that are not explicitly bound become implicitly quantified. Here is an example of the converse in action:
f :: (forall a. a -> b -> a)
f x _ = x
Because of the outermost parentheses, the type signature for f
lacks an outer forall
. As a result, the b
is implicitly quantified. It's as if you had written forall b. (forall a. a -> b -> a)
.
GADT constructors allegedly adhere to the forall
-or-nothing rule, but they have strange behavior vis-à-vis types that begin with outermost parentheses. Consider this example:
data T where
MkT :: (forall a. a -> b -> T)
Should this be rejected? Because of the outermost parentheses, (forall a. a -> b -> T)
is effectively equivalent to forall b. (forall a. a -> b -> T)
. Moreover, since GADT constructors do not allow nested forall
s, it would be reasonable to expect this to be rejected with a "GADT constructor type signature cannot contain nested ‘forall’s or contexts
" error.
As it turns out, MkT
is rejected, but for all the wrong reasons. Here is the actual error that you get:
error: Not in scope: type variable ‘b’
|
6 | MkT :: (forall a. a -> b -> T)
| ^
For some reason, b
is not in scope, even though the forall
-or-nothing rule would dictate that b
be implicitly quantified. But that's not all. Here is another fishy example:
data S a where
MkS :: (forall a. S a)
Again, the forall
-or-nothing rule would suggest that (forall a. S a)
is effectively equivalent to forall. (forall a. S a)
, and because of the nested forall
s, GHC ought to reject this. But in reality, GHC accepts this program! (In fact, a user specifically requested that a program like this be accepted in #14320 (closed). As the person who authored a fix for that issue, I can only blame myself for my lack of foresight.)
The reason for both of these oddities is quite simple: GHC eagerly strips away parentheses when constructing GADT constructor types in the parser, in mkGadtDecl
:
mkGadtDecl names ty
= (ConDeclGADT { con_g_ext = noExtField
, con_names = names
, con_forall = L l $ isLHsForAllTy ty'
, con_qvars = mkHsQTvs tvs
, con_mb_cxt = mcxt
, con_args = args
, con_res_ty = res_ty
, con_doc = Nothing }
, anns1 ++ anns2)
where
(ty'@(L l _),anns1) = peel_parens ty []
(tvs, rho) = splitLHsForAllTyInvis ty'
(mcxt, tau, anns2) = split_rho rho []
split_rho (L _ (HsQualTy { hst_ctxt = cxt, hst_body = tau })) ann
= (Just cxt, tau, ann)
split_rho (L l (HsParTy _ ty)) ann
= split_rho ty (ann++mkParensApiAnn l)
split_rho tau ann
= (Nothing, tau, ann)
(args, res_ty) = split_tau tau
-- See Note [GADT abstract syntax] in GHC.Hs.Decls
split_tau (L _ (HsFunTy _ (L loc (HsRecTy _ rf)) res_ty))
= (RecCon (L loc rf), res_ty)
split_tau tau
= (PrefixCon [], tau)
peel_parens (L l (HsParTy _ ty)) ann = peel_parens ty
(ann++mkParensApiAnn l)
peel_parens ty ann = (ty, ann)
peel_parens
looks through parentheses that surround an outer forall
, and is therefore the main culprit. (split_rho
also looks through parentheses that surround the context, although this doesn't pertain to the `forall-or-nothing rule directly.)
What should we do here? One option is to keep GHC's current behavior. I'm not a huge fan of this idea, since this means we'd have to maintain something that deviates from the forall
-or-nothing rule in an unusual way. On the other hand, changing the current behavior would mean breaking the test case from #14320 (closed).
Another option is to make GADT constructor types more faithfully implement the forall
-or-nothing by removing the use of peel_parens
and replacing splitLHsForAllTyInvis
with a function that does not look through parentheses. If we do this, however, that raises some further questions, since this would be rejected for having nested forall
s:
data S a where
MkS :: (forall a. S a)
But this would not be rejected for having nested contexts:
data U a where
MkU :: (Show a => U a)
Which is a bit odd. We could further clamp down by changing mkGadtDecl
so that split_rho
does not look through parentheses either. Is this a step too far?
I originally noticed this when implementing #16762 (closed), but this could be fixed independently of that issue.