GADT constructors and the forall-or-nothing rule

Types that adhere to the forall-or-nothing rule must explicitly bind all of their type variables if there is an outermost forall. The converse is that if the type lacks an outermost forall, then any type variables that are not explicitly bound become implicitly quantified. Here is an example of the converse in action:

f :: (forall a. a -> b -> a)                                                                       
f x _ = x

Because of the outermost parentheses, the type signature for f lacks an outer forall. As a result, the b is implicitly quantified. It's as if you had written forall b. (forall a. a -> b -> a).

GADT constructors allegedly adhere to the forall-or-nothing rule, but they have strange behavior vis-à-vis types that begin with outermost parentheses. Consider this example:

data T where
  MkT :: (forall a. a -> b -> T)

Should this be rejected? Because of the outermost parentheses, (forall a. a -> b -> T) is effectively equivalent to forall b. (forall a. a -> b -> T). Moreover, since GADT constructors do not allow nested foralls, it would be reasonable to expect this to be rejected with a "GADT constructor type signature cannot contain nested ‘forall’s or contexts" error.

As it turns out, MkT is rejected, but for all the wrong reasons. Here is the actual error that you get:

error: Not in scope: type variable ‘b’
  |
6 |   MkT :: (forall a. a -> b -> T)
  |                          ^

For some reason, b is not in scope, even though the forall-or-nothing rule would dictate that b be implicitly quantified. But that's not all. Here is another fishy example:

data S a where
  MkS :: (forall a. S a)

Again, the forall-or-nothing rule would suggest that (forall a. S a) is effectively equivalent to forall. (forall a. S a), and because of the nested foralls, GHC ought to reject this. But in reality, GHC accepts this program! (In fact, a user specifically requested that a program like this be accepted in #14320 (closed). As the person who authored a fix for that issue, I can only blame myself for my lack of foresight.)

The reason for both of these oddities is quite simple: GHC eagerly strips away parentheses when constructing GADT constructor types in the parser, in mkGadtDecl:

mkGadtDecl names ty
  = (ConDeclGADT { con_g_ext  = noExtField
                 , con_names  = names
                 , con_forall = L l $ isLHsForAllTy ty'
                 , con_qvars  = mkHsQTvs tvs
                 , con_mb_cxt = mcxt
                 , con_args   = args
                 , con_res_ty = res_ty
                 , con_doc    = Nothing }
    , anns1 ++ anns2)
  where
    (ty'@(L l _),anns1) = peel_parens ty []
    (tvs, rho) = splitLHsForAllTyInvis ty'
    (mcxt, tau, anns2) = split_rho rho []

    split_rho (L _ (HsQualTy { hst_ctxt = cxt, hst_body = tau })) ann
      = (Just cxt, tau, ann)
    split_rho (L l (HsParTy _ ty)) ann
      = split_rho ty (ann++mkParensApiAnn l)
    split_rho tau                  ann
      = (Nothing, tau, ann)

    (args, res_ty) = split_tau tau

    -- See Note [GADT abstract syntax] in GHC.Hs.Decls
    split_tau (L _ (HsFunTy _ (L loc (HsRecTy _ rf)) res_ty))
      = (RecCon (L loc rf), res_ty)
    split_tau tau
      = (PrefixCon [], tau)

    peel_parens (L l (HsParTy _ ty)) ann = peel_parens ty
                                                       (ann++mkParensApiAnn l)
    peel_parens ty                   ann = (ty, ann)

peel_parens looks through parentheses that surround an outer forall, and is therefore the main culprit. (split_rho also looks through parentheses that surround the context, although this doesn't pertain to the `forall-or-nothing rule directly.)

What should we do here? One option is to keep GHC's current behavior. I'm not a huge fan of this idea, since this means we'd have to maintain something that deviates from the forall-or-nothing rule in an unusual way. On the other hand, changing the current behavior would mean breaking the test case from #14320 (closed).

Another option is to make GADT constructor types more faithfully implement the forall-or-nothing by removing the use of peel_parens and replacing splitLHsForAllTyInvis with a function that does not look through parentheses. If we do this, however, that raises some further questions, since this would be rejected for having nested foralls:

data S a where
  MkS :: (forall a. S a)

But this would not be rejected for having nested contexts:

data U a where
  MkU :: (Show a => U a)

Which is a bit odd. We could further clamp down by changing mkGadtDecl so that split_rho does not look through parentheses either. Is this a step too far?

I originally noticed this when implementing #16762 (closed), but this could be fixed independently of that issue.

Edited May 19, 2020 by Ryan Scott

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information