Skip to content
  • Ryan Scott's avatar
    Track the order of user-written tyvars in DataCon · ef26182e
    Ryan Scott authored
    After typechecking a data constructor's type signature, its type
    variables are partitioned into two distinct groups: the universally
    quantified type variables and the existentially quantified type
    variables. Then, when prompted for the type of the data constructor,
    GHC gives this:
    
    ```lang=haskell
    MkT :: forall <univs> <exis>. (...)
    ```
    
    For H98-style datatypes, this is a fine thing to do. But for GADTs,
    this can sometimes produce undesired results with respect to
    `TypeApplications`. For instance, consider this datatype:
    
    ```lang=haskell
    data T a where
      MkT :: forall b a. b -> T a
    ```
    
    Here, the user clearly intended to have `b` be available for visible
    type application before `a`. That is, the user would expect
    `MkT @Int @Char` to be of type `Int -> T Char`, //not//
    `Char -> T Int`. But alas, up until now that was not how GHC
    operated—regardless of the order in which the user actually wrote
    the tyvars, GHC would give `MkT` the type:
    
    ```lang=haskell
    MkT :: forall a b. b -> T a
    ```
    
    Since `a` is universal and `b` is existential. This makes predicting
    what order to use for `TypeApplications` quite annoying, as
    demonstrated in #11721 and #13848.
    
    This patch cures the problem by tracking more carefully the order in
    which a user writes type variables in data constructor type
    signatures, either explicitly (with a `forall`) or implicitly
    (without a `forall`, in which case the order is inferred). This is
    accomplished by adding a new field `dcUserTyVars` to `DataCon`, which
    is a subset of `dcUnivTyVars` and `dcExTyVars` that is permuted to
    the order in which the user wrote them. For more details, refer to
    `Note [DataCon user type variables]` in `DataCon.hs`.
    
    An interesting consequence of this design is that more data
    constructors require wrappers. This is because the workers always
    expect the first arguments to be the universal tyvars followed by the
    existential tyvars, so when the user writes the tyvars in a different
    order, a wrapper type is needed to swizzle the tyvars around to match
    the order that the worker expects. For more details, refer to
    `Note [Data con wrappers and GADT syntax]` in `MkId.hs`.
    
    Test Plan: ./validate
    
    Reviewers: austin, goldfire, bgamari, simonpj
    
    Reviewed By: goldfire, simonpj
    
    Subscribers: ezyang, goldfire, rwbarton, thomie
    
    GHC Trac Issues: #11721, #13848
    
    Differential Revision: https://phabricator.haskell.org/D3687
    ef26182e