Core Lint error around pure unifier and coercions

This ticket is closely related to #20146, #20172

If I say

{-# LANGUAGE ExplicitForAll, PolyKinds, StandaloneKindSignatures,
             TypeFamilies, DataKinds #-}

module Bug where

import Data.Kind

type F :: Type -> Type
type family F k where
  F k = Type -> Type

type family G a where
  forall (k :: Type) (a :: F k) (b :: Type). G (a b, k) = Char

x :: G (Maybe Int, Bool)
x = 'y'

I get (with -dcore-lint)

*** Core Lint errors : in result of Desugar (before optimization) ***
Bug.hs:16:1: warning:
    The type variable @k_auc is out of scope
    In the RHS of x :: G (Maybe Int, Bool)
    Substitution: [TCvSubst
                     In scope: InScope {}
                     Type env: []
                     Co env: []]
*** Offending Program ***
Rec {
$trModule :: Module
[LclIdX]
$trModule = Module (TrNameS "main"#) (TrNameS "Bug"#)

x :: G (Maybe Int, Bool)
[LclIdX]
x = (C# 'y'#)
    `cast` (Sub (Sym (D:R:G[0]
                          <Bool>_N <(Maybe |> Sym (D:R:F[0] <k_auc>_N))>_N <Int>_N))
            :: Char ~R# G (Maybe Int, Bool))
end Rec }

*** End of Offense ***

The problem is the kco parameter in the pure unifier.

When we have target: ty1 |> co and template: ty2, GHC remembers co as the relationship between the kinds of ty1 and ty2, and then matches ty1 against ty2. The problem is that this "remembered coercion" (called kco in the code) sometimes comes from the template (as here) and sometimes from the target. It thus mixes variables from both the template and the target.

This is addressed in Note [Matching in the presence of casts (1)] in GHC.Core.Unify. However, that Note is subtly wrong, in the presence of a type like (a |> co) b, which is what we have here. Specifically, we have

template variables: k :: Type, a :: F k, b :: Type template: ((a |> axF[k]) b, k) target: (Maybe Int, Bool)

where axF :: [k]. F k ~ (Type -> Type). (The [k] notation says that k is a parameter of the axiom.)

The matcher goes left-to-right, and thus matches a |-> Maybe |> sym axF[k] before it matches k. The k thus ends up in the final substitution, free.

Possible solutions:

Require the kco to mention only variables from the target, never the template (as today). Then, whenever the matcher sees a coercion in the template (e.g. ty |> co), create a fresh variable cv and remember the coercion spotted in the template (e.g. co). At the end, apply the substitution that is the result of the (successful) match to the remembered coercions, and then build a substitution mapping the fresh cvs to the substituted remembered cos. This would work, but it would perhaps need to run this process several times. Essentially, this is all about propagating the information about k back to an earlier point in the match.
Require the template and the target to have distinct sets of free variables. Then, we don't have to worry about mixing them together. At the end of a successful match, just find the fixpoint of the substitution. (This fixpoint operation already happens for two-way unification -- not one-way matching -- which is done by the same body of code. This approach might thus simplify some of the mediating between matching and unification already present.) We could choose to skip the fixpoint calculation if no coercions were encountered in the template, the vastly common case.
Forbid type families in the kinds of template variables (much like type families are generally forbidden in templates). Since we have no free coercion variables in templates and no type families in them either, it would seem that all coercions must be reflexive, thus sidestepping the problem. This seems delicate though, and likely not to be true forever (example: unsafeCoerce in types).

Right now, (2) seems most appealing.

Edited Jul 27, 2021 by Simon Peyton Jones

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information