Coercions are not expressions, and coercion variables are not Ids.
Right now, GHC Core doesn't obey the substitution lemma. Specifically, given a :: TYPE (TupleRep '[])
in scope, \ (x :: a). x
is a fine expression. However, if we substitute [a |-> Int ~# Int]
(which is well-kinded), then that expression will not pass Core Lint.
The problem is that we need to keep coercions (always of type ty1 ~#r ty2
for some types ty1
and ty2
and role r
) separate from expressions. This is sometimes done today by looking at types: things that have a coercion type are generally assumed to be coercions, and things without coercion types are not. The problem is that this notion doesn't obey the substitution lemma. A great example is mkLocalIdOrCoVar
, which either makes a regular variable (an Id
) or a CoVar
depending on the type.
Another example is in the definition of ~
, which looks kind of like this:
class (a ~# b) => (a ~ b)
instance (a ~# b) => (a ~ b)
I say "kind of like" because we cannot have an unlifted superclass. But it's a good model. That class definition would get compiled into this datatype definition:
data D~ :: forall k. k -> k -> Constraint where
MkD~ :: forall k (a :: k) (b :: k). { eq_sel :: a ~# b } -> D~ a b
Notice that this is a record with a record selector. (This is how all classes are compiled.) But now we have a problem, because eq_sel
actually has a definition:
eq_sel :: D~ a b -> a ~# b
eq_sel (MkD~ co) = co
This is a function that returns a coercion. But that's not possible in the theory: there is simply no evaluation allowed in coercions. Why do we have this at all? Because when the solver has a Given equality, it needs to get its evidence from somewhere. Specifically, if the solver has g1 :: a ~ b
as a Given, it will look at ~
's superclass and extract a Given a ~# b
. That Given is given a name, say co1
, and it needs a definition. This will be co1 = eq_sel g1
. This binding is tracked in the Bag EvBind
carried around in the solver. In the desugarer, it becomes let co1 = eq_sel g1
. Well -- almost. Because co1
's type is unlifted, we hit the let/app invariant and use a case
binding. This is good, because it means the eq_sel g1
will be sure to be evaluated. All will be OK.
Except in #17267 (closed), all is not OK: the binding for the coercion ends up in a recursive group, and then it's let-bound, unevaluated, and an agent of chaos. The fix for #17267 (closed) is about fixing the constraint solver not to do that, but it's frightening how far GHC got with a recursive binding for a coercion.
The bottom line here is that we must be better in keeping coercions and expressions separate. Specifically:
-
lintCoreExpr
aborts onType
. This is because all types should be arguments to functions; they are handled bylintCoreArgs
. The same should be true for coercions. -
CoVar
s should be born separately fromId
s. That is, removemkLocalIdOrCoVar
. -
Invent a new form of
EvTerm
that binds a coercion. This will not be alet
-binding. -
The desugarer must do SCC analysis to figure out how to desugar the evidence bindings. Currently, this analysis is done after desugaring. But now that we need to desugar equality bindings differently than other ones, it's better to do the SCC analysis first.
-
The desugarer should produce
FloatBind
s, notCoreBind
s as it does today. (FloatBind
allows forcase
bindings.) -
Knock-on effect:
DataCon
will have to be more scrupulous about keeping GADT equalities (which are unlifted) apart from other constraints. This means some work indataConTheta
anddataConSig
(the latter should just be removed, as it's essentially never used).
None of this is required, and I don't think we have a bug beyond #17267 (closed). But it's really a gaping hole in our type system, and it's a matter of good hygiene to sort it out.