Skip to content

Fix and simplify DeriveAnyClass's context inference using SubTypePredSpec

Ryan Scott requested to merge wip/T20719 into master

As explained in Note [Gathering and simplifying constraints for DeriveAnyClass] in GHC.Tc.Deriv.Infer, DeriveAnyClass infers instance contexts by emitting implication constraints. Previously, these implication constraints were constructed by hand. This is a terribly trick thing to get right, as it involves a delicate interplay of skolemisation, metavariable instantiation, and TcLevel bumping. Despite much effort, we discovered in #20719 (closed) that the implementation was subtly incorrect, leading to valid programs being rejected.

While we could scrutinize the code that manually constructs implication constraints and repair it, there is a better, less error-prone way to do things. After all, the heart of DeriveAnyClass is generating code which fills in each class method with defaults, e.g., foo = $gdm_foo. Typechecking this sort of code is tantamount to calling tcSubTypeSigma, as we much ensure that the type of $gdm_foo is a subtype of (i.e., more polymorphic than) the type of foo. As an added bonus, tcSubTypeSigma is a battle-tested function that handles skolemisation, metvariable instantiation, TcLevel bumping, and all other means of tricky bookkeeping correctly.

With this insight, the solution to the problems uncovered in #20719 (closed) is simple: use tcSubTypeSigma to check if $gdm_foo's type is a subtype of foo's type. As a side effect, tcSubTypeSigma will emit exactly the implication constraint that we were attempting to construct by hand previously. Moreover, it does so correctly, fixing #20719 (closed) as a consequence.

This patch implements the solution thusly:

  • The PredSpec data type (previously named PredOrigin) is now split into SimplePredSpec, which directly stores a PredType, and SubTypePredSpec, which stores the actual and expected types in a subtype check. SubTypePredSpec is only used for DeriveAnyClass; all other deriving strategies use SimplePredSpec.
  • Because tcSubTypeSigma manages the finer details of type variable instantiation and constraint solving under the hood, there is no longer any need to delicately split apart the method type signatures in inferConstraintsAnyclass. This greatly simplifies the implementation of inferConstraintsAnyclass and obviates the need to store skolems, metavariables, or given constraints in a ThetaSpec (previously named ThetaOrigin). As a bonus, this means that ThetaSpec now simply becomes a synonym for a list of PredSpecs, which is conceptually much simpler than it was before.
  • In simplifyDeriv, each SubTypePredSpec results in a call to tcSubTypeSigma. This is only performed for its side effect of emitting an implication constraint, which is fed to the rest of the constraint solving machinery in simplifyDeriv. I have updated Note [Gathering and simplifying constraints for DeriveAnyClass] to explain this in more detail.

To make the changes in simplifyDeriv more manageable, I also performed some auxiliary refactoring:

  • Previously, every iteration of simplifyDeriv was skolemising the type variables at the start, simplifying, and then performing a reverse substitution at the end to un-skolemise the type variables. This is not necessary, however, since we can just as well skolemise once at the beginning of the deriving pipeline and zonk the TcTyVars after simplifyDeriv is finished. This patch does just that, having been made possible by prior work in !7613 (closed). I have updated Note [Overlap and deriving] in GHC.Tc.Deriv.Infer to explain this, and I have also left comments on the relevant data structures (e.g., DerivEnv and DerivSpec) to explain when things might be TcTyVars or TyVars.
  • All of the aforementioned cleanup allowed me to remove an ad hoc deriving-related in checkImplicationInvariants, as all of the skolems in a tcSubTypeSigma–produced implication constraint should now be TcTyVar at the time the implication is created.
  • Since simplifyDeriv now needs a SkolemInfo and UserTypeCtxt, I have added ds_skol_info and ds_user_ctxt fields to DerivSpec to store these. Similarly, I have also added a denv_skol_info field to DerivEnv, which ultimately gets used to initialize the ds_skol_info in a DerivSpec.

Fixes #20719 (closed).

Edited by Ryan Scott

Merge request reports