Encode refutable shape information in `PmOracle`
Previously, we had an elaborate mechanism for selecting the warnings to
generate in the presence of different COMPLETE matching groups that,
albeit finely-tuned, produced wrong results from an end user's
perspective in some cases (#13363 (closed)).
The underlying issue is that at the point where the ConVar case has to
commit to a particular COMPLETE group, there's not enough information
to do so and the status quo was to just enumerate all possible complete
sets nondeterministically. The getResult function would then pick the
outcome according to metrics defined in accordance to the user's guide.
But crucially, it lacked knowledge about the order in which affected
clauses appear, leading to the surprising behavior in #13363 (closed).
In !1010 (closed) we taught the term oracle to reason about literal values a
variable can certainly not take on. This MR extends that idea to
ConLikes and thereby fixes #13363 (closed): Instead of committing to a
particular COMPLETE group in the ConVar case, we now split off the
matching constructor incrementally and record the newly covered case as
a refutable shape in the oracle. Whenever the set of refutable shapes
covers any COMPLETE set, the oracle recognises vacuosity of the
uncovered set.
This patch goes a step further: Since at this point the information
in value abstractions is merely a cut down representation of what the
oracle knows, value abstractions degenerate to a single Id, the
semantics of which is determined by the oracle state Delta.
Value vectors become lists of [Id] given meaning to by a single
Delta, value set abstractions (of which the uncovered set is an
instance) correspond to a union of Deltas which instantiate the
same [Id] (akin to models of formula).
Reviewer's guide
Best followed in the order points appear.
PmExpr
- In analogy to
AltCon, I introducedPmAltConto subsumePmExprConandPmExprLitofPmExprin a single case. -
PmExpris gone. We encode the remaining two cases (PmExprCon,PmExprVar) inPmOracle.TmCtinstead. - The equality check between literals/
ConLikeis actually undecidable in some cases (seeeqPmAltConsandNote [Undecidable Equality for PmAltCons]), and rather than just returningFalsein those cases (like whateqPmLitdid), we really want to make explicit when we are sure when things are different. So:Equal :: PmEqualitymeans definitely equal,Disjointmeans definitely different.PossiblyOverlapmeans unsure. - Correctly handling undecidable equality revealed a bug in the fix for #14546 (closed). I added
T14546das a regression test demonstrating how we lossily convert some literals to overloaded literals on HEAD for no reason. - Since
Check.translatePatnow works by desugaring to Core (see the later section inCheckfor why) and then recovering possible literals and constructor applications, we need a functionliteralToPmLitthat can turn CoreLiterals intoPmLits. RHS ofPmGrds are now plainCoreExprs, which we make sense of by feeding them to the oracle withaddVarCoreCt. - Now, we could do
type PmLit = Literal. But that doesn't work, becauseLiteralhas no notion of overloaded literals or String literals and the literals are all defined in terms of primitive types (Int#) instead of the user-facing types (Int). So: Define a clone ofLiteral, sitting between Source Haskell and Core and un-desugarLiteralsinliteralToPmLit.
PmOracle
- Due to various reasons (for example, deciding whether a COMPLETE set is overlapped needs type info in addition to which constructors were already matched), most operations on the term oracle also need access to the type oracle now. That's why I merged the type oracle and many operations related to void type from
CheckintoTmOracleand renamed it toPmOracle. - Central to this patch are the changes to the term oracle. Ergo, understanding the representation of its state
TmStateis a good starting point. It stores everything it knows about a variable ('value abstraction') in aVarInfo, so it's basically just a (shared)IdEnv VarInfo. The shared bit (seeSharedDIdEnv) comes from the fact that we want to represent i.e.xandyby the sameVarInfoif we get to know thatx ~ y. -
VarInfocarries two essential fields: One for positive information (i.e.,xisJust 5) and one for negative information (i.e.,xcan't beLeft, because we matched on it earlier). - Now is a good time to understand
Note [The Pos/Neg invariant], thenNote [TmState invariants]. This should be enough to grok the sectionTerm oracle unification procedureandaddRefutableAltCon - Appreciate the new definition of
addVarVarCtandaddVarConCt(formerlyunify), where all cases are exhaustive now. - Note that adding new type constraints (for example in a GADT match) makes it necessary to prove that every
VarInfois still inhabited by at least one constructor. That's what the call toensureAllPossibleMatchesInhabitedintyIsSatisfiableis about. - The 3 independent tests in
pmIsSatisfiable(ty oracle, term oracle and strict fields) are now split into their ownSatisfiabilityChecknewtype, which different call sites compose as they see fit. - The mechanic surrounding
vi_cacheis worth mentioning. Whenever we test whether a COMPLETE set is completely overlapped, we have to take the set difference withvi_neg. When we do this repeatedly, it becomes pretty expensive for large enums, like in #14667 (closed). So we cache the thinned out COMPLETE sets inVarInfo, which is always a superset of the possible matches. Hence we call the associated data structurePossibleMatches. The only invariant surrounding it is that it has no overlap withvi_neg. Note that this is purely a performance optimisation.
Check
- The first thing to note is that the
PmNLitcase ofPmPatis gone: We encode that knowledge in the oracle now. Also,PmPatis used for patterns exclusively now, since value abstractions are only variables. This meanstype ValVec = [Id]and also that these variables never change throughout the checking process, it's just the oracle stateDeltathat is gradually split and refined. So the uncovered set becomes a union ofDeltas,type Uncoverd = [Delta]. - Since we detect an overlapped COMPLETE set immediately when it is overlapped (by
vi_neg), we generate much better warnings and don't need to commit to a particular COMPLETE set non-deterministically when encountering the first constructor, like we used to do. Consequently, we don't needListTanymore andProvenanceof a COMPLETE set is irrelevant. - We found that negative information (i.e.
failed to match x where x is not one of {True}) is not so great from a UX perspective. In addition, it's hard to tell if aDeltais still satisfiable without searching for an inhabitant (which basically is a modeldeltawhere every variable reachable from a given value vector only has positive information attached to it). So we turn a delta into N positive inhabitants by callingPmOracle.provideEvidenceForEquation. - There are two performance hacks for coping with guards (See item 6 of
Note [Guards and Approximation]) and exponential behavior resulting from ConVar splitting on pattern synonyms (seeNote [Limit the number of refinements]) -
PmPpris for user-facing warning messages. Consequently, we can use theOutputableinstances for debugging purposes.