# Encode refutable shape information in `PmOracle`

Previously, we had an elaborate mechanism for selecting the warnings to
generate in the presence of different `COMPLETE`

matching groups that,
albeit finely-tuned, produced wrong results from an end user's
perspective in some cases (#13363 (closed)).

The underlying issue is that at the point where the `ConVar`

case has to
commit to a particular `COMPLETE`

group, there's not enough information
to do so and the status quo was to just enumerate all possible complete
sets nondeterministically. The `getResult`

function would then pick the
outcome according to metrics defined in accordance to the user's guide.
But crucially, it lacked knowledge about the order in which affected
clauses appear, leading to the surprising behavior in #13363 (closed).

In !1010 (closed) we taught the term oracle to reason about literal values a
variable can certainly not take on. This MR extends that idea to
`ConLike`

s and thereby fixes #13363 (closed): Instead of committing to a
particular `COMPLETE`

group in the `ConVar`

case, we now split off the
matching constructor incrementally and record the newly covered case as
a refutable shape in the oracle. Whenever the set of refutable shapes
covers any `COMPLETE`

set, the oracle recognises vacuosity of the
uncovered set.

This patch goes a step further: Since at this point the information
in value abstractions is merely a cut down representation of what the
oracle knows, value abstractions degenerate to a single `Id`

, the
semantics of which is determined by the oracle state `Delta`

.
Value vectors become lists of `[Id]`

given meaning to by a single
`Delta`

, value set abstractions (of which the uncovered set is an
instance) correspond to a union of `Delta`

s which instantiate the
same `[Id]`

(akin to models of formula).

# Reviewer's guide

Best followed in the order points appear.

##
`PmExpr`

- In analogy to
`AltCon`

, I introduced`PmAltCon`

to subsume`PmExprCon`

and`PmExprLit`

of`PmExpr`

in a single case. -
`PmExpr`

is gone. We encode the remaining two cases (`PmExprCon`

,`PmExprVar`

) in`PmOracle.TmCt`

instead. - The equality check between literals/
`ConLike`

is actually undecidable in some cases (see`eqPmAltCons`

and`Note [Undecidable Equality for PmAltCons]`

), and rather than just returning`False`

in those cases (like what`eqPmLit`

did), we really want to make explicit when we are*sure*when things are different. So:`Equal :: PmEquality`

means definitely equal,`Disjoint`

means definitely different.`PossiblyOverlap`

means unsure. - Correctly handling undecidable equality revealed a bug in the fix for #14546 (closed). I added
`T14546d`

as a regression test demonstrating how we lossily convert some literals to overloaded literals on HEAD for no reason. - Since
`Check.translatePat`

now works by desugaring to Core (see the later section in`Check`

for why) and then recovering possible literals and constructor applications, we need a function`literalToPmLit`

that can turn Core`Literal`

s into`PmLit`

s. RHS of`PmGrd`

s are now plain`CoreExpr`

s, which we make sense of by feeding them to the oracle with`addVarCoreCt`

. - Now, we could do
`type PmLit = Literal`

. But that doesn't work, because`Literal`

has no notion of overloaded literals or String literals and the literals are all defined in terms of primitive types (`Int#`

) instead of the user-facing types (`Int`

). So: Define a clone of`Literal`

, sitting between Source Haskell and Core and un-desugar`Literals`

in`literalToPmLit`

.

##
`PmOracle`

- Due to various reasons (for example, deciding whether a COMPLETE set is overlapped needs type info in addition to which constructors were already matched), most operations on the term oracle also need access to the type oracle now. That's why I merged the type oracle and many operations related to void type from
`Check`

into`TmOracle`

and renamed it to`PmOracle`

. - Central to this patch are the changes to the term oracle. Ergo, understanding the representation of its state
`TmState`

is a good starting point. It stores everything it knows about a variable ('value abstraction') in a`VarInfo`

, so it's basically just a (shared)`IdEnv VarInfo`

. The shared bit (see`SharedDIdEnv`

) comes from the fact that we want to represent i.e.`x`

and`y`

by the same`VarInfo`

if we get to know that`x ~ y`

. -
`VarInfo`

carries two essential fields: One for positive information (i.e.,`x`

is`Just 5`

) and one for negative information (i.e.,`x`

can't be`Left`

, because we matched on it earlier). - Now is a good time to understand
`Note [The Pos/Neg invariant]`

, then`Note [TmState invariants]`

. This should be enough to grok the section`Term oracle unification procedure`

and`addRefutableAltCon`

- Appreciate the new definition of
`addVarVarCt`

and`addVarConCt`

(formerly`unify`

), where all cases are exhaustive now. - Note that adding new type constraints (for example in a GADT match) makes it necessary to prove that every
`VarInfo`

is still inhabited by at least one constructor. That's what the call to`ensureAllPossibleMatchesInhabited`

in`tyIsSatisfiable`

is about. - The 3 independent tests in
`pmIsSatisfiable`

(ty oracle, term oracle and strict fields) are now split into their own`SatisfiabilityCheck`

newtype, which different call sites compose as they see fit. - The mechanic surrounding
`vi_cache`

is worth mentioning. Whenever we test whether a COMPLETE set is completely overlapped, we have to take the set difference with`vi_neg`

. When we do this repeatedly, it becomes pretty expensive for large enums, like in #14667 (closed). So we cache the thinned out COMPLETE sets in`VarInfo`

, which is always a superset of the possible matches. Hence we call the associated data structure`PossibleMatches`

. The only invariant surrounding it is that it has no overlap with`vi_neg`

. Note that this is purely a performance optimisation.

##
`Check`

- The first thing to note is that the
`PmNLit`

case of`PmPat`

is gone: We encode that knowledge in the oracle now. Also,`PmPat`

is used for patterns exclusively now, since value abstractions are only variables. This means`type ValVec = [Id]`

and also that these variables*never change*throughout the checking process, it's just the oracle state`Delta`

that is gradually split and refined. So the uncovered set becomes a union of`Delta`

s,`type Uncoverd = [Delta]`

. - Since we detect an overlapped COMPLETE set immediately when it is overlapped (by
`vi_neg`

), we generate much better warnings and don't need to commit to a particular COMPLETE set non-deterministically when encountering the first constructor, like we used to do. Consequently, we don't need`ListT`

anymore and`Provenance`

of a COMPLETE set is irrelevant. - We found that negative information (i.e.
`failed to match x where x is not one of {True}`

) is not so great from a UX perspective. In addition, it's hard to tell if a`Delta`

is still satisfiable without searching for an inhabitant (which basically is a model`delta`

where every variable reachable from a given value vector only has positive information attached to it). So we turn a delta into N positive inhabitants by calling`PmOracle.provideEvidenceForEquation`

. - There are two performance hacks for coping with guards (See item 6 of
`Note [Guards and Approximation]`

) and exponential behavior resulting from ConVar splitting on pattern synonyms (see`Note [Limit the number of refinements]`

) -
`PmPpr`

is for user-facing warning messages. Consequently, we can use the`Outputable`

instances for debugging purposes.