|
|
## Motivation
|
|
|
|
|
|
Without orphans, the dependency graph suffers from linearization: the richer class hierarchy we have, the more we approach a total order where every package is either imported or imports every other. This is catastrophic for productivity.
|
|
|
Without orphans, the dependency graph suffers from linearization: the richer class hierarchy we have, the more we approach a total order where every package is either imported or imports every other. This is catastrophic for productivity.
|
|
|
|
|
|
The great thing about open source, and open source with Haskell in particular, is how much the product of our labor --- the code --- itself synchronizes the laborers --- us. The type systems allows information that humans could never communicate in a "game of telephone" to losslessly flow from package to package, all throughout the dependency tree. But rather than forcing all releases to be tightly coordinated, types and version constraint solving frees us to release libraries fairly independently.
|
|
|
|
... | ... | @@ -42,13 +42,14 @@ The "world semantics" from Chapter 1, section 4 of [1] clarify the situation imm |
|
|
|
|
|
The only quibble I might add is that the multi-param type class example of non-orphans gone wrong can be construed as a semi-orphan: because it only occurs when the type class parameters have instance heads that *wouldn't* pass the orphan checks were they the sole parameter. A stronger orphan check would have required a local-type-guarded argument for every parameter for a non-local type class, and that, while super restrictive, would solve the problem.
|
|
|
|
|
|
Why isn't this approache used in the implementation already? The challenge is that a naive implementation has rather bad asymptotic complexity, as every pair of instances across every pair of imports needs to be checked. Even with the more obvious optimization of only comparing modules that have not been previously compared, that's still O(n^2) edges of the complete bipartite graph.
|
|
|
Why isn't this approach used in the implementation already? The challenge is that a naive implementation has rather bad asymptotic complexity, as every pair of instances across every pair of imports needs to be checked. Even with the more obvious optimization of only comparing modules that have not been previously compared, that's still O(n^2) edges of the complete bipartite graph.
|
|
|
|
|
|
### Rust
|
|
|
|
|
|
In [2], Aaron Turon, one of the core rust designers (as in the ones with the PhDs signing off the theory) makes the connection between modal logic and type classes more specific. For a really quick summary:
|
|
|
|
|
|
- In both Rust and Haskell today, while the instances themselves are monotonic, the consistency checks aren't and cannot be.
|
|
|
|
|
|
- Rust prohibits orphans completely, unlike Haskell, but has much more complicated and subtle orphan checks, sometimes observing whether the upstream instances cover as much as they could, to compensate.
|
|
|
|
|
|
I am not advocating for the Rust current solution, but I do want to credit them for exploring this path. It's precisely the non-monotonicity of the consistency checks that opens the door to a vast and rich design space, and and as far as I know they recognized it first.
|
... | ... | @@ -58,12 +59,7 @@ More recently, the Rust folks have also started questioned prohibiting orphans t |
|
|
|
|
|
## Solution
|
|
|
|
|
|
### More flexible instance visibilty
|
|
|
|
|
|
This the first ingredient.
|
|
|
See the [Instance Visibility](./Instance-Visibility.md) page for further details.
|
|
|
|
|
|
### Meet module
|
|
|
#### Reconsidering the non-orphan condition with order theory
|
|
|
|
|
|
The proposed solution is to continue leaning into order theoretic-solutions, letting what's been the basis for the overall good properties of the ecosystem also solve this problem.
|
|
|
|
... | ... | @@ -74,27 +70,149 @@ Ignoring the minutiae of judging instance heads, the general idea of an instance |
|
|
|
|
|
This bares a strong resemblance to properties a property of a lattice meet:
|
|
|
|
|
|
> x ≤ y => (x ∧ y = x, y ∧ x = x)
|
|
|
> x ∧ y = x => x ≤ y
|
|
|
|
|
|
> x ∧ y = y => y ≤ x
|
|
|
|
|
|
Think of it this way:
|
|
|
|
|
|
- A and B are modules where types/classes in the instance head are defined, as described above.
|
|
|
- A ∧ B is the module the instance is in.
|
|
|
- A ≤ B means A transitively imports B, or A = B.
|
|
|
|
|
|
With these definitions, the cases of A ∧ B being either A or B cover exactly the non-orphan conditions we started with.
|
|
|
|
|
|
With the "Flexible Instances" extension, instance heads can be more than a single class applied to a single type.
|
|
|
This opens the door to instances heads involving items defined in more than 2 modules.
|
|
|
This is no problem.
|
|
|
While the binary ["biased"](https://ncatlab.org/nlab/show/biased+definition) version of "meet" is best known, the unbiased n-ary version is just as valid.
|
|
|
|
|
|
Now, the notion of meets applies to order elements that are incomparable --- indeed that is it's whole point! --- so the next question to ask is, what does such a meet look like in the Haskell context and what does it mean for coherence?
|
|
|
- A... are modules where types/classes in the instance head are defined
|
|
|
- ∧(A...) is the module the instance is in, by assumption (non-orphan) ≤(A...) is one of A...
|
|
|
- for all a in A..., ∧(A...) ≤ a
|
|
|
|
|
|
This even works for an empty A...:
|
|
|
if the class and types are all magic built-ins, then the instance better also be a magic built-in!
|
|
|
|
|
|
### Meet module
|
|
|
|
|
|
Now, the notion of meets applies to order elements that are incomparable --- indeed that is it's whole point! --- so the next question to ask is, what does such a meet (the general case) look like in the Haskell context and what does it mean for coherence?
|
|
|
|
|
|
The results are good: If A ∧ B is exists, and isn't A or B, it is a third module which:
|
|
|
|
|
|
- imports A and B
|
|
|
- is imported by ever other module also (transitively) importing A and B
|
|
|
- transitively imports A and B
|
|
|
- is itself transitively imported by ever other module that also (transitively) imports A and B
|
|
|
|
|
|
Since neither of A and B imports the other, we know only variable-head non-orphan "flexible" instances could overlap with our prospective instance in A ∧ B. That's a much more limited set of things to check for. Then, since every other module which could also write this instance imports A ∧ B, we can just use our existing per-instance check to ensure they don't!
|
|
|
|
|
|
### Module Meet canonicity?
|
|
|
This is especially interesting when considering non-binary meets.
|
|
|
Suppose we have 3 modules without orphans, A, B, and C, none of which transitively import any other
|
|
|
Assuming we have orphan instances using these in all possible combinations, what we end up with is:
|
|
|
|
|
|
```graphviz
|
|
|
digraph G {
|
|
|
|
|
|
A -> "∧(A, B)";
|
|
|
B -> "∧(A, B)";
|
|
|
B -> "∧(B, C)";
|
|
|
C -> "∧(B, C)";
|
|
|
A -> "∧(A, C)";
|
|
|
C -> "∧(A, C)";
|
|
|
|
|
|
"∧(A, B)" -> "∧(A, B, C)";
|
|
|
"∧(B, C)" -> "∧(A, B, C)";
|
|
|
"∧(A, C)" -> "∧(A, B, C)";
|
|
|
}
|
|
|
```
|
|
|
|
|
|
This is the free [semilattice](https://en.wikipedia.org/wiki/Semilattice) on 3 incomparable elements.
|
|
|
The graphviz above is for a [Hasse Diagram](https://en.wikipedia.org/wiki/Hasse_diagram), which is to say the arrows are not reflexi-transitive imports, but actual regular "direct" imports.
|
|
|
|
|
|
Free semilattices are also defined for posets that aren't just n incomparable elements.
|
|
|
These too represent the module graphs for the maximal orphan instances starting with modules that aren't completely independent (the general case).
|
|
|
|
|
|
To recap this section, the motion of a meet module finally gives us a powerful tool to indicate where an instance may safely reside.
|
|
|
And it works for both orphans and non-orphans alike.
|
|
|
The math is essentially telling us: don't think about orphans vs non-orphans, instead think about safe vs unsafe instances, where a "safe instance" is defined as an "instance that is defined in the meet module of the items in the instance head".
|
|
|
Safe instances are all of the non-orphans along with the "good" orphans.
|
|
|
And safe instances are enough for much more decoupling and modularity than today.
|
|
|
|
|
|
### Declaring versus inferring meet modules
|
|
|
|
|
|
But how do we know which module is A ∧ B? Checking the laws after the fact is:
|
|
|
|
|
|
- Non-local, and impossible even in the open-world context packages are written
|
|
|
- Inefficient even if we bound the check somehow
|
|
|
|
|
|
Instead, we can *pre-declare* meets.
|
|
|
Instead, we can declare meets *by fiat*.
|
|
|
This means, instead of just *inferring* a module is a meet once we have our entire build plan
|
|
|
(all the modules down to `Main`, and thus we have our our module partial order),
|
|
|
we *require* ahead of knowing the entire partial order that some module must be the meet.
|
|
|
As more downstream code is considered, extending the module graph and partial order, it must continue to abide by all declared-by-fiat meets.
|
|
|
|
|
|
This gets us back from closed world to open world thinking, as is essential for library development.
|
|
|
A module containing an instance must be the meet module of the modules defining items in the instance head.
|
|
|
We know this requirement by virtue of the instance existing, local reasoning alone.
|
|
|
It is thus very practical to require it up front; downstream code that would violate the meet property simply can't be used with this instance.
|
|
|
|
|
|
### Injecting imports
|
|
|
|
|
|
Taken literally, the above requirements mean some imports might need to be written that aren't otherwise needed.
|
|
|
For example, suppose we have these 3 modules:
|
|
|
|
|
|
```haskell
|
|
|
module A where
|
|
|
class Foo x
|
|
|
|
|
|
module B where
|
|
|
data Bar
|
|
|
|
|
|
-- | Declared to be A and B meet module
|
|
|
module AB where
|
|
|
instance Foo Bar
|
|
|
```
|
|
|
|
|
|
Another module, call it `C`, that imported `A` and `B` would *have* to import `AB`, or else the meet condition would be violated:
|
|
|
`C` would be "in competition" with `AB` into which one of them was *real* meet, since neither imported the other.
|
|
|
This is very weird if `C` has no need for `instance Foo Bar`!
|
|
|
|
|
|
> Taking some inspiration from the category theory generalization of meets, we could have `AB` reexport all of `A and `B`, and then `C` can import `AB` instead of `A` an `B`, rather than in addition to.
|
|
|
> But this is also clunky.
|
|
|
> Furthermore, it doesn't even work in general if `A` and `B` have overlapping names.
|
|
|
> For that, we would need `C.A` and `C.B` hypothetical hierarchy modules, a feature which does not exist today.
|
|
|
> And it would still be clunky.
|
|
|
|
|
|
Here is an alternative
|
|
|
Instead of forcing all these `import`s to be written down, we can just inject them:
|
|
|
Any module that imports `A` and `B`, and isn't itself `AB`, will get an implicit `AB` import.
|
|
|
This forces any "pretenders" to acknowledge `AB` as the one true (designated) meet module.
|
|
|
And for other modules it is a harmless extra import.
|
|
|
|
|
|
(These inferred imports can be instances only, `import AB ()` in today's syntax.
|
|
|
Other items don't matter, we only care about instances because we only care about coherence.)
|
|
|
|
|
|
Still, this isn't ideal.
|
|
|
It makes it too easy to accidentally use the `Foo Bar` instance when this was not intended.
|
|
|
And it opens a can of worms for Cabal etc., since we are talking implicit dependencies.
|
|
|
(Say there are multiple modules "in the wild" that could be used and declared the meet module for a given project.
|
|
|
And then code is depending on both the existence and choice of the meet module without writing any sort of `import`/`build-depends` down.
|
|
|
Chaos!)
|
|
|
|
|
|
The solution to this is a more subtle notion [Instance Visibility](./Instance-Visibility.md) than we have today.
|
|
|
\[I, @ericson2314, came up with that for this, but gave it its own page is it is independantly useful.\]
|
|
|
When we force imports to a meet, we don't want to give modules more instances to *use*.
|
|
|
We just want to make sure they don't define instances that would conflict with the instances in the meet module.
|
|
|
In the language of the [Instance Visibility](./Instance-Visibility.md) page, that means we don't want to force-import the *instances* just the *burdens* (resp. not privileges, just responsibilities).
|
|
|
|
|
|
This mean instead of injecting `import AB` or `import AB ()`, we inject an even more minimal import (no syntax yet exists) which gives the downstream nothing but the obligation to not write anything that overlaps with `AB`'s `Foo Bar` instance.
|
|
|
And here's the kicker, whereas the choice of instance for `AB` is not necessary cannonical, the coherence obligation is.
|
|
|
In other words, it doesn't matter what is inside `AB`'s instance.
|
|
|
This is very good if there are are multiple competing 3rd-party meet modules (3rd party in the sense of written without the blessing, acknowledgement, or even awareness of the authors of hte underlying modules).
|
|
|
We want code that doesn't depend on this 3rd party stuff to work with any such 3rd party stuff --- if you don't need the `Foo Bar` intance, and so, you shouldn't start using it by accident, and you shouldn't care which `Foo Bar` instance other code that needs it chooses.
|
|
|
|
|
|
### Module meet canonicity?
|
|
|
|
|
|
Now, this does sound rather compositional. Have we just transformed a type class consistency problem into some other consistency problem? Well yes, but at least we've boiled down the rather complex question of consistency checks into something much simpler.
|
|
|
Declaring meet modules by fiat does sound rather non-compositional. Have we just transformed a type class consistency problem into some other consistency problem? Well yes, but at least we've boiled down the rather complex question of consistency checks into something much simpler.
|
|
|
|
|
|
And indeed there is precedent for such an approach: rather than writing down complicated propositions about who provides what module, module interfaces, etc., we just use a black box version number.
|
|
|
|
... | ... | @@ -104,7 +222,7 @@ This is an apt comparison, because meet declarations should live in Cabal files. |
|
|
|
|
|
In practice, @Ericson2314 doesn't anticipate contention over who implements what. Rather than today, where packages begrudgingly and shamefully implement the orphans they need, I expect packages whose *sole purpose* is implementing orphans to be created, and people to collaborate on those packages rather than getting into packaging wars over who implements what instance.
|
|
|
|
|
|
### Incremental verification
|
|
|
### Incremental verification
|
|
|
|
|
|
What about checks? The more obvious ones are
|
|
|
|
... | ... | |