jstolarek · 0ed9979f
--- a/injective-type-families.md
+++ b/injective-type-families.md
@@ -6,8 +6,7 @@ Implementation discussion and progress was recorded in
 [ Phab D202](https://phabricator.haskell.org/D202).
-As of September 2015 injective type families are merged into HEAD and will be
+Injective type families have been merged into HEAD in September 2015 and became available in GHC 8.0 with the `TypeFamilyDependencies` language extension.
-available in the next stable GHC release (8.0) with the `TypeFamilyDependencies` language extension.
 Person responsible for this page is Jan Stolarek (just so you now who is meant
@@ -299,16 +298,12 @@ the injectivity paper discusses this connection.
 ## Future plans and ideas
-### Type C injectivity
+### Type C injectivity (aka. generalized injectivity)
-Our plan for the nearest future is to implement injectivity of type C.  This
+We would like to implement generalized injectivity in GHC.  This would give
-will give injective type families expressive power identical to this of
+injective type families expressive power identical to this of functional
-functional dependencies.
+dependencies. Here are several properties of generalized injectivity annotations:
-If we decied to implement injectivity of type C checking injectivity annotation
-would become more complicated as we would have to check for things like:
 - `type family F a b = r | r -> a, r -> b`. This is technically correct but we
  could just say `result -> a b`.
@@ -326,6 +321,256 @@ would become more complicated as we would have to check for things like:
 - injectivity conditions don't overlap (eg. `result -> a b` overlaps
  `result -> a`). This probably deserves a warning.
+I started implementation work on generalized injectivity in September 2015 but
+didn't finish it. The relevant Trac ticket is [\#10832](https://gitlab.haskell.org//ghc/ghc/issues/10832). Partial implementation
+is available as [ Phab:D1287](https://phabricator.haskell.org/D1287) and on the wip/T10832-generalised-injectivity
+branch. In October/November 2016 I tried to rebase this branch on top of HEAD
+and resume work but the typechecker code has changed significantly and I was
+unable to make it work. It seems that the easiest way here is to salvage parser
+changes and the data type definitions in `HsDecls` and rewrite the rest of the
+patch from scratch.
+We also had some interesting email discussion. I am recording the most
+relevant emails here so they don't get lost and are available to anyone who
+decides to take up the work.
+Richard Eisenber, 16 July 2015:
+>
+> Here is an interesting case:
+>
+> type family F a b = r \| r a -\> b where
+>
+> >
+> > F Int  a = Maybe (F a Int)
+> > F Bool a = \[a\]
+> > F Char a = \[a\]
+>
+> This should be rejected, because of the following facts:
+>
+> >
+> > F Int Bool \~ Maybe \[Int\]
+> > F Int Char \~ Maybe \[Int\]
+>
+>
+> Thus, the first parameter and the result do not, in concert, uniquely determine the second parameter.
+>
+> Yet Simon's "just tuple them up" approach would label F as injective. The only interesting pairwise check is the first equation against itself. We run this:
+> >
+> > U( ( (Int, Maybe (F a Int)), (Int, Maybe (F b Int)) )
+> <table><tr><td>If we are to recur into F via equation (7), then we get a substitution \[a\> b\], and the pairwise check (erroneously) succeeds. Simon said that the arguments to the left of -\> need to be syntactically identical, which this case fails. But now consider this:
+> </td></tr></table>
+>
+> type family G a b = r \| r a -\> b where
+>
+> >
+> > G Int  (a,b) = (Maybe (G a b), a)
+> > G Bool a     = \[a\]
+> > G Char a     = \[a\]
+> >
+> > U( (Int, (Maybe (G a b), a)) , (Int, (Maybe (G c d), c)) )
+> <table><tr><td>Here, it is OK to recur via equation (7), because the presence of a and c outside of the recursive call gets them to unify. Indeed we must recur to accept G, because we need to look under G to get the mapping \[b\> d\]. (The fundep version of F is rejected and the fundep version of G is accepted.)
+> </td></tr></table>
+>
+> I believe this G exemplifies the order-dependency Jan was worried about. We somehow need to unify the second component of that tuple before looking under G.
+>
+> So the question is: when can equation (7) apply?
+>
+> I naturally tried to answer this myself, via comparison with functional dependencies. My guess was that our check is a little more liberal than the fundep check, in that we accept benign overlap between equations. When I pushed this on the fundep side, I discovered that fundeps as implemented in 7.10.1 are broken! Here is the test case:
+```wiki
+{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies,
+            FlexibleInstances, UndecidableInstances #-}
+module FD where
+class H a b r | a b -> r, r a -> b
+instance H Bool Double Bool
+class G a b c r | a b c -> r, r a -> b
+instance H a b r => G Int (a,b) ()   (Maybe r, a)
+instance            G Int (a,a) Char (Maybe a, a)
+foo :: G a b c r => (a,b,c,r)
+foo = undefined
+-- (1): instance G Int (Bool, Bool)   Char (Maybe Bool, Bool)
+x = foo :: (Int, (Bool, Bool), Char, (Maybe Bool, Bool))
+-- (2): instance G Int (Bool, Double) ()   (Maybe Bool, Bool)
+y = foo :: (Int, (Bool, Double), (), (Maybe Bool, Bool))
+```
+>
+> This compiles in 7.10.1. Yet we can see that GHC finds instances (1) and (2) marked above, even though these violate the second functional dependency on G: r and a do \*not\* uniquely determine b. (The c parameter is there simply to avoid overlapping instances.)
+>
+> So perhaps the story \*is\* rather more complicated than we thought, and fundeps' simplicity in this matter is bogus.
+>
+> And, quite separately, I'm also worried about the bare-type-variable-on-the-right problem. But let's solve this before tackling that.
+Simon Peyton Jones, 23rd July 2015:
+>
+> I'm being slow here, but I have learned some things.
+1. As you'll see from "Understanding FDs using CHRs", we have to check 
+  TWO things for FDs:
+> >
+> > class C a b \| a -\> b
+- Coverage: for each instance, individually, check coverage
+> >
+> > instance Q =\> C s t
+>
+>
+> The naïve version just checks that fvs(s) is a superset of fvs(t).
+> The "liberal" coverage condition takes account of Q.
+- Consistency: compare all instances pairwise to check that they are consistent
+> >
+> > instance Q1 =\> C s1 t1
+> > instance Q2 =\> C s2 t2
+>
+>
+> if S = unify(s1,s2), then check that S(t1) = S(t2)
+> Notice that this does not take account of Q1, S2 at all.
+1. In our paper we have ONE check.  But I believe that 
+  - consistency = comparing equations i,j, where i is \*distinct\* from j
+  - coverage    = comparing equation I with itself
+  So that makes our approach rather nicely uniform.
+1. GHC has an outright bug in its handling of consistency,
+  as you both point out.  But it's a bug that seems to be exploited!
+  See Trac [\#10675](https://gitlab.haskell.org//ghc/ghc/issues/10675).
+1. For \*consistency\*, our approach is a little more liberal than FDs,
+  because the FD stuff takes no account of Q1, Q2.  Example
+  (in type-family form)
+> >
+> > F (a,Int)  = (\[G a\],    Maybe Int)
+> > F (Bool,b) = (\[G Bool\], Maybe b)
+>
+>
+> where G is injective, and F is claimed to be.
+> We find that F is injective because, using the injectivity
+> of G, we unify a:=Bool.
+> >
+> > But in FD form
+> >
+> > >
+> > > class F a r \| r -\> a
+> > > instance G a r    =\> F (a,Int) (\[r\], Maybe Int)
+> > > instance G Bool r =\> F (Bool,b) (\[r\], Maybe b)
+> >
+> > These two will be rejected by the FD world because of
+> > ignoring Q1, Q2.
+> >
+> > That said, I don't think this additional expressiveness
+> > is very useful. The BIG thing about our ability to exploit
+> > injectivity comes when comparing an equation with itself i=j.
+1. For \*coverage\*, the "liberal" coverage condition behaves
+  very like our "exploiting injectivity in algorithm U" idea.
+> >
+> > However if we extend to injectivity constraints like
+> >
+> > >
+> > > a r -\> b
+> >
+> >
+> > then, as you both point out, we'll need some kind of
+> > iterative scheme.  Suppose we have
+> >
+> > >
+> > > type family F a b = r \| r a -\> b
+> >
+> >
+> > Then Rule (7) would become something like
+> >
+> > >
+> > > U(F s1 s2, F t1 t2) T = U(s2,t2) T,   if T(s1)=T(t1)
+> >
+> >
+> > and the side condition (syntactic equality) might change 
+> > as unification progresses.
+> >
+> > What happens for coverage for fudeps?  We don't use unification;
+> > we use free variables, as in Defn 7 of the paper.  But crucially
+> > we use an iterative algorithm that iterates to a fixpoint;
+> > see [FunDeps](fun-deps).oclose.
+> >
+> > We would have to do the same, in some form.
+>
+> Questions:
+- FDs uses different technology for the coverage (i/i) and
+  consistency (i/j) tests.  We could do so too; after all,
+  in the i/I case we know that the type structure matches!
+- Do we really care about the extra expressiveness of rule (7)
+  for the (i/j) case?
+Simon Peyton Jones, 24rd July 2015:
+>
+> Jan says (and I can see) that we ALREADY use different 
+> mechanisms for i/j and i/i.
+>
+> So I propose that:
+- We elaborate the i/i (coverage) test to do the iterative 
+  'oclose' thing.  This already does not use algorithm U at all.
+- We simplify algorithm U by dropping Rule (7) altogether.
+  - U is only used for i/j tests
+  - it's hard to craft an example where Rule (7) matters
+  - even such examples that exist are terribly fragile
+    (e.g. in my point (4) if I replace (G Bool) by its
+    values, say Char, then the test will fail)
+Richard Eisenber, 26 July 2015:
+>
+> Simon has come to the same conclusions I have while pondering this all: we differ from FDs in that we cleverly use the same pairwise check between two different equations as we do when comparing an equation against itself. But this is perhaps too clever, and perhaps explains why the proof has foundered around this very issue (that is, equation 7).
+>
+> So, I agree with Simon's proposal: explain all of this in terms of two separate checks. One check does a coverage check (very like the FD coverage check) for each equation, and another pairwise check is used only when i /= j. The Algorithm U stuff stays, but \*without equation 7\*. The proof would have to be tweaked, but I think this will give us a full proof of soundness. (Yay!) And it will all be easier to extend to generalized injectivity annotations.
 ### Inferring injectivity
 [Here](https://gitlab.haskell.org//ghc/ghc/issues/6018) it was suggested by Simon that we could infer