Scrap Your Type Applications

I just ended an always-fun call with @sweirich with a nice little idea to try out: scrap some type applications.

Important Note: I have not re-read Barry Jay and @simonpj's Scrap Your Type Applications in preparation here. Just cheekily re-using their inspiring title.

The key observation is this: Suppose we have a data constructor K :: forall a b. ty1 -> ty2 -> ty3 -> T a b (no existentials, context, or equalities yet, though they can be accommodated, too, as we'll see). If you know that K is applied to all of its term-level arguments e1, e2, and e3 with known types, then, depending on the shape of ty1, ty2, and ty3, you can often know the type of the applied data constructor. Examples:

(,) e1 e2 :: Tuple2 typeof(e1) typeof(e2)
Just e :: Maybe typeof(e)
(:) e1 e2 :: List typeof(e1)
ReaderT f :: ReaderT r m a
  where (r -> m a) = typeof(f)

This does not always work, of course. Examples:

[] :: List ???
Nothing :: Maybe ???
Proxy :: Proxy ???
MkT e1 e2 :: T ??? typeof(e2)
  (where MkT :: F a -> b -> T a b and F is a non-injective type family)

Introduce a new constructor for Expr b, ConApp :: DataCon -> [Type] -> [Expr b] -> Expr b. This constructor is used to represent saturated data constructor applications.
For every data constructor, we do an analysis of its type variables: those that are scrappable (that is, can be deduced from the type of term-level arguments) and those that are mandatory. ("required" is taken) Scrappable type variables are ones that appear in an injective position in at least one term-level argument type.
The types supplied in a ConApp are only the mandatory ones.
The DataCon datatype will contain a new field dcGetTypeParameters :: [Type] -> [CoreExpr] -> [Type] that, given the mandatory type arguments and the expression arguments, figures out the scrappable type parameters. This field will be used in exprType.

This plan seems relatively simple, both to describe and to implement. The new ConApp form could, at first, be used in the bodies of data constructor wrappers, but we could look for other spots to introduce it -- perhaps in the simplifier. Note that (for now, at least) there would be no invariant saying all saturated data constructors are ConApps; there doesn't seem to be a concrete advantage to such a requirement. And this approach does not deal with intervening casts, which would remain as App chains.

For (4), we would have to do some matching. But, actually, the function to get the type parameters could always be some kind of extraction function that would e.g. get the r, m and a out from r -> m a. If we determine that using the pure matcher is too slow, we could always write some custom code to produce the dcGetTypeParameters function.

What do we think? Is it worth trying this as an experiment?

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information