Scrap Your Type Applications
I just ended an always-fun call with @sweirich with a nice little idea to try out: scrap some type applications.
Important Note: I have not re-read Barry Jay and @simonpj's Scrap Your Type Applications in preparation here. Just cheekily re-using their inspiring title.
The key observation is this: Suppose we have a data constructor K :: forall a b. ty1 -> ty2 -> ty3 -> T a b
(no existentials, context, or equalities yet, though they can be accommodated, too, as we'll see). If you know that K
is applied to all of its term-level arguments e1
, e2
, and e3
with known types, then, depending on the shape of ty1
, ty2
, and ty3
, you can often know the type of the applied data constructor. Examples:
(,) e1 e2 :: Tuple2 typeof(e1) typeof(e2)
Just e :: Maybe typeof(e)
(:) e1 e2 :: List typeof(e1)
ReaderT f :: ReaderT r m a
where (r -> m a) = typeof(f)
This does not always work, of course. Examples:
[] :: List ???
Nothing :: Maybe ???
Proxy :: Proxy ???
MkT e1 e2 :: T ??? typeof(e2)
(where MkT :: F a -> b -> T a b and F is a non-injective type family)
Powered by this observation, I propose the following:
- Introduce a new constructor for
Expr b
,ConApp :: DataCon -> [Type] -> [Expr b] -> Expr b
. This constructor is used to represent saturated data constructor applications. - For every data constructor, we do an analysis of its type variables: those that are scrappable (that is, can be deduced from the type of term-level arguments) and those that are mandatory. ("required" is taken) Scrappable type variables are ones that appear in an injective position in at least one term-level argument type.
- The types supplied in a
ConApp
are only the mandatory ones. - The
DataCon
datatype will contain a new fielddcGetTypeParameters :: [Type] -> [CoreExpr] -> [Type]
that, given the mandatory type arguments and the expression arguments, figures out the scrappable type parameters. This field will be used inexprType
.
This plan seems relatively simple, both to describe and to implement. The new ConApp
form could, at first, be used in the bodies of data constructor wrappers, but we could look for other spots to introduce it -- perhaps in the simplifier. Note that (for now, at least) there would be no invariant saying all saturated data constructors are ConApp
s; there doesn't seem to be a concrete advantage to such a requirement. And this approach does not deal with intervening casts, which would remain as App
chains.
For (4), we would have to do some matching. But, actually, the function to get the type parameters could always be some kind of extraction function that would e.g. get the r
, m
and a
out from r -> m a
. If we determine that using the pure matcher is too slow, we could always write some custom code to produce the dcGetTypeParameters
function.
What do we think? Is it worth trying this as an experiment?