## Add saturated constructor applications to Core

A long-standing performance bug in GHC is its behaviour on nested tuples.

- The poster-child case is #5642.
- The problem is that when you have a nested tuple application, e.g.
`((4,True),('c','d'))`

, the size of the type arguments grows quadratically with the expression size. This bites badly in some approaches to generic programming, which use nested tuples heavily. - It's explained in detail in Section 2.3 of Scrap your type applications.

The same paper describes a solution, a modification of System F called System IF. It's neat, and I did once implement in in GHC. But it was quite complicated; most of the time the win was not big; and I don't know how it'd interact with casts, coercions, and type dependency in the latest version of GHC's Core.

So here's another idea, which came up in conversation at ICFP'16: **add a staturated constructor application to Core**. So Core looks like

```
data Expr v = Var v
| App (Expr v) (Expr v)
...
| ConApp DataCon [Expr v] -- NEW
```

Now I hate the idea of adding a new constructor to Core; I often brag about how few constructors it has. But the idea is this:

- A
`ConApp`

is always saturated; think of it as an uncurried application. - Every data constructor has a wrapper Id. For simple constructors like
`(:)`

, the wrapper is just a curried version:

`(:) = /\a. \(x:a). \(y:[a]). (:) [x,y]`

where the "

`(:) [x,y]`

" is just my concrete syntax for a`ConApp`

node.

- We are used to having an intro and elim form for each type former. So for
`(->)`

the intro form is`\x.e`

and the elim form is`(e1 e2)`

. For a data type like`Maybe`

, the elim form is`case`

, but what's the intro form?`ConApp`

of course. - A
`ConApp`

mentions the`DataCon`

explicitly, rather than having it buried inside the`IdDetails`

of an`Id`

, which somehow seems more honest.

The proximate reason for doing this in the first place is to allow us to omit type arguments. Consider `Just True`

. Curently we represent this as

` Var "Just" `App` Type boolTy `App` Var "True"`

But with `ConApp`

we can say

` ConApp "Just" [ConApp "True" []]`

because it's easy to figure out the `boolTy`

type argument from the argument. (We don't really have strings there, but you get the idea.)

Can we omit *all* the type arguments? No: we can omit only those that appear free in any of the argument types. That is usally all of them (including existentials) but not always. Consider:

```
data (,) a b where
(,):: forall a b. a -> b -> (a,b)
data [] a where
[] :: forall a. [a]
(:) :: forall a. a -> [a] -> [a]
data Either a b where
Left :: forall a b. a -> Either a b
Right :: forall a b. b -> Either a b
data (:=:) a b where
Refl :: forall a b. (a~b) -> :=: a b
data Foo where
MkFoo :: a -> (a -> Int) -> Foo
```

For all of these data constructors except `[]`

(nil), `Left`

and `Right`

we can omit all the type arguments, because we can recover them by simple matching against the types of the arguments. A very concrete way to think about this is how to implement

```
exprType :: Expr Id -> Type
exprType (Var v) = varType v
exprType (Lam b e) = mkFunTy (varType b) (exprType e)
exprType (App f a) = funResultTy (exprType f)
...
exprType (ConApp con args) = mkTyConApp (dataConTyCon con) ???
```

We know that the result type of type of a `ConApp`

will be `T t1 ..tn`

where `T`

is the parent type constructor of the `DataCon`

. But what about the (universal) type args `???`

? We can get them from the types of the arguments `map exprType args`

:

- For
`ConApp "(:)" [e1, e2]`

, the type argument is just`exprType e1`

. - For
`ConApp "(:=:)" [e]`

", we expect`exprType e`

to return a type looking like`TyConApp "~" [t1, t2]`

. Then`t1`

and`t2`

are the types we want. So matching is required. - What about an application of
`Left`

?? We need to recover two type args, but`exprType e1`

gives us only one. So we must retain the other one in the application:`ConApp "Left" [Type ty2, e1]`

. Similarly for the empty list.

A simple once-and-for-all analysis on the `DataCon`

will establish how to do the matching, which type args to retain, etc.

Tradeoffs:

- Pro: We can eliminate almost all type args of almost all data constructors; and for nested tuples we can eliminate all of them.
- Pro: it's elegant having the intro/elim duality.
- Pro: in GHC we often ask "is this expression a saturated constructor application?" (see
`exprIsConApp_maybe`

) and`ConApp`

makes it easier to answer that question. - Pro: we do exactly this in types: we have
`AppTy`

and`TyConApp`

. (In types a`TyConApp`

is not required to be saturated, but we could review that choice.) - Con: adding a constructor is a big deal. In lots of places we'd end up saying

```
f (App e1 e2) = App (f e1) (f e2)
f (ConApp c es) = ConApp c (map f es)
```

In the olden days GHC's

`App`

had multiple arguments and the continual need to work over the list was a bit tiresome. Still`ConApp`

is very simple and uniform; quite often adding`map`

won't be difficult; and it may well be that constructors need to be treated differently anyway.

- Con: it's not a general solution to the type-argument problem. See #9198 for example. System IF is the only general solution I know, but it seems like too big a sledgehammer.

We'll only know if we try it. I estimate that it would take less than a week to work this change through all of GHC. 90% of it is routine.

Other possibly-relevant tickets are #8523, #7428, #9630 (closed).