Optimising newtypelike data types with existentials
Motivation
Consider this data type declaration
data T where
MkT :: !(Foo a) > T
So a
is an existentially bound variable, and we cannot use a newtype for T
. So, as things stand, a value of type T
will be represented by a heapallocated box containing a single pointer to a value of type (Foo ty)
for some ty
.
This is tantalising. The extra indirection and allocation gains nothing. Since MkT
is strict in its only argument, we could (at codegen time) represent a value of type T
by a value of type Foo ty
.
This page does not propose any change whatsoever to the language. Rather, it proposes that we guarantee an efficient unboxed representation for certain data types.
The ticket to track this idea is #1965.
See also this OCaml pull request, which does exactly the same kind of thing for OCaml.
Main design
Under what conditions can we guarantee to remove the box for a data type entirely?
 Only one constructor in the data type
 Only one field with nonzero width in that constructor (counting constraints as fields).
 That field is marked strict
 That field has a boxed (or polymorphic) type
Note that an equality constraint arising from a GADT has zero width, and thus is covered by (2). E.g.
data T a where
MkT :: !Int > T Int
The constructor actually has type
MkT :: forall a. (a ~# Int) => Int > T a
So we could represent a value of type (MkT a)
by a plain Int
, without an indirection, because the evidence for (a ~# Int)
is zero width.
You might think that a single constructor GADT is probably not much use, but see Example 2 below.
Some discussion
Superclasses
David F: I believe condition 2 can be relaxed very slightly, to allow constraints known to be zerowidth. For example, equality constraints should be fine. So should classes that have no methods and no superclasses with methods. For example, given
classThis a ~Int=>Foo a
classFoo a =>Bar a wheredataBarType a
classBar a =>Baz a where
prox ::Proxy# a
David F: I imagine that Foo a
, Bar a
, and Baz a
contexts are zerowidth.
SLPJ: no, class constraints are always boxed because they can be bottom (with recursive classes). I don't know how to avoid this.
David F: Does this mean even ~
constraints are boxed? If so, that monkeys with GADT constructors that have equality constraints involving type families.
dataFoo a b whereFoo::TF1 a ~TF2 b =>Foo a b
seems rather nicer than the alternative
dataFoo a b whereFoo::!(TF1 a :~:TF2 b)>Foo a b
Do such constraints just pile on extra junk?
Strictness
Unlike a true newtype
, pattern matching on the constructor must force the contents to maintain type safety. In particular, matching on the constructor reveals an existential and/or type information. As Dan Doel found, and pumpkin relayed in #1965:16, we have to be careful not to reveal such information without forcing the evidence. Since we're using the newtype optimization, the evidence is in the contained field itself.
SLPJ: I do not understand this paragraph. Remember, we propose no change to the source language semantics.
David F: I was simply clarifying that while this is the "newtype optimization", it is *not* about an actual newtype
. In both Haskell and Core it is data
, although Core could flag it as unusual if that's helpful.
Sample uses
Example 1
You might think that an existential data type with only one field is a bit unusual. Here is a example:
dataShape=EmptyNonEmptydataIntMap a = forall (e ::Shape).IntMap!(IMGadt e a)dataIMGadt(e ::Shape) a whereBin::Prefix>Mask>IMGadtNonEmpty a >IMGadtNonEmpty a >IMGadtNonEmpty a
Tip::Key> a >IMGadtNonEmpty a
Nil::IMGadtEmpty a
Here IntMap
obeys (1)(4) and so IntMap ty
could be represented (without indirection) by the underlying (IMGadt e ty)
value, thereby saving an indirection at the root of every IntMap
.
A more direct rendering would look like this
data IntMap a = Empty  NonEmpty (NE a)
data NE a = Bin Prefix Mask (NE a) (NE a)
 Tip Key a
No GADTs, no existentials. But we get an indirection at the root of every nonempty IntMap
.
Example 2
David Feuer: A singleconstructor GADT can add a payload to something like
Refl
; it could also be used with a strict typealigned sequence to "count", layering on length indexing. Admittedly not earthshattering, but not totally useless. SLPJ: I still don't get it. Could you give an example?
For instance,
dataTList c x y whereNil::TList c x x
Cons::!(c x y)>TList c y z >TList c x z
dataNat=ZSNatdataLengthIncrement c p q whereInc::!(c x y)>LengthIncrement c '(S n, x)'(n, y)
Now TList (LengthIncrement c) '(m, x) '(n, y)
represents a typealigned list taking a path from x
to y
, and having a length of m  n
. So while a singleconstructor GADT may not be much use on its own, it can do something interesting when combined with another, multiconstructor GADT!
Example 3
AntC: For anonymous records, you can wrap a payload in a newtype to label it within the record. But if you want to restrict (say) PersonId to being an Int, and yet have it look polymorphic, you have to make that a GADT. So now you're paying for the box.
For instance,
newtype Label1 a = Label1 a deriving (Eq, Read, Show)
instance GetLabel (Label1 a) a where getLabel (Label1 x) = x
{ can't do either of these:
newtype PersonId = PersonId Int  no type param
newtype PersonId a = PersonId Int  type param not used
}
data PersonId a where
PersonId :: !Int > PersonId Int
instance GetLabel (PersonId a) a where getLabel (PersonId x) = x
 using tuples as anon records:
instance (a ~ a', GetLabel (l a) a') => HasField l (l a, lb, lc) a' where
getField _ (x, _, _) = getLabel x
Of course there are many design choices for anon records and how to label their fields. (I'm trying not to prejudge that.) But they'll all need that sort of typeindexed lookup  including I think if the label is a typelevel string.
Those GetLabel instances are tedious. When we have fullbore ORF, we can declare every data type using record syntax, all with field name unLabel.
data PersonId a where
PersonId :: { unLabel :: !Int } > PersonId Int
... getField _ (x, _, _) = unLabel x
Layering evidence
dataFoo a b c =Foo1 a b Foo2 a b
dataBar f a b c whereBar::Family1 a b ~True=>!(f a b c)>Bar a b c
dataBaz f a b c whereBaz::Family2 b c ~True=>!(f a b c)>Baz a b c
newtypeQuux f a b c =Quux(Baz(BarFoo) a b c)
With the newtype optimization, I can layer the Baz
type information on top of the Bar
type information on top of the Foo
type without having to pay a penalty. The alternative today is to make an entirely separate GADT for each combination of type information I want to hold evidence of.
Implementation
Implementation is unfortunately tricky. Simply eliminating the boxing in Stg is easy, and this by itself saves us two words per value + pointer dereferencing. However, the generated code will be ugly, and if we could do this in Core instead of Stg, the simplifier would be able to do some followup optimizations and generate good code.
To be more specific, we want to do these transformations:
First:
D arg1 arg2 ... argN
==>
nv_arg (where nv_arg is the only nonvoid argument)
(but we somehow need to bind other args or do substitution. If we do this Stg though we don't need to bind those args as unarise doesn't care about what a void argument is as long as it's void it gets rid of it and it can check voidness by looking at Id's type)
Second:
case <exp1> of
D arg1 arg2 ... argN > <exp2>
==>
let arg1 = ...
arg2 = ...
arg3 = ...
in <exp2>
(we know only one of these args will be nonvoid, but all of them should be bound as they can be referred in <exp2>)
If we do this in Stg we lose some optimization opportunities and generate ugly code. For example, if the first transformation happens in a letbinding RHS maybe simplifier decides to inline it as it can't duplicate work after the transformation. Similarly it can decide to inline the nonvoid argument after second transformation which may lead to further optimizations etc.
For an example of an ugly code, suppose we had this:
case <exp1> of
D (T x) > <exp2>
in Stg this looks like
case <exp1> of
D v > case v of
T x > <exp2>
So now if we do the second transformation we get
let v = <exp1> in
case v of
T x > <exp2>
but ideally we'd get
case <exp1> of
T x > <exp2>
Simplifier would be able to do this after the second transformation.
So the problem is

If we implement this in Stg we generate ugly code, and miss some optimization opportunities (and arguably it doesn't buy us much, it saves 2 words per allocation + pointer dereferencing)

Implementing this in Core is very hard, if not impossible, without losing type safety.
(copied from the mailing list discussion)