Make DeriveFunctor-generated code require fewer beta reductions
Issue #17880 (closed) demonstrates that DeriveFunctor
-generated code is
surprisingly fragile when rank-n types are involved. The culprit is
that $fmap
(the algorithm used to generate fmap
implementations)
was too keen on applying arguments with rank-n types to lambdas,
which fail to typecheck more often than not.
In this patch, I change $fmap
(both the specification and the
implementation) to produce code that avoids creating as many lambdas,
avoiding problems when rank-n field types arise.
See the comments titled "Functor instances" in TcGenFunctor
for a
more detailed description. Not only does this fix #17880 (closed), but it also
ensures that the code that DeriveFunctor
generates will continue
to work after simplified subsumption is implemented (see #17775 (closed)).
What is truly amazing is that #17880 (closed) is actually a regression
(introduced in GHC 7.6.3) caused by commit
49ca2a37, the fix #7436 (closed). Prior to
that commit, the version of $fmap
that was used was almost
identical to the one used in this patch! Why did that commit change
$fmap
then? It was to avoid severe performance issues that would
arise for recursive fmap
implementations, such as in the example
below:
data List a = Nil | Cons a (List a) deriving Functor
-- ===>
instance Functor List where
fmap f Nil = Nil
fmap f (Cons x xs) = Cons (f x) (fmap (\y -> f y) xs)
The fact that \y -> f y
was eta expanded caused significant
performance overheads. Commit
49ca2a37 fixed this performance
issue, but it went too far. As a result, this patch partially
reverts 49ca2a37.
To ensure that the performance issues pre-#7436 do not resurface, I have taken some precautionary measures:
- I have added a special case to
$fmap
for situations where the last type variable in an application of some type occurs directly. If this special case fires, we avoid creating a lambda expression. This ensures that we generatefmap f (Cons x xs) = Cons (f x) (fmap f xs)
in the derivedFunctor List
instance above. For more details, seeNote [Avoid unnecessary eta expansion in derived fmap implementations]
inTcGenFunctor
. - I have added a
T7436b
test case to ensure that the performance of this derivedFunctor List
-style code does not regress.
When implementing this, I discovered that $replace
, the algorithm
which generates implementations of (<$)
, has a special case that is
very similar to the $fmap
special case described above. $replace
marked this special case with a custom Replacer
data type, which
was a bit overkill. In order to use the same machinery for both
Functor
methods, I ripped out Replacer
and instead implemented
a simple way to detect the special case. See the updated commentary
in Note [Deriving <$]
for more details.