!2835: Make DeriveFunctor-generated code require fewer beta reductions · Merge requests · Glasgow Haskell Compiler / GHC

Ryan Scott requested to merge wip/T17880 into master Mar 06, 2020

Issue #17880 (closed) demonstrates that DeriveFunctor-generated code is surprisingly fragile when rank-n types are involved. The culprit is that $fmap (the algorithm used to generate fmap implementations) was too keen on applying arguments with rank-n types to lambdas, which fail to typecheck more often than not.

In this patch, I change $fmap (both the specification and the implementation) to produce code that avoids creating as many lambdas, avoiding problems when rank-n field types arise. See the comments titled "Functor instances" in TcGenFunctor for a more detailed description. Not only does this fix #17880 (closed), but it also ensures that the code that DeriveFunctor generates will continue to work after simplified subsumption is implemented (see #17775 (closed)).

What is truly amazing is that #17880 (closed) is actually a regression (introduced in GHC 7.6.3) caused by commit 49ca2a37, the fix #7436 (closed). Prior to that commit, the version of $fmap that was used was almost identical to the one used in this patch! Why did that commit change $fmap then? It was to avoid severe performance issues that would arise for recursive fmap implementations, such as in the example below:

data List a = Nil | Cons a (List a) deriving Functor

-- ===>

instance Functor List where
  fmap f Nil = Nil
  fmap f (Cons x xs) = Cons (f x) (fmap (\y -> f y) xs)

The fact that \y -> f y was eta expanded caused significant performance overheads. Commit 49ca2a37 fixed this performance issue, but it went too far. As a result, this patch partially reverts 49ca2a37.

To ensure that the performance issues pre-#7436 do not resurface, I have taken some precautionary measures:

I have added a special case to $fmap for situations where the last type variable in an application of some type occurs directly. If this special case fires, we avoid creating a lambda expression. This ensures that we generate fmap f (Cons x xs) = Cons (f x) (fmap f xs) in the derived Functor List instance above. For more details, see Note [Avoid unnecessary eta expansion in derived fmap implementations] in TcGenFunctor.
I have added a T7436b test case to ensure that the performance of this derived Functor List-style code does not regress.

When implementing this, I discovered that $replace, the algorithm which generates implementations of (<$), has a special case that is very similar to the $fmap special case described above. $replace marked this special case with a custom Replacer data type, which was a bit overkill. In order to use the same machinery for both Functor methods, I ripped out Replacer and instead implemented a simple way to detect the special case. See the updated commentary in Note [Deriving <$] for more details.

Edited Mar 10, 2020 by Ryan Scott

Make DeriveFunctor-generated code require fewer beta reductions

Merge request reports