Skip to content
  • Ryan Scott's avatar
    cb93a1a4
    Make DeriveFunctor-generated code require fewer beta reductions · cb93a1a4
    Ryan Scott authored and Marge Bot's avatar Marge Bot committed
    Issue #17880 demonstrates that `DeriveFunctor`-generated code is
    surprisingly fragile when rank-_n_ types are involved. The culprit is
    that `$fmap` (the algorithm used to generate `fmap` implementations)
    was too keen on applying arguments with rank-_n_ types to lambdas,
    which fail to typecheck more often than not.
    
    In this patch, I change `$fmap` (both the specification and the
    implementation) to produce code that avoids creating as many lambdas,
    avoiding problems when rank-_n_ field types arise.
    See the comments titled "Functor instances" in `TcGenFunctor` for a
    more detailed description. Not only does this fix #17880, but it also
    ensures that the code that `DeriveFunctor` generates will continue
    to work after simplified subsumption is implemented (see #17775).
    
    What is truly amazing is that #17880 is actually a regression
    (introduced in GHC 7.6.3) caused by commit
    49ca2a37, the fix #7436. Prior to
    that commit, the version of `$fmap` that was used was almost
    identical to the one used in this patch! Why did that commit change
    `$fmap` then? It was to avoid severe performance issues that would
    arise for recursive `fmap` implementations, such as in the example
    below:
    
    ```hs
    data List a = Nil | Cons a (List a) deriving Functor
    
    -- ===>
    
    instance Functor List where
      fmap f Nil = Nil
      fmap f (Cons x xs) = Cons (f x) (fmap (\y -> f y) xs)
    ```
    
    The fact that `\y -> f y` was eta expanded caused significant
    performance overheads. Commit
    49ca2a37 fixed this performance
    issue, but it went too far. As a result, this patch partially
    reverts 49ca2a37.
    
    To ensure that the performance issues pre-#7436 do not resurface,
    I have taken some precautionary measures:
    
    * I have added a special case to `$fmap` for situations where the
      last type variable in an application of some type occurs directly.
      If this special case fires, we avoid creating a lambda expression.
      This ensures that we generate
      `fmap f (Cons x xs) = Cons (f x) (fmap f xs)` in the derived
      `Functor List` instance above. For more details, see
      `Note [Avoid unnecessary eta expansion in derived fmap implementations]`
      in `TcGenFunctor`.
    * I have added a `T7436b` test case to ensure that the performance
      of this derived `Functor List`-style code does not regress.
    
    When implementing this, I discovered that `$replace`, the algorithm
    which generates implementations of `(<$)`, has a special case that is
    very similar to the `$fmap` special case described above. `$replace`
    marked this special case with a custom `Replacer` data type, which
    was a bit overkill. In order to use the same machinery for both
    `Functor` methods, I ripped out `Replacer` and instead implemented
    a simple way to detect the special case. See the updated commentary
    in `Note [Deriving <$]` for more details.
    cb93a1a4
    Make DeriveFunctor-generated code require fewer beta reductions
    Ryan Scott authored and Marge Bot's avatar Marge Bot committed
    Issue #17880 demonstrates that `DeriveFunctor`-generated code is
    surprisingly fragile when rank-_n_ types are involved. The culprit is
    that `$fmap` (the algorithm used to generate `fmap` implementations)
    was too keen on applying arguments with rank-_n_ types to lambdas,
    which fail to typecheck more often than not.
    
    In this patch, I change `$fmap` (both the specification and the
    implementation) to produce code that avoids creating as many lambdas,
    avoiding problems when rank-_n_ field types arise.
    See the comments titled "Functor instances" in `TcGenFunctor` for a
    more detailed description. Not only does this fix #17880, but it also
    ensures that the code that `DeriveFunctor` generates will continue
    to work after simplified subsumption is implemented (see #17775).
    
    What is truly amazing is that #17880 is actually a regression
    (introduced in GHC 7.6.3) caused by commit
    49ca2a37, the fix #7436. Prior to
    that commit, the version of `$fmap` that was used was almost
    identical to the one used in this patch! Why did that commit change
    `$fmap` then? It was to avoid severe performance issues that would
    arise for recursive `fmap` implementations, such as in the example
    below:
    
    ```hs
    data List a = Nil | Cons a (List a) deriving Functor
    
    -- ===>
    
    instance Functor List where
      fmap f Nil = Nil
      fmap f (Cons x xs) = Cons (f x) (fmap (\y -> f y) xs)
    ```
    
    The fact that `\y -> f y` was eta expanded caused significant
    performance overheads. Commit
    49ca2a37 fixed this performance
    issue, but it went too far. As a result, this patch partially
    reverts 49ca2a37.
    
    To ensure that the performance issues pre-#7436 do not resurface,
    I have taken some precautionary measures:
    
    * I have added a special case to `$fmap` for situations where the
      last type variable in an application of some type occurs directly.
      If this special case fires, we avoid creating a lambda expression.
      This ensures that we generate
      `fmap f (Cons x xs) = Cons (f x) (fmap f xs)` in the derived
      `Functor List` instance above. For more details, see
      `Note [Avoid unnecessary eta expansion in derived fmap implementations]`
      in `TcGenFunctor`.
    * I have added a `T7436b` test case to ensure that the performance
      of this derived `Functor List`-style code does not regress.
    
    When implementing this, I discovered that `$replace`, the algorithm
    which generates implementations of `(<$)`, has a special case that is
    very similar to the `$fmap` special case described above. `$replace`
    marked this special case with a custom `Replacer` data type, which
    was a bit overkill. In order to use the same machinery for both
    `Functor` methods, I ripped out `Replacer` and instead implemented
    a simple way to detect the special case. See the updated commentary
    in `Note [Deriving <$]` for more details.
Loading