Better eta reduction gains asymptotic performance

Here is a remarkable program, from #7436 (closed)

data Nat = Z | S Nat

mkNat :: Int -> Nat
mkNat 0 = Z
mkNat n = S (mkNat (n-1))

unNat :: Nat -> ()
unNat Z = ()
unNat (S n) = unNat n

fast :: (b -> b) -> b -> Nat -> b
fast s z Z = z
fast s z (S n) = s (fast s z n)

slow :: (b -> b) -> b -> Nat -> b
slow s z Z = z
slow s z (S n) = s (slow (\e -> s e) z n)

main :: IO ()
--main = print $ unNat . fast S Z . mkNat $ n
main = print $ unNat . slow S Z . mkNat $ n
  where n = 100000

Using slow it takes 8 seconds to run; using fast it is instantaneous. The only difference between slow and fast is the eta-expansion in slow; yet it causes an asymptotic slowdown! See this comment for how this very surprising behaviour occurs.

The story in #7436 (closed) (and #17880 (closed)) is to generate fast rather than slow. But the purpose of this ticket is to think about how to tranform slow into fast.

Why can't GHC eta-reduce the lambda in slow? Because

It does not know that s isn't bottom
It worries that slow (the function to which (\e -> s e) is passed) migth do a seq on the argument.

Indeed (1) might happen. But (2) does not! So eta-reduction would be valid.

Moreover, the demand analyser discovers this signature for slow:

   Str=<L,C(U)><L,1*U><S,1*U>,

The C(U) says that s is only called, not seq'd. If we change slow to

slow s z Z = s `seq` z

then we get Str=<S,U><L,1*U><S,1*U>.

So the opportunity is to somehow make eta-reduction sensitive to the enclosing demand. I had a quick look at the Simplifier, and while I think it's possible, I don't think it's especially easy. Something like:

Put the demand in the Stop constructor of SimplCont
Pass the continuation to SimplUtils.mkLam in the call in Simplify.simplLam.

That is quite a lot of fiddling for a relatively narrow case. But still, it hurts to lose asymptotic perf from something so trivial.

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information