Foldl' doesn't inline when partially applied
Summary
It is common to write functions that iterate over a structure using foldl' in an eta-reduced way, e.g. from a recent reddit thread:
-- SumLen.hs:
{-# LANGUAGE BangPatterns #-}
module SumLen where
import Data.List
sumlen :: [Float] -> (Float, Int)
sumlen = foldl' f (0, 0)
where
f (!s, !n) !x = (s + x, n + 1)
This, however, does not produce performant Core, the cleaned up Core looks like this:
sumlen4
= \ ds x ->
case ds of { (s, n) ->
case s of { F# ipv ->
case n of { I# ipv1 ->
case x of { F# ipv2 ->
(F# (plusFloat# ipv ipv2), I# (+# ipv1 1#))
}
}
}
}
sumlen = foldl' sumlen4 (F# 0.0#, I# 0#)
If instead we eta-expand the sumlen function:
sumlen :: [Float] -> (Float, Int)
sumlen xs = foldl' f (0, 0) xs
where
f (!s, !n) !x = (s + x, n + 1)
We do get efficient Core that does not allocate during the loop:
sumlen_$s$wgo
= \ sc sc1 sc2 ->
case sc2 of {
[] -> (# F# sc1, I# sc #);
: y ys ->
case y of { F# ipv ->
sumlen_$s$wgo (+# sc 1#) (plusFloat# sc1 ipv) ys
}
}
sumlen
= \ w ->
case sumlen_$s$wgo 0# 0.0# w of { (# ww1, ww2 #) -> (ww1, ww2) }
Steps to reproduce
Put the SumLen
example in a file SumLen.hs
and compile it with optimizations and simplifier dumping:
$ ghc -O2 -ddump-simpl -dsuppress-all -dsuppress-uniques SumLen.hs
This produces Core like the first example shown above.
Expected behavior
I expected it to produce more efficient Core like the second example shown above.
Environment
- GHC version used: 8.10.4 and 9.0.1
Optional:
- Operating System: Debian buster
- System Architecture: x86_64