I know we don't worry too much about performance when optimizations are disabled, but last [1..n] using O(n) space when the naive definition would run in O(1) space is a bit too much I think.
Can we split out the new definition to a separate function and use a rule to rewrite last to that function when optimizations are enabled? I'm not expert enough on the use of rules and inlining phases etc. to write the patch myself, but I think it should be simple.
What is the "new definition"? I'm pretty hazy about what exactly you are proposing here. Could you write it out explicitly and then the Core Libraries Committee can decide.
@rwbarton, just to make sure I don’t get you wrong: Are you talking about compiling a "normal" program using last, using a normally build compiler, or are you talking about compiling the base libraries without optimization, e.g. during GHC development?
I had a closer look, and the problem is the following:
When we added the implementation
last = foldl (\_ x -> x) (error "..."))
to GHC.List in #9339 (closed), we assumed (and I thought we checked, but maybe that not true) that GHC would optimize the core will look something like this:
This way, when using last in code compiled without -O, the efficient variant would be called, while with -O the unfolding would be used and optimized on the spot, possibly fusing, and producing good code if Call Arity kicks in.
@rwbarton, just to make sure I don’t get you wrong: Are you talking about compiling a "normal" program using last, using a normally build compiler, or are you talking about compiling the base libraries without optimization, e.g. during GHC development?
I mean using a compiler built with "perf", e.g. the 7.10.1 release, and compiling the user program without optimization.
Re ticket:10260#comment:98687 I too was surprised that last was not eta-expanded. But see this note in SimplUtils:
Note [Do not eta-expand PAPs]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~We used to have old_arity = manifestArity rhs, which meant that wewould eta-expand even PAPs. But this gives no particular advantage,and can lead to a massive blow-up in code size, exhibited by Trac #9020.Suppose we have a PAP foo :: IO () foo = returnIO ()Then we can eta-expand do foo = (\eta. (returnIO () |> sym g) eta) |> gwhere g :: IO () ~ State# RealWorld -> (# State# RealWorld, () #)But there is really no point in doing this, and it generates masses ofcoercions and whatnot that eventually disappear again. For T9020, GHCallocated 6.6G beore, and 0.8G afterwards; and residency dropped from1.8G to 45M.But note that this won't eta-expand, say f = \g -> map gDoes it matter not eta-expanding such functions? I'm not sure. Perhapsstrictness analysis will have less to bite on?
The worse code for the un-eta-expanded code for last is perhaps an example of the last paragraph!
And yet the comment is reasonably convincing. I'm not sure what to do here.
Joachim, one thing that might be worth trying is to eta-expand as far as poss without bumping into a coercion. That would expand last but not foo. Want to try that? We may fix last but there are sure to be other cases, and the more robust the optimisations the better.
And yet the comment is reasonably convincing. I'm not sure what to do here.
I think we are doing the right thing here. As far as I know, we also take the number of arguments given by the user as an indication when things should inline or not; if we eta-expand PAPs, we take this possibility of control away from the user.
In this particular case: What if I had written last this way intentional, so that the rule for foldl would not fire here, but only after last was inlined at a position where it is passed an argument?
(Or am I talking non-sense if we are talking about PAPs, in contrast to functions that might be eta-expanded more.)
Well all eta-expansion risks what you say. If the user wants hat kind of control s/he can use INLINE/INLINEABLE. The eta expansion doesn't happen for the template that is inlined; only in the RHS that is compiled. Which is good. So I do think we should eta-expand here. The only reason not to is described in the comment. I think.
Ah, right, the template is stored before this happens; good.
What is our motivation to eta-expand PAPs? In this case, it is to allow for RULES to fire. Is it not likely that in order for RULES to fire, we might have to eta-expand beyond a coercion? So it does not gain a lot in robustness.
I’m inclined to wait for more than just this example to give this a shot, after all, implementing the heuristics (“if it is a PAP, and there is a chance to eta-expand, but only up to the next coercion...”) does not really make the code simpler...