Better eta expansion and reduction
This patch set implements a bunch of changes relate to eta reduction and expansion.
I have been working on them for a ridiculously long time, playing whack-a-mole. It has been a painful journey, but finally the moles are whacked. CI numbers look good.
This MR subsumes !6197 (closed).
commit de30194d8c9c29beef80772d0c8a75fd135a1b96
Author: Simon Peyton Jones <simonpj@microsoft.com>
Date: Fri Jul 23 23:57:01 2021 +0100
A bunch of changes related to eta reduction
This is a large collection of changes all relating to eta
reduction, originally triggered by #18993, but there followed
a long saga.
Specifics:
* Move state-hack stuff from GHC.Types.Id (where it never belonged)
to GHC.Core.Opt.Arity (which seems much more appropriate).
* Add a crucial mkCast in the Cast case of
GHC.Core.Opt.Arity.eta_expand; helps with T18223
* Add clarifying notes about eta-reducing to PAPs.
See Note [Do not eta reduce PAPs]
* I moved tryEtaReduce from GHC.Core.Utils to GHC.Core.Opt.Arity,
where it properly belongs. See Note [Eta reduce PAPs]
* In GHC.Core.Opt.Simplify.Utils.tryEtaExpandRhs, pull out the code for
when eta-expansion is wanted, to make wantEtaExpansion, and all that
same function in GHC.Core.Opt.Simplify.simplStableUnfolding. It was
previously inconsistent, but it's doing the same thing.
* I did a substantial refactor of ArityType; see Note [ArityType].
This allowed me to do away with the somewhat mysterious takeOneShots;
more generally it allows arityType to describe the function, leaving
its clients to decide how to use that information.
I made ArityType abstract, so that clients have to use functions
to access it.
* Make GHC.Core.Opt.Simplify.Utils.rebuildLam (was stupidly called
mkLam before) aware of the floats that the simplifier builds up, so
that it can still do eta-reduction even if there are some floats.
(Previously that would not happen.) That means passing the floats
to rebuildLam, and an extra check when eta-reducting (etaFloatOk).
* In GHC.Core.Opt.Simplify.Utils.tryEtaExpandRhs, make use of call-info
in the idDemandInfo of the binder, as well as the CallArity info. The
occurrence analyser did this but we were failing to take advantage here.
In the end I moved the heavy lifting to GHC.Core.Opt.Arity.findRhsArity;
see Note [Combining arityType with demand info], and functions
idDemandOneShots and combineWithDemandOneShots.
(These changes partly drove my refactoring of ArityType.)
* In GHC.Core.Opt.Arity.findRhsArity
* I'm now taking account of the demand on the binder to give
extra one-shot info. E.g. if the fn is always called with two
args, we can give better one-shot info on the binders
than if we just look at the RHS.
* Don't do any fixpointing in the non-recursive
case -- simple short cut.
* Trim arity inside the loop. See Note [Trim arity inside the loop]
* Make SimpleOpt respect the eta-reduction flag
(Some associated refactoring here.)
* I made the CallCtxt which the Simplifier uses distinguish between
recursive and non-recursive right-hand sides.
data CallCtxt = ... | RhsCtxt RecFlag | ...
It affects only one thing:
- We call an RHS context interesting only if it is non-recursive
see Note [RHS of lets] in GHC.Core.Unfold
* Remove eta-reduction in GHC.CoreToStg.Prep, a welcome simplification.
See Note [No eta reduction needed in rhsToBody] in GHC.CoreToStg.Prep.
Other incidental changes
* Fix a fairly long-standing outright bug in the ApplyToVal case of
GHC.Core.Opt.Simplify.mkDupableContWithDmds. I was failing to take the
tail of 'dmds' in the recursive call, which meant the demands were All
Wrong. I have no idea why this has not caused problems before now.
* Delete dead function GHC.Core.Opt.Simplify.Utils.contIsRhsOrArg
Metrics: compile_time/bytes allocated
Test Metric Baseline New value Change
---------------------------------------------------------------------------------------
MultiLayerModulesTH_OneShot(normal) ghc/alloc 2,743,297,692 2,619,762,992 -4.5% GOOD
T18223(normal) ghc/alloc 1,103,161,360 972,415,992 -11.9% GOOD
T3064(normal) ghc/alloc 201,222,500 184,085,360 -8.5% GOOD
T8095(normal) ghc/alloc 3,216,292,528 3,254,416,960 +1.2%
T9630(normal) ghc/alloc 1,514,131,032 1,557,719,312 +2.9% BAD
parsing001(normal) ghc/alloc 530,409,812 525,077,696 -1.0%
geo. mean -0.1%
Nofib:
Program Size Allocs Runtime Elapsed TotalMem
--------------------------------------------------------------------------------
banner +0.0% +0.4% -8.9% -8.7% 0.0%
exact-reals +0.0% -7.4% -36.3% -37.4% 0.0%
fannkuch-redux +0.0% -0.1% -1.0% -1.0% 0.0%
fft2 -0.1% -0.2% -17.8% -19.2% 0.0%
fluid +0.0% -1.3% -2.1% -2.1% 0.0%
gg -0.0% +2.2% -0.2% -0.1% 0.0%
spectral-norm +0.1% -0.2% 0.0% 0.0% 0.0%
tak +0.0% -0.3% -9.8% -9.8% 0.0%
x2n1 +0.0% -0.2% -3.2% -3.2% 0.0%
--------------------------------------------------------------------------------
Min -3.5% -7.4% -58.7% -59.9% 0.0%
Max +0.1% +2.2% +32.9% +32.9% 0.0%
Geometric Mean -0.0% -0.1% -14.2% -14.8% -0.0%
Metric Decrease:
MultiLayerModulesTH_OneShot
T18223
T3064
Metric Increase:
T9630
commit eb82a8663e5417e8dc669f51c4a6ab1c49be1837
Author: Simon Peyton Jones <simonpj@microsoft.com>
Date: Fri Jul 23 23:35:21 2021 +0100
Make PrimOpId record levity
This patch concerns #20155, part (1)
The general idea is that since primops have curried bindings
(currently in PrimOpWrappers.hs) we don't need to eta-expand
them. But we /do/ need to eta-expand the levity-polymorphic ones,
because they /don't/ have bindings.
This patch makes a start in that direction, by identifying the
levity-polymophic primops in the PrimOpId IdDetails constructor.
For the moment, I'm still eta-expanding all primops (by saying
that hasNoBinding returns True for all primops), because of the
bug reported in #20155. But I hope that before long we can
tidy that up too, and remove the TEMPORARILY stuff in hasNoBinding.
commit 3f8a2eb8a2198104652a995465233ee0ecb11ec7
Author: Simon Peyton Jones <simonpj@microsoft.com>
Date: Fri Jul 23 23:33:10 2021 +0100
Comments and white space
commit dc0d425d3db67bfcc19e3201af96429c91365026
Author: Simon Peyton Jones <simonpj@microsoft.com>
Date: Wed Jun 23 20:53:27 2021 +0100
Make findRhsArity take RecFlag
This avoids a fixpoint iteration for the common case of
non-recursive bindings.
commit dec704fbd391c52b46b52c1c003b2a2772fc415e
Author: Simon Peyton Jones <simonpj@microsoft.com>
Date: Tue Jun 15 22:49:40 2021 +0100
Do arity trimming at bindings, rather than in exprArity
Sometimes there are very large casts, and coercionRKind
can be slow.
commit 3adf5b9f0633f0add92b7e31b836f6bf0c70f7a3
Author: Simon Peyton Jones <simonpj@microsoft.com>
Date: Sat May 29 22:49:34 2021 +0100
Eta reduction with casted function
We want to be able to eta-reduce
\x y. ((f x) |> co) y
by pushing 'co' inwards. A very small change accommodates this
See Note [Eta reduction with casted function]