Commit 5b986a4d authored by Simon Peyton Jones's avatar Simon Peyton Jones Committed by Joachim Breitner

CSE code cleanup and improvement

Triggered by an observation by Joachim, Simon felt the urge to clean up
the CSE code a bit. This is the result.

(Code by Simon, commit message and other leg-work by Joachim)

Differential Revision: https://phabricator.haskell.org/D2074
parent 2265c849
......@@ -12,11 +12,12 @@ module CSE (cseProgram) where
import CoreSubst
import Var ( Var )
import Id ( Id, idType, idInlineActivation, zapIdOccInfo, zapIdUsageInfo )
import Id ( Id, idType, idUnfolding, idInlineActivation
, zapIdOccInfo, zapIdUsageInfo )
import CoreUtils ( mkAltExpr
, exprIsTrivial
, stripTicksE, stripTicksT, stripTicksTopE, mkTick, mkTicks )
import Type ( tyConAppArgs )
, exprIsTrivial, exprOkForSpeculation
, stripTicksE, stripTicksT, mkTicks )
import Type ( Type, tyConAppArgs, isUnliftedType )
import CoreSyn
import Outputable
import BasicTypes ( isAlwaysActive )
......@@ -59,34 +60,78 @@ Here we must *not* do CSE on the inner x+x! The simplifier used to guarantee no
shadowing, but it doesn't any more (it proved too hard), so we clone as we go.
We can simply add clones to the substitution already described.
Note [Case binders 1]
~~~~~~~~~~~~~~~~~~~~~~
Consider
f = \x -> case x of wild {
(a:as) -> case a of wild1 {
(p,q) -> ...(wild1:as)...
Note [CSE for bindings]
~~~~~~~~~~~~~~~~~~~~~~~
Let-bindings have two cases, implemnted by cseRhs.
* Trivial RHS:
let x = y in ...(h x)....
Here we want to extend the /substitution/ with x -> y, so that the
(h x) in the body might CSE with an enclosing (let v = h y in ...).
NB: the substitution maps InIds, so we extend the substitution with
a biding for the original InId 'x'
How can we have a trivial RHS? Doens't the simplifier inline them?
- First, the original RHS might have been (g z) which has CSE'd
with an enclosing (let y = g z in ...). This is super-important.
See Trac #5996:
x1 = C a b
x2 = C x1 b
y1 = C a b
y2 = C y1 b
Here we CSE y1's rhs to 'x1', and then we must add (y1->x1) to
the substitution so that we can CSE the binding for y2.
- Second, we use cseRHS for case expression scrutinees too;
see Note [CSE for case expressions]
Here, (wild1:as) is morally the same as (a:as) and hence equal to wild.
But that's not quite obvious. In general we want to keep it as (wild1:as),
but for CSE purpose that's a bad idea.
* Non-trivial RHS
let x = h y in ...(h y)...
So we add the binding (wild1 -> a) to the extra var->var mapping.
Notice this is exactly backwards to what the simplifier does, which is
to try to replaces uses of 'a' with uses of 'wild1'
Here we want to extend the /reverse mapping (cs_map)/ so that
we CSE the (h y) call to x.
Note [Case binders 2]
~~~~~~~~~~~~~~~~~~~~~~
Notice that
- the trivial-RHS situation extends the substitution (cs_subst)
- the non-trivial-RHS situation extends the reverse mapping (cs_map)
Note [CSE for case expressions]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Consider
case (h x) of y -> ...(h x)...
case scrut_expr of x { ...alts... }
This is very like a strict let-binding
let !x = scrut_expr in ...
So we use (cseRhs x scrut_expr) to process scrut_expr and x, and as a
result all the stuff under Note [CSE for bindings] applies directly.
For example:
* Trivial scrutinee
f = \x -> case x of wild {
(a:as) -> case a of wild1 {
(p,q) -> ...(wild1:as)...
We'd like to replace (h x) in the alternative, by y. But because of
the preceding [Note: case binders 1], we only want to add the mapping
scrutinee -> case binder
to the reverse CSE mapping if the scrutinee is a non-trivial expression.
(If the scrutinee is a simple variable we want to add the mapping
case binder -> scrutinee
to the substitution
Here, (wild1:as) is morally the same as (a:as) and hence equal to
wild. But that's not quite obvious. In the rest of the compiler we
want to keep it as (wild1:as), but for CSE purpose that's a bad
idea.
By using cseRhs we add the binding (wild1 -> a) to the substitution,
which does exactly the right thing.
(Notice this is exactly backwards to what the simplifier does, which
is to try to replaces uses of 'a' with uses of 'wild1'.)
This is the main reason that cseRHs is called with a trivial rhs.
* Non-trivial scrutinee
case (f x) of y { pat -> ...let y = f x in ... }
By using cseRhs we'll add (f x :-> y) to the cs_map, and
thereby CSE the inner (f x) to y.
Note [CSE for INLINE and NOINLINE]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......@@ -138,13 +183,50 @@ an Id, even if is a 'stable' unfolding. That means that when an
unfolding happens, it is always faithful to what the stable unfolding
originally was.
Note [CSE for case expressions]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Note [CSE for stable unfoldings]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Consider
case f x of y { pat -> ...let y = f x in ... }
Then we can CSE the inner (f x) to y. In fact 'case' is like a strict
let-binding, and we can use cseRhs for dealing with the scrutinee.
{-# Unf = Stable (\pq. build blah) #-}
foo = x
Here 'foo' has a stable unfolding, but its (optimised) RHS is trivial.
(Turns out that this actually happens for the enumFromTo method of
the Integer instance of Enum in GHC.Enum.) Then we obviously do NOT
want to extend the substitution with (foo->x)! See similar
SimplUtils Note [Stable unfoldings and postInlineUnconditionally].
Nor do we want to change the reverse mapping. Suppose we have
{-# Unf = Stable (\pq. build blah) #-}
foo = <expr>
bar = <expr>
There could conceivably be merit in rewriting the RHS of bar:
bar = foo
but now bar's inlining behaviour will change, and importing
modules might see that. So it seems dodgy and we don't do it.
Note [Corner case for case expressions]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Consdider
case x |> co of (y::Array# Int) { ... }
Is it ok to extend the substutition with (y -> x |> co)?
Because y is of unlifted type, this is only OK if (x |> co) is
ok-for-speculation, else we'll destroy the let/app invariant.
But surely it is ok-for-speculation, becasue it's a trivial
expression, and x's type is also unlifted, presumably.
Well, maybe not if you are using unsafe casts. I actually found
a case where we had
(x :: HValue) |> (UnsafeCo :: HValue ~ Array# Int)
This is a vanishingly strange corner case, but we still have
to check.
We do the check in cseRhs, but it can't fire when cseRhs is called
from a let-binding, becuase they are always ok-for-speculation. Never
mind!
************************************************************************
* *
......@@ -161,67 +243,62 @@ cseBind env (NonRec b e)
= (env2, NonRec b'' e')
where
(env1, b') = addBinder env b
(env2, (b'', e')) = cseRhs env1 (b',e)
(env2, (b'', e')) = cseRhs env1 b b' e
cseBind env (Rec pairs)
= (env2, Rec pairs')
where
(bs,es) = unzip pairs
(env1, bs') = addRecBinders env bs
(env2, pairs') = mapAccumL cseRhs env1 (bs' `zip` es)
cseRhs :: CSEnv -> (OutBndr, InExpr) -> (CSEnv, (OutBndr, OutExpr))
cseRhs env (id',rhs)
= case lookupCSEnv env rhs'' of
Nothing
| always_active -> (extendCSEnv env rhs' id', (zapped_id, rhs'))
| otherwise -> (env, (id', rhs'))
Just id
| always_active -> (extendCSSubst env id' id_expr, (id', mkTicks ticks id_expr))
| otherwise -> (env, (id', mkTicks ticks id_expr))
where
id_expr = varToCoreExpr id -- Could be a CoVar
-- In the Just case, we have
-- x = rhs
-- ...
-- x' = rhs
-- We are replacing the second binding with x'=x
-- and so must record that in the substitution so
-- that subsequent uses of x' are replaced with x,
-- See Trac #5996
(env1, bs') = addRecBinders env (map fst pairs)
(env2, pairs') = mapAccumL cse_rhs env1 (bs' `zip` pairs)
cse_rhs env (b', (b,e)) = cseRhs env b b' e
cseRhs :: CSEnv -> InId -> OutId -> InExpr -> (CSEnv, (OutId, OutExpr))
cseRhs env in_id out_id rhs
| no_cse = (env, (out_id, rhs'))
| ok_to_subst = (extendCSSubst env in_id rhs', (out_id, rhs'))
| otherwise = (extendCSEnv env rhs' id_expr', (zapped_id, rhs'))
where
zapped_id = zapIdUsageInfo id'
-- Putting the Id into the environment makes it possible that
id_expr' = varToCoreExpr out_id
rhs' = tryForCSE env rhs
zapped_id = zapIdUsageInfo out_id
-- Putting the Id into the cs_map makes it possible that
-- it'll become shared more than it is now, which would
-- invalidate (the usage part of) its demand info. This caused
-- Trac #100218.
-- invalidate (the usage part of) its demand info.
-- This caused Trac #100218.
-- Easiest thing is to zap the usage info; subsequently
-- performing late demand-analysis will restore it. Don't zap
-- the strictness info; it's not necessary to do so, and losing
-- it is bad for performance if you don't do late demand
-- analysis
rhs' = cseExpr env rhs
ticks = stripTicksT tickishFloatable rhs'
rhs'' = stripTicksE tickishFloatable rhs'
-- We don't want to lose the source notes when a common sub
-- expression gets eliminated. Hence we push all (!) of them on
-- top of the replaced sub-expression. This is probably not too
-- useful in practice, but upholds our semantics.
no_cse = not (isAlwaysActive (idInlineActivation out_id))
-- See Note [CSE for INLINE and NOINLINE]
|| isStableUnfolding (idUnfolding out_id)
-- See Note [CSE for stable unfoldings]
always_active = isAlwaysActive (idInlineActivation id')
-- See Note [CSE for INLINE and NOINLINE]
-- See Note [CSE for bindings]
ok_to_subst = exprIsTrivial rhs'
&& (not (isUnliftedType (idType out_id))
|| exprOkForSpeculation rhs')
-- See Note [Corner case for case expressions]
tryForCSE :: CSEnv -> InExpr -> OutExpr
tryForCSE env expr
| exprIsTrivial expr' = expr' -- No point
| Just smaller <- lookupCSEnv env expr'' = foldr mkTick (Var smaller) ticks
| otherwise = expr'
| exprIsTrivial expr' = expr' -- No point
| Just e <- lookupCSEnv env expr'' = mkTicks ticks e
| otherwise = expr'
-- The varToCoreExpr is needed if we have
-- case e of xco { ...case e of yco { ... } ... }
-- Then CSE will substitute yco -> xco;
-- but these are /coercion/ variables
where
expr' = cseExpr env expr
expr' = cseExpr env expr
expr'' = stripTicksE tickishFloatable expr'
ticks = stripTicksT tickishFloatable expr'
ticks = stripTicksT tickishFloatable expr'
-- We don't want to lose the source notes when a common sub
-- expression gets eliminated. Hence we push all (!) of them on
-- top of the replaced sub-expression. This is probably not too
-- useful in practice, but upholds our semantics.
cseExpr :: CSEnv -> InExpr -> OutExpr
cseExpr env (Type t) = Type (substTy (csEnvSubst env) t)
......@@ -235,32 +312,25 @@ cseExpr env (Lam b e) = let (env', b') = addBinder env b
in Lam b' (cseExpr env' e)
cseExpr env (Let bind e) = let (env', bind') = cseBind env bind
in Let bind' (cseExpr env' e)
cseExpr env (Case scrut bndr ty alts) = Case scrut' bndr''' ty alts'
where
alts' = cseAlts env2 scrut' bndr bndr'' alts
(env1, bndr') = addBinder env bndr
bndr'' = zapIdOccInfo bndr'
-- The swizzling from Note [Case binders 2] may
-- cause a dead case binder to be alive, so we
-- play safe here and bring them all to life
(env2, (bndr''', scrut')) = cseRhs env1 (bndr'', scrut)
-- Note [CSE for case expressions]
cseAlts :: CSEnv -> OutExpr -> InBndr -> InBndr -> [InAlt] -> [OutAlt]
cseAlts env scrut' bndr bndr' alts
= map cse_alt alts
cseExpr env (Case e bndr ty alts) = cseCase env e bndr ty alts
cseCase :: CSEnv -> InExpr -> InId -> InType -> [InAlt] -> OutExpr
cseCase env scrut bndr ty alts
= Case scrut' bndr3 ty (map cse_alt alts)
where
scrut'' = stripTicksTopE tickishFloatable scrut'
(con_target, alt_env)
= case scrut'' of
Var v' -> (v', extendCSSubst env bndr scrut'') -- See Note [Case binders 1]
-- map: bndr -> v'
bndr1 = zapIdOccInfo bndr
-- Zapping the OccInfo is needed because the extendCSEnv
-- in cse_alt may mean that a dead case binder
-- becomes alive, and Lint rejects that
(env1, bndr2) = addBinder env bndr1
(alt_env, (bndr3, scrut')) = cseRhs env1 bndr bndr2 scrut
-- cseRhs: see Note [CSE for case expressions]
_ -> (bndr', extendCSEnv env scrut' bndr') -- See Note [Case binders 2]
-- map: scrut' -> bndr'
con_target :: OutExpr
con_target = lookupSubst alt_env bndr
arg_tys = tyConAppArgs (idType bndr)
arg_tys :: [OutType]
arg_tys = tyConAppArgs (idType bndr3)
cse_alt (DataAlt con, args, rhs)
| not (null args)
......@@ -289,29 +359,36 @@ cseAlts env scrut' bndr bndr' alts
-}
type InExpr = CoreExpr -- Pre-cloning
type InBndr = CoreBndr
type InId = Id
type InAlt = CoreAlt
type InType = Type
type OutExpr = CoreExpr -- Post-cloning
type OutBndr = CoreBndr
type OutAlt = CoreAlt
type OutId = Id
type OutType = Type
data CSEnv
= CS { cs_subst :: Subst -- Maps InBndrs to OutExprs
-- The substitution variables to
-- /trivial/ OutExprs, not arbitrary expressions
data CSEnv = CS { cs_map :: CoreMap (OutExpr, Id) -- Key, value
, cs_subst :: Subst }
, cs_map :: CoreMap OutExpr -- The reverse mapping
-- Maps a OutExpr to a /trivial/ OutExpr
-- The key of cs_map is stripped of all Ticks
}
emptyCSEnv :: CSEnv
emptyCSEnv = CS { cs_map = emptyCoreMap, cs_subst = emptySubst }
lookupCSEnv :: CSEnv -> OutExpr -> Maybe Id
lookupCSEnv :: CSEnv -> OutExpr -> Maybe OutExpr
lookupCSEnv (CS { cs_map = csmap }) expr
= case lookupCoreMap csmap expr of
Just (_,e) -> Just e
Nothing -> Nothing
extendCSEnv :: CSEnv -> OutExpr -> Id -> CSEnv
extendCSEnv cse expr id
= cse { cs_map = extendCoreMap (cs_map cse) sexpr (sexpr,id) }
where sexpr = stripTicksE tickishFloatable expr
= lookupCoreMap csmap expr
extendCSEnv :: CSEnv -> OutExpr -> OutExpr -> CSEnv
extendCSEnv cse expr triv_expr
= cse { cs_map = extendCoreMap (cs_map cse) sexpr triv_expr }
where
sexpr = stripTicksE tickishFloatable expr
csEnvSubst :: CSEnv -> Subst
csEnvSubst = cs_subst
......
==================== Tidy Core ====================
Result size of Tidy Core = {terms: 32, types: 17, coercions: 0}
Result size of Tidy Core = {terms: 46, types: 23, coercions: 0}
-- RHS size: {terms: 2, types: 0, coercions: 0}
T7116.$trModule2 :: GHC.Types.TrName
......@@ -47,7 +47,7 @@ dr =
\ (x :: Double) ->
case x of { GHC.Types.D# x1 -> GHC.Types.D# (GHC.Prim.+## x1 x1) }
-- RHS size: {terms: 1, types: 0, coercions: 0}
-- RHS size: {terms: 8, types: 3, coercions: 0}
dl :: Double -> Double
[GblId,
Arity=1,
......@@ -58,7 +58,9 @@ dl :: Double -> Double
Guidance=ALWAYS_IF(arity=1,unsat_ok=True,boring_ok=False)
Tmpl= \ (x [Occ=Once!] :: Double) ->
case x of { GHC.Types.D# y -> GHC.Types.D# (GHC.Prim.+## y y) }}]
dl = dr
dl =
\ (x :: Double) ->
case x of { GHC.Types.D# y -> GHC.Types.D# (GHC.Prim.+## y y) }
-- RHS size: {terms: 8, types: 3, coercions: 0}
fr :: Float -> Float
......@@ -79,7 +81,7 @@ fr =
GHC.Types.F# (GHC.Prim.plusFloat# x1 x1)
}
-- RHS size: {terms: 1, types: 0, coercions: 0}
-- RHS size: {terms: 8, types: 3, coercions: 0}
fl :: Float -> Float
[GblId,
Arity=1,
......@@ -92,7 +94,11 @@ fl :: Float -> Float
case x of { GHC.Types.F# y ->
GHC.Types.F# (GHC.Prim.plusFloat# y y)
}}]
fl = fr
fl =
\ (x :: Float) ->
case x of { GHC.Types.F# y ->
GHC.Types.F# (GHC.Prim.plusFloat# y y)
}
......@@ -624,13 +624,14 @@ test('T9020',
[(wordsize(32), 343005716, 10),
# Original: 381360728
# 2014-07-31: 343005716 (Windows) (general round of updates)
(wordsize(64), 698401736, 10)])
(wordsize(64), 852298336, 10)])
# prev: 795469104
# 2014-07-17: 728263536 (general round of updates)
# 2014-09-10: 785871680 post-AMP-cleanup
# 2014-11-03: 680162056 Further Applicative and Monad adjustments
# 2015-10-21: 786189008 Make stronglyConnCompFromEdgedVertices deterministic
# 2016-01-26: 698401736 improvement from using ExpTypes instead of ReturnTvs
# 2016-04-06: 852298336 Refactoring of CSE #11781
],
compile,[''])
......@@ -695,12 +696,13 @@ test('T9872a',
test('T9872b',
[ only_ways(['normal']),
compiler_stats_num_field('bytes allocated',
[(wordsize(64), 4918990352, 5),
[(wordsize(64), 4600233488, 5),
# 2014-12-10 6483306280 Initally created
# 2014-12-16 6892251912 Flattener parameterized over roles
# 2014-12-18 3480212048 Reduce type families even more eagerly
# 2015-12-11 5199926080 TypeInType (see #11196)
# 2016-02-08 4918990352 Improved a bit by tyConRolesRepresentational
# 2016-04-06: 4600233488 Refactoring of CSE #11781
(wordsize(32), 2422750696, 5)
# was 1700000000
# 2016-04-06 2422750696 x86/Linux
......@@ -711,12 +713,13 @@ test('T9872b',
test('T9872c',
[ only_ways(['normal']),
compiler_stats_num_field('bytes allocated',
[(wordsize(64), 4454071184, 5),
[(wordsize(64), 4306667256, 5),
# 2014-12-10 5495850096 Initally created
# 2014-12-16 5842024784 Flattener parameterized over roles
# 2014-12-18 2963554096 Reduce type families even more eagerly
# 2015-12-11 4723613784 TypeInType (see #11196)
# 2016-02-08 4454071184 Improved a bit by tyConRolesRepresentational
# 2016-04-06: 4306667256 Refactoring of CSE #11781
(wordsize(32), 2257242896, 5)
# was 1500000000
# 2016-04-06 2257242896
......
......@@ -51,7 +51,7 @@ test('haddock.base',
test('haddock.Cabal',
[unless(in_tree_compiler(), skip), req_haddock
,stats_num_field('bytes allocated',
[(wordsize(64), 10941742184, 5)
[(wordsize(64), 11542374816, 5)
# 2012-08-14: 3255435248 (amd64/Linux)
# 2012-08-29: 3324606664 (amd64/Linux, new codegen)
# 2012-10-08: 3373401360 (amd64/Linux)
......@@ -77,6 +77,7 @@ test('haddock.Cabal',
# 2015-12-22: 10519532424 (amd64/Linux) - Lots of new Semigroup instances in Cabal
# 2016-03-29: 11517963232 (amd64/Linux) - not yet investigated
# 2016-03-30: 10941742184 (amd64/Linux) - defer inlining of Int* Ord methods
# 2016-04-06: 11542374816 (amd64/Linux) - CSE improvements and others
,(platform('i386-unknown-mingw32'), 3293415576, 5)
# 2012-10-30: 1733638168 (x86/Windows)
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment