Commit a1b753e8 authored by Simon Peyton Jones's avatar Simon Peyton Jones

Cure exponential behaviour in the simplifier

This patch nails a Bad Bug exposed in Trac #13379. Roughly,
a deeply-nested application like
   f (f (f ....) ) )
could make the simplifier go exponential -- without producing
an exponential-sized result!

The reason was that we
  - simplified a (big) function argument
  - then decided to inline the function
  - then preInilneUnconditionally the argument
  - and then re-simplified the big argument

And if the "big argument" itself had a similar structure
things could get very bad.

Once I'd understood, it was easy to fix:

* See Note Note [Avoiding exponential behaviour] for an overview

* The key change is that Simplify.simplLam now as a case for
  (isSimplified dup). This is what removes the perf bug.

* But I also made simplCast more parsimonious about simplifying,
  avoiding doing so when the coercion is Refl

* And similarly I now try to avoid simplifying arguments
  where possible before applying rules.
  See Note [Trying rewrite rules]

The latter two points tackle common cases, and in those cases make the
simplifier take fewer iterations.
parent 25754c83
......@@ -19,7 +19,7 @@ module SimplUtils (
-- The continuation type
SimplCont(..), DupFlag(..),
isSimplified,
contIsDupable, contResultType, contHoleType, applyContToJoinType,
contIsDupable, contResultType, contHoleType,
contIsTrivial, contArgs,
countArgs,
mkBoringStop, mkRhsStop, mkLazyArgStop, contIsRhsOrArg,
......@@ -221,9 +221,10 @@ data ArgInfo
= ArgInfo {
ai_fun :: OutId, -- The function
ai_args :: [ArgSpec], -- ...applied to these args (which are in *reverse* order)
ai_type :: OutType, -- Type of (f a1 ... an)
ai_rules :: [CoreRule], -- Rules for this function
ai_rules :: FunRules, -- Rules for this function
ai_encl :: Bool, -- Flag saying whether this function
-- or an enclosing one has rules (recursively)
......@@ -250,11 +251,13 @@ instance Outputable ArgSpec where
addValArgTo :: ArgInfo -> OutExpr -> ArgInfo
addValArgTo ai arg = ai { ai_args = ValArg arg : ai_args ai
, ai_type = applyTypeToArg (ai_type ai) arg }
, ai_type = applyTypeToArg (ai_type ai) arg
, ai_rules = decRules (ai_rules ai) }
addTyArgTo :: ArgInfo -> OutType -> ArgInfo
addTyArgTo ai arg_ty = ai { ai_args = arg_spec : ai_args ai
, ai_type = piResultTy poly_fun_ty arg_ty }
, ai_type = piResultTy poly_fun_ty arg_ty
, ai_rules = decRules (ai_rules ai) }
where
poly_fun_ty = ai_type ai
arg_spec = TyArg { as_arg_ty = arg_ty, as_hole_ty = poly_fun_ty }
......@@ -293,6 +296,20 @@ argInfoExpr fun rev_args
go (CastBy co : as) = mkCast (go as) co
type FunRules = Maybe (Int, [CoreRule]) -- Remaining rules for this function
-- Nothing => No rules
-- Just (n, rules) => some rules, requiring at least n more type/value args
decRules :: FunRules -> FunRules
decRules (Just (n, rules)) = Just (n-1, rules)
decRules Nothing = Nothing
mkFunRules :: [CoreRule] -> FunRules
mkFunRules [] = Nothing
mkFunRules rs = Just (n_required, rs)
where
n_required = maximum (map ruleArity rs)
{-
************************************************************************
* *
......@@ -362,10 +379,6 @@ contHoleType (ApplyToVal { sc_arg = e, sc_env = se, sc_dup = dup, sc_cont = k })
contHoleType (Select { sc_dup = d, sc_bndr = b, sc_env = se })
= perhapsSubstTy d se (idType b)
applyContToJoinType :: JoinArity -> SimplCont -> OutType -> OutType
applyContToJoinType ar cont ty
= setJoinResTy ar (contResultType cont) ty
-------------------
countArgs :: SimplCont -> Int
-- Count all arguments, including types, coercions, and other values
......@@ -407,18 +420,20 @@ mkArgInfo :: Id
mkArgInfo fun rules n_val_args call_cont
| n_val_args < idArity fun -- Note [Unsaturated functions]
= ArgInfo { ai_fun = fun, ai_args = [], ai_type = fun_ty
, ai_rules = rules, ai_encl = False
, ai_rules = fun_rules, ai_encl = False
, ai_strs = vanilla_stricts
, ai_discs = vanilla_discounts }
| otherwise
= ArgInfo { ai_fun = fun, ai_args = [], ai_type = fun_ty
, ai_rules = rules
, ai_rules = fun_rules
, ai_encl = interestingArgContext rules call_cont
, ai_strs = add_type_str fun_ty arg_stricts
, ai_discs = arg_discounts }
where
fun_ty = idType fun
fun_rules = mkFunRules rules
vanilla_discounts, arg_discounts :: [Int]
vanilla_discounts = repeat 0
arg_discounts = case idUnfolding fun of
......
......@@ -487,7 +487,6 @@ completeNonRecX top_lvl env is_strict old_bndr new_bndr new_rhs
; completeBind env2 NotTopLevel NonRecursive Nothing
old_bndr new_bndr rhs2 }
{-
{- No, no, no! Do not try preInlineUnconditionally in completeNonRecX
Doing so risks exponential behaviour, because new_rhs has been simplified once already
In the cases described by the following comment, postInlineUnconditionally will
......@@ -514,6 +513,36 @@ completeNonRecX top_lvl env is_strict old_bndr new_bndr new_rhs
-}
----------------------------------
{- Note [Avoiding exponential behaviour]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
One way in which we can get exponential behaviour is if we simplify a
big expression, and the re-simplify it -- and then this happens in a
deeply-nested way. So we must be jolly careful about re-simplifying
an expression. That is why completeNonRecX does not try
preInlineUnconditionally.
Example:
f BIG, where f has a RULE
Then
* We simplify BIG before trying the rule; but the rule does not fire
* We inline f = \x. x True
* So if we did preInlineUnconditionally we'd re-simplify (BIG True)
However, if BIG has /not/ already been simplified, we'd /like/ to
simplify BIG True; maybe good things happen. That is why
* simplLam has
- a case for (isSimplified dup), which goes via simplNonRecX, and
- a case for the un-simplified case, which goes via simplNonRecE
* We go to some efforts to avoid unnecessarily simplifying ApplyToVal,
in at least two places
- In simplCast/addCoerce, where we check for isReflCo
- In rebuildCall we avoid simplifying arguments before we have to
(see Note [Trying rewrite rules])
Note [prepareRhs]
~~~~~~~~~~~~~~~~~~~~
prepareRhs takes a putative RHS, checks whether it's a PAP or
constructor application and, if so, converts it to ANF, so that the
resulting thing can be inlined more easily. Thus
......@@ -532,6 +561,7 @@ That's what the 'go' loop in prepareRhs does
prepareRhs :: TopLevelFlag -> SimplEnv -> OutId -> OutExpr -> SimplM (SimplEnv, OutExpr)
-- Adds new floats to the env iff that allows us to return a good RHS
-- See Note [prepareRhs]
prepareRhs top_lvl env id (Cast rhs co) -- Note [Float coercions]
| Pair ty1 _ty2 <- coercionKind co -- Do *not* do this if rhs has an unlifted type
, not (isUnliftedType ty1) -- see Note [Float coercions (unlifted)]
......@@ -1005,15 +1035,15 @@ simplExprF :: SimplEnv
-> SimplM (SimplEnv, OutExpr)
simplExprF env e cont
= {- pprTrace "simplExprF" (vcat
[ ppr e
, text "cont =" <+> ppr cont
, text "inscope =" <+> ppr (seInScope env)
, text "tvsubst =" <+> ppr (seTvSubst env)
, text "idsubst =" <+> ppr (seIdSubst env)
, text "cvsubst =" <+> ppr (seCvSubst env)
{- , ppr (seFloats env) -}
]) $ -}
= -- pprTrace "simplExprF" (vcat
-- [ ppr e
-- , text "cont =" <+> ppr cont
-- , text "inscope =" <+> ppr (seInScope env)
-- , text "tvsubst =" <+> ppr (seTvSubst env)
-- , text "idsubst =" <+> ppr (seIdSubst env)
-- , text "cvsubst =" <+> ppr (seCvSubst env)
-- {- , ppr (seFloats env) -}
-- ]) $
simplExprF1 env e cont
simplExprF1 :: SimplEnv -> InExpr -> SimplCont
......@@ -1387,24 +1417,29 @@ simplCast env body co0 cont0
= do { tail' <- addCoerce co' tail
; return (cont { sc_arg_ty = arg_ty', sc_cont = tail' }) }
addCoerce co (ApplyToVal { sc_arg = arg, sc_env = arg_se
addCoerce co cont@(ApplyToVal { sc_arg = arg, sc_env = arg_se
, sc_dup = dup, sc_cont = tail })
| Just (co1, co2) <- pushCoValArg co
, Pair _ new_ty <- coercionKind co1
, not (isTypeLevPoly new_ty) -- without this check, we get a lev-poly arg
-- See Note [Levity polymorphism invariants] in CoreSyn
-- test: typecheck/should_run/EtaExpandLevPoly
= do { (dup', arg_se', arg') <- simplArg env dup arg_se arg
= do { tail' <- addCoerce co2 tail
; if isReflCo co1
then return (cont { sc_cont = tail' })
-- Avoid simplifying if possible;
-- See Note [Avoiding exponential behaviour]
else do
{ (dup', arg_se', arg') <- simplArg env dup arg_se arg
-- When we build the ApplyTo we can't mix the OutCoercion
-- 'co' with the InExpr 'arg', so we simplify
-- to make it all consistent. It's a bit messy.
-- But it isn't a common case.
-- Example of use: Trac #995
; tail' <- addCoerce co2 tail
; return (ApplyToVal { sc_arg = mkCast arg' co1
, sc_env = arg_se'
, sc_dup = dup'
, sc_cont = tail' }) }
, sc_cont = tail' }) } }
addCoerce co cont
| isReflexiveCo co = return cont
......@@ -1457,13 +1492,20 @@ simplLam env (bndr:bndrs) body (ApplyToTy { sc_arg_ty = arg_ty, sc_cont = cont }
; simplLam (extendTvSubst env bndr arg_ty) bndrs body cont }
simplLam env (bndr:bndrs) body (ApplyToVal { sc_arg = arg, sc_env = arg_se
, sc_cont = cont })
, sc_cont = cont, sc_dup = dup })
| isSimplified dup -- Don't re-simplify if we've simplified it once
-- See Note [Avoiding exponential behaviour]
= do { tick (BetaReduction bndr)
; env' <- simplNonRecX env zapped_bndr arg
; simplLam env' bndrs body cont }
| otherwise
= do { tick (BetaReduction bndr)
; simplNonRecE env (zap_unfolding bndr) (arg, arg_se) (bndrs, body) cont }
; simplNonRecE env zapped_bndr (arg, arg_se) (bndrs, body) cont }
where
zap_unfolding bndr -- See Note [Zap unfolding when beta-reducing]
zapped_bndr -- See Note [Zap unfolding when beta-reducing]
| isId bndr, isStableUnfolding (realIdUnfolding bndr)
= setIdUnfolding bndr NoUnfolding
= setIdUnfolding bndr NoUnfolding
| otherwise = bndr
-- discard a non-counting tick on a lambda. This may change the
......@@ -1506,7 +1548,8 @@ simplLamBndr env bndr
------------------
simplNonRecE :: SimplEnv
-> InId -- The binder, always an Id for simplNonRecE
-> InId -- The binder, always an Id
-- Can be a join point
-> (InExpr, SimplEnv) -- Rhs of binding (or arg of lambda)
-> ([InBndr], InExpr) -- Body of the let/lambda
-- \xs.e
......@@ -1601,7 +1644,7 @@ simplRecE env pairs body cont
; env1 <- simplRecBndrs env bndrs
-- NB: bndrs' don't have unfoldings or rules
-- We add them as we go down
; env2 <- simplRecBind env1 NotTopLevel (Just cont) pairs
; env2 <- simplRecBind env1 NotTopLevel Nothing pairs
; simplExprF env2 body cont }
......@@ -1673,18 +1716,17 @@ completeCall env var cont
unfolding = activeUnfolding env var
maybe_inline = callSiteInline dflags var unfolding
lone_variable arg_infos interesting_cont
; case maybe_inline of {
; case maybe_inline of
Just expr -- There is an inlining!
-> do { checkedTick (UnfoldingDone var)
; dump_inline dflags expr cont
; simplExprF (zapSubstEnv env) expr cont }
; Nothing -> do -- No inlining!
{ rule_base <- getSimplRules
; let info = mkArgInfo var (getRules rule_base var) n_val_args call_cont
; rebuildCall env info cont
}}}
; Nothing -> do { rule_base <- getSimplRules
; let info = mkArgInfo var (getRules rule_base var)
n_val_args call_cont
; rebuildCall env info cont }
}
where
dump_inline dflags unfolding cont
| not (dopt Opt_D_dump_inlinings dflags) = return ()
......@@ -1702,6 +1744,12 @@ rebuildCall :: SimplEnv
-> ArgInfo
-> SimplCont
-> SimplM (SimplEnv, OutExpr)
-- We decided not to inline, so
-- - simplify the arguments
-- - try rewrite rules
-- - and rebuild
---------- Bottoming applications --------------
rebuildCall env (ArgInfo { ai_fun = fun, ai_args = rev_args, ai_strs = [] }) cont
-- When we run out of strictness args, it means
-- that the call is definitely bottom; see SimplUtils.mkArgInfo
......@@ -1722,11 +1770,32 @@ rebuildCall env (ArgInfo { ai_fun = fun, ai_args = rev_args, ai_strs = [] }) con
res = argInfoExpr fun rev_args
cont_ty = contResultType cont
---------- Try rewrite RULES --------------
-- See Note [Trying rewrite rules]
rebuildCall env info@(ArgInfo { ai_fun = fun, ai_args = rev_args
, ai_rules = Just (nr_wanted, rules) }) cont
| nr_wanted == 0 || no_more_args
, let info' = info { ai_rules = Nothing }
= -- We've accumulated a simplified call in <fun,rev_args>
-- so try rewrite rules; see Note [RULEs apply to simplified arguments]
-- See also Note [Rules for recursive functions]
do { mb_match <- tryRules env rules fun (reverse rev_args) cont
; case mb_match of
Just (env', rhs, cont') -> simplExprF env' rhs cont'
Nothing -> rebuildCall env info' cont }
where
no_more_args = case cont of
ApplyToTy {} -> False
ApplyToVal {} -> False
_ -> True
---------- Simplify applications and casts --------------
rebuildCall env info (CastIt co cont)
= rebuildCall env (addCastTo info co) cont
rebuildCall env info (ApplyToTy { sc_arg_ty = arg_ty, sc_cont = cont })
= rebuildCall env (info `addTyArgTo` arg_ty) cont
= rebuildCall env (addTyArgTo info arg_ty) cont
rebuildCall env info@(ArgInfo { ai_encl = encl_rules, ai_type = fun_ty
, ai_strs = str:strs, ai_discs = disc:discs })
......@@ -1755,23 +1824,32 @@ rebuildCall env info@(ArgInfo { ai_encl = encl_rules, ai_type = fun_ty
| disc > 0 = DiscArgCtxt -- Be keener here
| otherwise = BoringCtxt -- Nothing interesting
rebuildCall env (ArgInfo { ai_fun = fun, ai_args = rev_args, ai_rules = rules }) cont
| null rules
= rebuild env (argInfoExpr fun rev_args) cont -- No rules, common case
---------- No further useful info, revert to generic rebuild ------------
rebuildCall env (ArgInfo { ai_fun = fun, ai_args = rev_args }) cont
= rebuild env (argInfoExpr fun rev_args) cont
| otherwise
= do { -- We've accumulated a simplified call in <fun,rev_args>
-- so try rewrite rules; see Note [RULEs apply to simplified arguments]
-- See also Note [Rules for recursive functions]
mb_rule <- tryRules env rules fun (reverse rev_args) cont
; case mb_rule of {
Just (env', rule_rhs, cont') -> simplExprF env' rule_rhs cont'
{- Note [Trying rewrite rules]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Consider an application (f e1 e2 e3) where the e1,e2,e3 are not yet
simplified. We want to simplify enough arguments to allow the rules
to apply, but it's more efficient to avoid simplifying e2,e3 if e1 alone
is sufficient. Example: class ops
(+) dNumInt e2 e3
If we rewrite ((+) dNumInt) to plusInt, we can take advantage of the
latter's strictness when simplifying e2, e3. Moreover, suppose we have
RULE f Int = \x. x True
Then given (f Int e1) we rewrite to
(\x. x True) e1
without simpifying e1. Now we can inline x into its unique call site,
and absorb the True into it all in the same pass. If we simplified
e1 first, we couldn't do that; see Note [Avoiding exponential behaviour].
So we try to apply rules if either
(a) no_more_args: we've run out of argument that the rules can "see"
(b) nr_wanted: none of the rules wants any more arguments
-- Rules don't match
; Nothing -> rebuild env (argInfoExpr fun rev_args) cont -- No rules
} }
{-
Note [RULES apply to simplified arguments]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
It's very desirable to try RULES once the arguments have been simplified, because
......@@ -3254,10 +3332,12 @@ simplLetUnfolding env top_lvl cont_mb id new_rhs unf
is_top_lvl = isTopLevel top_lvl
is_bottoming = isBottomingId id
simplUnfolding :: SimplEnv -> TopLevelFlag -> Maybe SimplCont -> InId
simplUnfolding :: SimplEnv -> TopLevelFlag
-> Maybe SimplCont -- Just k => a join point with continuation k
-> InId
-> Unfolding -> SimplM Unfolding
-- Note [Setting the new unfolding]
simplUnfolding env top_lvl cont_mb id unf
simplUnfolding env top_lvl mb_cont id unf
= case unf of
NoUnfolding -> return unf
BootUnfolding -> return unf
......@@ -3270,10 +3350,9 @@ simplUnfolding env top_lvl cont_mb id unf
CoreUnfolding { uf_tmpl = expr, uf_src = src, uf_guidance = guide }
| isStableSource src
-> do { expr' <- if isJoinId id
then let Just cont = cont_mb
in simplJoinRhs rule_env id expr cont
else simplExpr rule_env expr
-> do { expr' <- case mb_cont of
Just cont -> simplJoinRhs rule_env id expr cont
Nothing -> simplExpr rule_env expr
; case guide of
UnfWhen { ug_arity = arity, ug_unsat_ok = sat_ok } -- Happens for INLINE things
-> let guide' = UnfWhen { ug_arity = arity, ug_unsat_ok = sat_ok
......
This diff is collapsed.
Rule fired: Class op foldr (BUILTIN)
Rule fired: Class op >> (BUILTIN)
Rule fired: Class op return (BUILTIN)
Rule fired: unpack (GHC.Base)
Rule fired: Class op foldr (BUILTIN)
Rule fired: fold/build (GHC.Base)
Rule fired: <# (BUILTIN)
Rule fired: tagToEnum# (BUILTIN)
......
......@@ -721,7 +721,8 @@ test('T9020',
# Original: 381360728
# 2014-07-31: 343005716 (Windows) (general round of updates)
# 2017-03-24: 249904136 (x86/Linux, 64-bit machine)
(wordsize(64), 493596312, 10)])
(wordsize(64), 423163832, 10)])
# prev: 795469104
# 2014-07-17: 728263536 (general round of updates)
# 2014-09-10: 785871680 post-AMP-cleanup
......@@ -734,6 +735,7 @@ test('T9020',
# 2017-02-14: 500707080 Early inline patch; 35% decrease!
# Program size collapses in first simplification
# 2017-03-31: 493596312 Fix memory leak in simplifier
# 2017-04-28: 423163832 Remove exponential behaviour in simplifier
],
compile,[''])
......@@ -1003,12 +1005,13 @@ test('T12227',
test('T12425',
[ only_ways(['optasm']),
compiler_stats_num_field('bytes allocated',
[(wordsize(64), 134334800, 5),
[(wordsize(64), 127500136, 5),
# initial: 125831400
# 2017-01-18: 133380960 Allow top-level string literals in Core
# 2017-02-17: 153611448 Type-indexed Typeable
# 2017-03-03: 142256192 Share Typeable KindReps
# 2017-03-21: 134334800 Unclear
# 2017-04-28: 127500136 Remove exponential behaviour in simplifier
]),
],
compile,
......@@ -1076,3 +1079,12 @@ test('T12707',
],
compile,
[''])
test('T13379',
[ compiler_stats_num_field('bytes allocated',
[(wordsize(64), 411597856, 5),
# initial: 411597856
]),
],
compile,
[''])
......@@ -16,18 +16,18 @@ Total ticks: 55
1 c
1 n
1 g
1 a
1 xs
1 ys
1 c
1 n
1 k
1 z
1 g
1 a
1 a
1 g
1 h
1 n
1 a
1 k
1 z
1 g
1 lvl
1 lvl
1 lvl
......@@ -47,25 +47,25 @@ Total ticks: 55
1 a
1 g
1 a
1 b
1 a
1 xs
1 ys
1 b
1 c
1 n
1 a
1 b
1 a
1 k
1 z
1 g
1 a
1 g
1 h
1 b
1 c
1 n
1 b
1 a
1 k
1 z
1 g
10 SimplifierDone 10
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment