Commit 9dbf66d7 authored by Ben Gamari's avatar Ben Gamari 🐢

Revert "Simplify callSiteInline a little"

This lead to some rather significant performance regressions.
Specifically,

    bytes allocated value is too high:
        Expected    T5631(normal) bytes allocated: 1106015512 +/-5%
        Lower bound T5631(normal) bytes allocated: 1050714736
        Upper bound T5631(normal) bytes allocated: 1161316288
        Actual      T5631(normal) bytes allocated: 1164953208
        Deviation   T5631(normal) bytes allocated:        5.3 %
    *** unexpected stat test failure for T5631(normal)
    max_bytes_used value is too high:
        Expected    T9630(normal) max_bytes_used: 35324712 +/-15%
        Lower bound T9630(normal) max_bytes_used: 30026005
        Upper bound T9630(normal) max_bytes_used: 40623419
        Actual      T9630(normal) max_bytes_used: 43490984
        Deviation   T9630(normal) max_bytes_used:     23.1 %
    *** unexpected stat test failure for T9630(normal)

This reverts commit 7271db46.
This reverts commit b750dcc5.
This reverts commit 33de71fa.
parent df6670e7
......@@ -1151,11 +1151,11 @@ callSiteInline dflags id active_unfolding lone_variable arg_infos cont_info
-- idUnfolding checks for loop-breakers, returning NoUnfolding
-- Things with an INLINE pragma may have an unfolding *and*
-- be a loop breaker (maybe the knot is not yet untied)
CoreUnfolding { uf_tmpl = unf_template
CoreUnfolding { uf_tmpl = unf_template, uf_is_top = is_top
, uf_is_work_free = is_wf
, uf_guidance = guidance, uf_expandable = is_exp }
| active_unfolding -> tryUnfolding dflags id lone_variable
arg_infos cont_info unf_template
arg_infos cont_info unf_template is_top
is_wf is_exp guidance
| otherwise -> traceInline dflags id "Inactive unfolding:" (ppr id) Nothing
NoUnfolding -> Nothing
......@@ -1175,10 +1175,10 @@ traceInline dflags inline_id str doc result
= result
tryUnfolding :: DynFlags -> Id -> Bool -> [ArgSummary] -> CallCtxt
-> CoreExpr -> Bool -> Bool -> UnfoldingGuidance
-> CoreExpr -> Bool -> Bool -> Bool -> UnfoldingGuidance
-> Maybe CoreExpr
tryUnfolding dflags id lone_variable
arg_infos cont_info unf_template
arg_infos cont_info unf_template is_top
is_wf is_exp guidance
= case guidance of
UnfNever -> traceInline dflags id str (text "UnfNever") Nothing
......@@ -1250,10 +1250,10 @@ tryUnfolding dflags id lone_variable
CaseCtxt -> not (lone_variable && is_exp) -- Note [Lone variables]
ValAppCtxt -> True -- Note [Cast then apply]
RuleArgCtxt -> uf_arity > 0 -- See Note [Unfold info lazy contexts]
DiscArgCtxt -> uf_arity > 0 -- Note [Inlining in ArgCtxt]
DiscArgCtxt -> uf_arity > 0 --
RhsCtxt -> uf_arity > 0 --
_other -> False -- See Note [Nested functions]
_ -> not is_top && uf_arity > 0 -- Note [Nested functions]
-- Note [Inlining in ArgCtxt]
{-
Note [Unfold into lazy contexts], Note [RHS of lets]
......@@ -1323,17 +1323,18 @@ However for worker/wrapper it may be worth inlining even if the
arity is not satisfied (as we do in the CoreUnfolding case) so we don't
require saturation.
Note [Nested functions]
~~~~~~~~~~~~~~~~~~~~~~~
At one time we treated a call of a non-top-level function as
"interesting" (regardless of how boring the context) in the hope
that inlining it would eliminate the binding, and its allocation.
Specifically, in the default case of interesting_call we had
_other -> not is_top && uf_arity > 0
But actually postInlineUnconditionally does some of this and overall
it makes virtually no difference to nofib. So I simplified away this
special case
If a function has a nested defn we also record some-benefit, on the
grounds that we are often able to eliminate the binding, and hence the
allocation, for the function altogether; this is good for join points.
But this only makes sense for *functions*; inlining a constructor
doesn't help allocation unless the result is scrutinised. UNLESS the
constructor occurs just once, albeit possibly in multiple case
branches. Then inlining it doesn't increase allocation, but it does
increase the chance that the constructor won't be allocated at all in
the branches that don't use it.
Note [Cast then apply]
~~~~~~~~~~~~~~~~~~~~~~
......
......@@ -449,7 +449,7 @@ test('T5631',
# 2014-04-04: 346389856 (x86 Windows, 64 bit machine)
# 2014-12-01: 390199244 (Windows laptop)
# 2016-04-06: 570137436 (amd64/Linux) many reasons
(wordsize(64), 1164944688, 5)]),
(wordsize(64), 1106015512, 5)]),
# expected value: 774595008 (amd64/Linux):
# expected value: 735486328 (amd64/Linux) 2012/12/12:
# expected value: 690742040 (amd64/Linux) Call Arity improvements
......@@ -464,9 +464,8 @@ test('T5631',
# 2017-02-17: 1517484488 (amd64/Linux) Type-indexed Typeable
# 2017-03-03: 1065147968 (amd64/Linux) Share Typeable KindReps
# 2017-03-31: 1037482512 (amd64/Linux) Fix memory leak in simplifier
# 2017-07-27: 1106015512 (Mac) Regresssion from tracking visibility in
# TypeEqOrigin should be fixed by #14037
# 2018-05-05: 1164944688 (amd64/Linux) Simplify callSiteInline a little
# 2017-07-27: 1106015512 (Mac) Regresssion from tracking visibility in TypeEqOrigin
# should be fixed by #14037
only_ways(['normal'])
],
compile,
......@@ -1271,11 +1270,10 @@ test ('T9630',
[ compiler_stats_num_field('max_bytes_used', # Note [residency]
[(platform('x86_64-unknown-mingw32'), 39867088, 15),
# 2017-12-24: 34171816 (x64/Windows)
(wordsize(64), 42664296, 15)
(wordsize(64), 35324712, 15)
# initial: 56955240
# 2017-06-07: 41568168 Stop the specialiser generating loopy code
# 2018-02-25: 35324712 It's not entirely clear
# 2018-05-05: 42664296 Don't inline nested function bindings as aggressively
]),
extra_clean(['T9630a.hi', 'T9630a.o'])
],
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment