Commit 2effe18a authored by Simon Peyton Jones, committed by David Feuer

The Early Inline Patch

This very small patch switches on sm_inline even in the InitialPhase
(aka "gentle" phase).   There is no reason not to... and the results
are astonishing.

I think the performance of GHC itself improves by about 5%; and some
programs get much smaller, quicker.  Result: across-the-board
improvements in compile-time performance.  Here are the changes in
perf/compiler; the numbers are decreases in compiler bytes-allocated:

  3%   T5837
  7%   parsing001
  9%   T12234
  35%  T9020
  9%   T3064
  13%  T9961
  20%  T13056
  5%   T9872d
  5%   T9872c
  5%   T9872b
  7%   T9872a
  5%   T783
  35%  T12227
  20%  T1969

Plus in perf/should_run

  5%   lazy-bs-alloc
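
As a rough cross-check of the percentages above, the decrease can be recomputed from the old and new expected values recorded in the test-file changes of this commit. The sketch below does that for T9020 and T9961; the names `percentDecrease`, `expectedValues` and `report` are made up for this illustration. Note that an expected value in the test file is not necessarily the baseline that was measured just before the patch, so for other tests the recomputed figure may differ from the one quoted here.

```haskell
-- Relative decrease of a metric, as a percentage of the old value.
percentDecrease :: Integer -> Integer -> Double
percentDecrease old new =
  100 * fromIntegral (old - new) / fromIntegral old

-- (test name, old expected value, new expected value): the wordsize(64)
-- 'bytes allocated' entries that this commit changes for these two tests.
expectedValues :: [(String, Integer, Integer)]
expectedValues =
  [ ("T9020", 764866144, 500707080)
  , ("T9961", 571246936, 498326216)
  ]

-- Pair each test name with its recomputed percentage decrease.
report :: [(String, Double)]
report =
  [ (name, percentDecrease old new) | (name, old, new) <- expectedValues ]
```

Loaded into GHCi, `percentDecrease 764866144 500707080` comes out at roughly 34.5 and `percentDecrease 571246936 498326216` at roughly 12.8, in line with the 35% and 13% quoted for T9020 and T9961.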

It wasn't as easy as it sounds: I did a raft of preparatory work in
earlier patches.  But it's great!
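
To make the effect of sm_inline concrete, here is a minimal, self-contained model of the decision that the new activeUnfolding in this patch makes. It is a sketch only: the types `SimplMode` and `IdInfo`, their fields, and the body of `isActiveSketch` are simplified stand-ins invented for illustration (in particular, the phase rules in `isActiveSketch` are an assumption about how activations behave, not a copy of GHC's code).

```haskell
-- Simplified stand-ins for GHC's types (hypothetical, illustration only).
data CompilerPhase = InitialPhase | Phase Int
  deriving (Eq, Show)

data Activation
  = AlwaysActive
  | NeverActive
  | ActiveBefore Int  -- active only before phase n (phase numbers count down)
  | ActiveAfter Int   -- active in phase n and later
  deriving (Eq, Show)

data SimplMode = SimplMode
  { smPhase  :: CompilerPhase
  , smInline :: Bool            -- models sm_inline
  }

data IdInfo = IdInfo
  { compulsory :: Bool          -- models isCompulsoryUnfolding
  , activation :: Activation    -- models idInlineActivation
  }

-- Assumed model of "is this activation live in this phase?".
isActiveSketch :: CompilerPhase -> Activation -> Bool
isActiveSketch InitialPhase AlwaysActive     = True
isActiveSketch InitialPhase (ActiveBefore _) = True
isActiveSketch InitialPhase _                = False
isActiveSketch (Phase _)    AlwaysActive     = True
isActiveSketch (Phase _)    NeverActive      = False
isActiveSketch (Phase p)    (ActiveBefore n) = p > n
isActiveSketch (Phase p)    (ActiveAfter n)  = p <= n

-- Mirrors the shape of the new activeUnfolding: compulsory unfoldings
-- always inline; otherwise the Id must be active AND sm_inline must be on.
activeUnfoldingSketch :: SimplMode -> IdInfo -> Bool
activeUnfoldingSketch mode i
  | compulsory i = True
  | otherwise    = isActiveSketch (smPhase mode) (activation i)
                   && smInline mode
```

In this model, an always-active Id is inlinable in the InitialPhase only when `smInline` is True, which is the switch the patch flips; a compulsory unfolding is inlinable regardless.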

Reviewers: austin, bgamari

Subscribers: thomie

Differential Revision: https://phabricator.haskell.org/D3203
parent 55efc971
@@ -132,6 +132,7 @@ getCoreToDo dflags
  rules_on = gopt Opt_EnableRewriteRules dflags
  eta_expand_on = gopt Opt_DoLambdaEtaExpansion dflags
  ww_on = gopt Opt_WorkerWrapper dflags
+ vectorise_on = gopt Opt_Vectorise dflags
  static_ptrs = xopt LangExt.StaticPointers dflags
  maybe_rule_check phase = runMaybe rule_check (CoreDoRuleCheck phase)
@@ -160,12 +161,12 @@ getCoreToDo dflags
  -- We need to eliminate these common sub expressions before their definitions
  -- are inlined in phase 2. The CSE introduces lots of v1 = v2 bindings,
  -- so we also run simpl_gently to inline them.
- ++ (if gopt Opt_Vectorise dflags && phase == 3
+ ++ (if vectorise_on && phase == 3
  then [CoreCSE, simpl_gently]
  else [])
  vectorisation
- = runWhen (gopt Opt_Vectorise dflags) $
+ = runWhen vectorise_on $
  CoreDoPasses [ simpl_gently, CoreDoVectorisation ]
  -- By default, we have 2 phases before phase 0.
@@ -188,7 +189,8 @@ getCoreToDo dflags
  (base_mode { sm_phase = InitialPhase
  , sm_names = ["Gentle"]
  , sm_rules = rules_on -- Note [RULEs enabled in SimplGently]
- , sm_inline = False
+ , sm_inline = not vectorise_on
+ -- See Note [Inline in InitialPhase]
  , sm_case_case = False })
  -- Don't do case-of-case transformations.
  -- This makes full laziness work better
@@ -381,7 +383,35 @@ addPluginPasses builtin_passes
  query_plug todos (_, plug, options) = installCoreToDos plug options todos
  #endif
  {-
+ Note [Inline in InitialPhase]
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ In GHC 8 and earlier we did not inline anything in the InitialPhase. But that is
+ confusing for users because when they say INLINE they expect the function to inline
+ right away.
+ So now we do inlining immediately, even in the InitialPhase, assuming that the
+ Id's Activation allows it.
+ This is a surprisingly big deal. Compiler performance improved a lot
+ when I made this change:
+ perf/compiler/T5837.run T5837 [stat too good] (normal)
+ perf/compiler/parsing001.run parsing001 [stat too good] (normal)
+ perf/compiler/T12234.run T12234 [stat too good] (optasm)
+ perf/compiler/T9020.run T9020 [stat too good] (optasm)
+ perf/compiler/T3064.run T3064 [stat too good] (normal)
+ perf/compiler/T9961.run T9961 [stat too good] (normal)
+ perf/compiler/T13056.run T13056 [stat too good] (optasm)
+ perf/compiler/T9872d.run T9872d [stat too good] (normal)
+ perf/compiler/T783.run T783 [stat too good] (normal)
+ perf/compiler/T12227.run T12227 [stat too good] (normal)
+ perf/should_run/lazy-bs-alloc.run lazy-bs-alloc [stat too good] (normal)
+ perf/compiler/T1969.run T1969 [stat too good] (normal)
+ perf/compiler/T9872a.run T9872a [stat too good] (normal)
+ perf/compiler/T9872c.run T9872c [stat too good] (normal)
+ perf/compiler/T9872b.run T9872b [stat too good] (normal)
+ perf/compiler/T9872d.run T9872d [stat too good] (normal)
  Note [RULEs enabled in SimplGently]
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  RULES are enabled when doing "gentle" simplification. Two reasons:
......
@@ -721,7 +721,8 @@ updModeForRules current_mode
  {- Note [Simplifying rules]
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- When simplifying a rule, refrain from any inlining or applying of other RULES.
+ When simplifying a rule LHS, refrain from /any/ inlining or applying
+ of other RULES.
  Doing anything to the LHS is plain confusing, because it means that what the
  rule matches is not what the user wrote. c.f. Trac #10595, and #10528.
@@ -868,11 +869,17 @@ continuation.
  -}
  activeUnfolding :: SimplEnv -> Id -> Bool
- activeUnfolding env
-   | not (sm_inline mode) = active_unfolding_minimal
-   | otherwise = case sm_phase mode of
-       InitialPhase -> active_unfolding_gentle
-       Phase n -> active_unfolding n
+ activeUnfolding env id
+   | isCompulsoryUnfolding (realIdUnfolding id)
+   = True -- Even sm_inline can't override compulsory unfoldings
+   | otherwise
+   = isActive (sm_phase mode) (idInlineActivation id)
+     && sm_inline mode
+     -- `or` isStableUnfolding (realIdUnfolding id)
+     -- Inline things when
+     -- (a) they are active
+     -- (b) sm_inline says so, except that for stable unfoldings
+     -- (ie pragmas) we inline anyway
    where
      mode = getMode env
@@ -891,35 +898,13 @@ getUnfoldingInRuleMatch env
  id_unf id | unf_is_active id = idUnfolding id
            | otherwise = NoUnfolding
  unf_is_active id
-   | not (sm_rules mode) = active_unfolding_minimal id
+   | not (sm_rules mode) = -- active_unfolding_minimal id
+                           isStableUnfolding (realIdUnfolding id)
+     -- Do we even need to test this? I think this InScopeEnv
+     -- is only consulted if activeRule returns True, which
+     -- never happens if sm_rules is False
    | otherwise = isActive (sm_phase mode) (idInlineActivation id)
- active_unfolding_minimal :: Id -> Bool
- -- Compuslory unfoldings only
- -- Ignore SimplGently, because we want to inline regardless;
- -- the Id has no top-level binding at all
- --
- -- NB: we used to have a second exception, for data con wrappers.
- -- On the grounds that we use gentle mode for rule LHSs, and
- -- they match better when data con wrappers are inlined.
- -- But that only really applies to the trivial wrappers (like (:)),
- -- and they are now constructed as Compulsory unfoldings (in MkId)
- -- so they'll happen anyway.
- active_unfolding_minimal id = isCompulsoryUnfolding (realIdUnfolding id)
- active_unfolding :: PhaseNum -> Id -> Bool
- active_unfolding n id = isActiveIn n (idInlineActivation id)
- active_unfolding_gentle :: Id -> Bool
- -- Anything that is early-active
- -- See Note [Gentle mode]
- active_unfolding_gentle id
-   = isInlinePragma prag
-   && isEarlyActive (inlinePragmaActivation prag)
-   -- NB: wrappers are not early-active
-   where
-     prag = idInlinePragma id
  ----------------------
  activeRule :: SimplEnv -> Activation -> Bool
  -- Nothing => No rules at all
@@ -1027,10 +1012,11 @@ Example
  ...fInt...fInt...fInt...
  Here f occurs just once, in the RHS of fInt. But if we inline it there
- we'll lose the opportunity to inline at each of fInt's call sites.
- The INLINE pragma will only inline when the application is saturated
- for exactly this reason; and we don't want PreInlineUnconditionally
- to second-guess it. A live example is Trac #3736.
+ it might make fInt look big, and we'll lose the opportunity to inline f
+ at each of fInt's call sites. The INLINE pragma will only inline when
+ the application is saturated for exactly this reason; and we don't
+ want PreInlineUnconditionally to second-guess it. A live example is
+ Trac #3736.
  c.f. Note [Stable unfoldings and postInlineUnconditionally]
  Note [Top-level bottoming Ids]
......
@@ -15,10 +15,9 @@ src<debug.hs:6:1-21>
  src<debug.hs:(3,1)-(5,29)>
  src<debug.hs:3:9>
  src<debug.hs:4:9>
- src<debug.hs:5:25-29>
+ src<debug.hs:5:21-29>
  src<debug.hs:5:9-29>
  src<debug.hs:6:1-21>
- src<debug.hs:6:16-21>
  == CBE ==
  src<debug.hs:4:9>
  89
@@ -3,10 +3,11 @@ test('T10858',
  [(platform('x86_64-unknown-mingw32'), 272402736, 8),
  # 2017-02-19 272402736 (x64/Windows) - unknown
- (wordsize(64), 304094944, 8) ]),
+ (wordsize(64), 275357824, 8) ]),
  # Initial: 476296112
  # 2016-12-19 247768192 Join points (#19288)
  # 2016-02-12 304094944 Type-indexed Typeable
+ # 2016-02-25 275357824 Early inline patch
  only_ways(['normal'])],
  compile,
  ['-O'])
Rule fired: Class op signum Rule fired: Class op signum
Rule fired: Class op abs Rule fired: Class op abs
Rule fired: normalize/Double
Rule fired: Class op HEq_sc
Rule fired: Class op HEq_sc Rule fired: Class op HEq_sc
Rule fired: normalize/Double
Rule fired: Class op HEq_sc Rule fired: Class op HEq_sc
Rule fired: unpack
Rule fired: Class op >>
Rule fired: Class op return
Rule fired: Class op foldr
Rule fired: Class op >>
Rule fired: Class op return
Rule fired: Class op foldr
Rule fired: Class op >> Rule fired: Class op >>
Rule fired: Class op return Rule fired: Class op return
Rule fired: unpack
Rule fired: Class op foldr Rule fired: Class op foldr
Rule fired: fold/build Rule fired: fold/build
Rule fired: <# Rule fired: <#
......
@@ -68,6 +68,7 @@ test('T1969',
  # 2014-06-29 5949188 (x86/Linux)
  # 2015-07-11 6241108 (x86/Linux, 64bit machine) use +RTS -G1
  # 2016-04-06 9093608 (x86/Linux, 64bit machine)
  (wordsize(64), 19924328, 15)]),
  # 2014-09-10 10463640, 10 # post-AMP-update (somewhat stabelish)
  # looks like the peak is around ~10M, but we're
@@ -81,6 +82,8 @@ test('T1969',
  # 2015-10-28 15017528 (amd64/Linux) emit typeable at definition site
  # 2016-10-12 17285216 (amd64/Linux) it's not entirely clear why
  # 2017-02-01 19924328 (amd64/Linux) Join points (#12988)
+ # 2017-02-14 16393848 Early inline patch
  compiler_stats_num_field('bytes allocated',
  [(platform('i386-unknown-mingw32'), 301784492, 5),
  # 215582916 (x86/Windows)
@@ -97,7 +100,7 @@ test('T1969',
  # 2014-06-29 303300692 (x86/Linux)
  # 2015-07-11 288699104 (x86/Linux, 64-bit machine) use +RTS -G1
  # 2016-04-06 344730660 (x86/Linux, 64-bit machine)
- (wordsize(64), 831733376, 5)]),
+ (wordsize(64), 695354904, 5)]),
  # 2009-11-17 434845560 (amd64/Linux)
  # 2009-12-08 459776680 (amd64/Linux)
  # 2010-05-17 519377728 (amd64/Linux)
@@ -119,6 +122,7 @@ test('T1969',
  # 2015-10-28 695430728 (x86_64/Linux) emit Typeable at definition site
  # 2015-10-28 756138176 (x86_64/Linux) inst-decl defaults go via typechecker (#12220)
  # 2017-02-17 831733376 (x86_64/Linux) Type-indexed Typeable
+ # 2017-02-25 695354904 (x86_64/Linux) Early inlining patch
  only_ways(['normal']),
  extra_hc_opts('-dcore-lint -static'),
@@ -316,7 +320,7 @@ test('T3064',
  # 2014-12-22: 122836340 (Windows) Death to silent superclasses
  # 2016-04-06: 153261024 (x86/Linux) probably wildcard refactor
- (wordsize(64), 306222424, 5)]),
+ (wordsize(64), 259815560, 5)]),
  # (amd64/Linux) (2011-06-28): 73259544
  # (amd64/Linux) (2013-02-07): 224798696
  # (amd64/Linux) (2013-08-02): 236404384, increase from roles
@@ -340,6 +344,7 @@ test('T3064',
  # (amd64/Linux) (2016-04-15): 287460128 Improvement due to using coercionKind instead
  # of zonkTcType (Trac #11882)
  # (amd64/Darwin) (2017-01-23): 306222424 Presumably creep from recent changes (Typeable?)
+ # (amd64/Linux) (2017-02-14): 259815560 Early inline patch: 9% improvement
  ###################################
  # deactivated for now, as this metric became too volatile recently
@@ -445,10 +450,11 @@ test('T5631',
  test('parsing001',
  [compiler_stats_num_field('bytes allocated',
  [(wordsize(32), 274000576, 10),
- (wordsize(64), 493730288, 5)]),
+ (wordsize(64), 463931280, 5)]),
  # expected value: 587079016 (amd64/Linux)
  # 2016-09-01: 581551384 (amd64/Linux) Restore w/w limit (#11565)
  # 2016-12-19: 493730288 (amd64/Linux) Join points (#12988)
+ # 2017-02-14: 463931280 Early inlining patch; acutal improvement 7%
  only_ways(['normal']),
  ],
  compile_fail, [''])
@@ -467,7 +473,7 @@ test('T783',
  # 2014-12-22: 235002220 (Windows) not sure why
  # 2016-04-06: 249332816 (x86/Linux, 64-bit machine)
- (wordsize(64), 488592288, 10)]),
+ (wordsize(64), 436978192, 10)]),
  # prev: 349263216 (amd64/Linux)
  # 07/08/2012: 384479856 (amd64/Linux)
  # 29/08/2012: 436927840 (amd64/Linux)
@@ -496,6 +502,8 @@ test('T783',
  # (D1535: Major overhaul of pattern match checker, #11162)
  # 2016-02-03: 488592288 (amd64/Linux)
  # (D1795: Another overhaul of pattern match checker, #11374)
+ # 2017-02-14 436978192 Early inlining: 5% improvement
  extra_hc_opts('-static')
  ],
  compile,[''])
@@ -510,7 +518,7 @@ test('T5321Fun',
  # 2014-09-03: 299656164 (specialisation and inlining)
  # 2014-12-10: 206406188 # Improvements in constraint solver
  # 2016-04-06: 279922360 x86/Linux
- (wordsize(64), 524706256, 5)])
+ (wordsize(64), 488295304, 5)])
  # prev: 585521080
  # 2012-08-29: 713385808 # (increase due to new codegen)
  # 2013-05-15: 628341952 # (reason for decrease unknown)
@@ -535,6 +543,7 @@ test('T5321Fun',
  # "drift" reported above
  # 2017-01-31: 498135752 # Join points (#12988)
  # 2017-02-23: 524706256 # Type-indexed Typeable? (on Darwin)
+ # 2017-02-25: 488295304 # Early inlining patch
  ],
  compile,[''])
@@ -617,7 +626,7 @@ test('T5837',
  (platform('x86_64-unknown-mingw32'), 59161648, 7),
  # 2017-02-19 59161648 (x64/Windows) - Unknown
- (wordsize(64), 54151864, 7)])
+ (wordsize(64), 52625920, 7)])
  # sample: 3926235424 (amd64/Linux, 15/2/2012)
  # 2012-10-02 81879216
  # 2012-09-20 87254264 amd64/Linux
@@ -652,6 +661,7 @@ test('T5837',
  # Also bumped acceptance threshold to 7%.
  # 2017-02-20 58648600 amd64/Linux Type-indexed Typeable
  # 2017-02-28 54151864 amd64/Linux Likely drift due to recent simplifier improvements
+ # 2017-02-25 52625920 amd64/Linux Early inlining patch
  ],
  compile, ['-freduction-depth=50'])
@@ -688,7 +698,7 @@ test('T9020',
  [(wordsize(32), 343005716, 10),
  # Original: 381360728
  # 2014-07-31: 343005716 (Windows) (general round of updates)
- (wordsize(64), 764866144, 10)])
+ (wordsize(64), 500707080, 10)])
  # prev: 795469104
  # 2014-07-17: 728263536 (general round of updates)
  # 2014-09-10: 785871680 post-AMP-cleanup
@@ -698,6 +708,8 @@ test('T9020',
  # 2016-04-06: 852298336 Refactoring of CSE #11781
  # 2016-04-06: 698401736 Use thenIO in Applicative IO
  # 2017-02-03: 764866144 Join points
+ # 2017-02-14: 500707080 Early inline patch; 35% decrease!
+ # Program size collapses in first simplification
  ],
  compile,[''])
@@ -750,7 +762,7 @@ test('T9675',
  test('T9872a',
  [ only_ways(['normal']),
  compiler_stats_num_field('bytes allocated',
- [(wordsize(64), 3298422648, 5),
+ [(wordsize(64), 3005891848, 5),
  # 2014-12-10 5521332656 Initally created
  # 2014-12-16 5848657456 Flattener parameterized over roles
  # 2014-12-18 2680733672 Reduce type families even more eagerly
@@ -758,6 +770,8 @@ test('T9872a',
  # 2016-04-07 3352882080 CSE improvements
  # 2016-10-19 3134866040 Refactor traceRn interface (#12617)
  # 2017-02-17 3298422648 Type-indexed Typeable
+ # 2017-02-25 3005891848 Early inlining patch
  (wordsize(32), 1740903516, 5)
  # was 1325592896
  # 2016-04-06 1740903516 x86/Linux
@@ -769,7 +783,7 @@ test('T9872a',
  test('T9872b',
  [ only_ways(['normal']),
  compiler_stats_num_field('bytes allocated',
- [(wordsize(64), 4069522928, 5),
+ [(wordsize(64), 3730686224, 5),
  # 2014-12-10 6483306280 Initally created
  # 2014-12-16 6892251912 Flattener parameterized over roles
  # 2014-12-18 3480212048 Reduce type families even more eagerly
@@ -777,6 +791,8 @@ test('T9872b',
  # 2016-02-08 4918990352 Improved a bit by tyConRolesRepresentational
  # 2016-04-06: 4600233488 Refactoring of CSE #11781
  # 2016-09-15: 4069522928 Fix #12422
+ # 2017-02-14 3730686224 Early inlining: 5% improvement
  (wordsize(32), 2422750696, 5)
  # was 1700000000
  # 2016-04-06 2422750696 x86/Linux
@@ -787,7 +803,7 @@ test('T9872b',
  test('T9872c',
  [ only_ways(['normal']),
  compiler_stats_num_field('bytes allocated',
- [(wordsize(64), 3702580928, 5),
+ [(wordsize(64), 3404346032, 5),
  # 2014-12-10 5495850096 Initally created
  # 2014-12-16 5842024784 Flattener parameterized over roles
  # 2014-12-18 2963554096 Reduce type families even more eagerly
@@ -795,6 +811,8 @@ test('T9872c',
  # 2016-02-08 4454071184 Improved a bit by tyConRolesRepresentational
  # 2016-04-06: 4306667256 Refactoring of CSE #11781
  # 2016-09-15: 3702580928 Fixing #12422
+ # 2017-02-14 3404346032 Early inlining: 5% improvement
  (wordsize(32), 2257242896, 5)
  # was 1500000000
  # 2016-04-06 2257242896
@@ -805,7 +823,7 @@ test('T9872c',
  test('T9872d',
  [ only_ways(['normal']),
  compiler_stats_num_field('bytes allocated',
- [(wordsize(64), 535565128, 5),
+ [(wordsize(64), 498855104, 5),
  # 2014-12-18 796071864 Initally created
  # 2014-12-18 739189056 Reduce type families even more eagerly
  # 2015-01-07 687562440 TrieMap leaf compression
@@ -816,6 +834,8 @@ test('T9872d',
  # 2016-12-05 478169352 using tyConIsTyFamFree, I think, but only
  # a 1% improvement 482 -> 478
  # 2017-02-17 535565128 Type-indexed Typeable
+ # 2017-02-25 498855104 Early inlining
  (wordsize(32), 264566040, 5)
  # some date 328810212
  # 2015-07-11 350369584
@@ -828,7 +848,7 @@ test('T9872d',
  test('T9961',
  [ only_ways(['normal']),
  compiler_stats_num_field('bytes allocated',
- [(wordsize(64), 571246936, 5),
+ [(wordsize(64), 498326216, 5),
  # 2015-01-12 807117816 Initally created
  # 2015-spring 772510192 Got better
  # 2015-05-22 663978160 Fix for #10370 improves it more
@@ -838,6 +858,8 @@ test('T9961',
  # 2016-03-24 568526784 x64_64/Linux Add eqInt* variants (#11688)
  # 2016-09-01 537297968 x64_64/Linux Restore w/w limit (#11565)
  # 2016-12-19 571246936 x64_64/Linux Join points (#12988)
+ # 2017-02-14 498326216 Early inline patch; 13% improvement
  (wordsize(32), 275264188, 5)
  # was 375647160
  # 2016-04-06 275264188 x86/Linux
@@ -872,7 +894,7 @@ test('T9233',
  test('T10370',
  [ only_ways(['optasm']),
  compiler_stats_num_field('max_bytes_used', # Note [residency]
- [(wordsize(64), 51126304, 15),
+ [(wordsize(64), 41291976, 15),
  # 2015-10-22 19548720
  # 2016-02-24 22823976 Changing Levity to RuntimeRep; not sure why this regresses though, even after some analysis
  # 2016-04-14 28256896 final demand analyzer run
@@ -887,16 +909,19 @@ test('T10370',
  # See the comment 16 on #8472.
  # 2017-02-17 51126304 Type-indexed Typeawble
  # 2017-02-27 43455848 Likely drift from recent simplifier improvements
+ # 2017-02-25 41291976 Early inline patch
  (wordsize(32), 11371496, 15),
  # 2015-10-22 11371496
  ]),
  compiler_stats_num_field('peak_megabytes_allocated', # Note [residency]
- [(wordsize(64), 187, 15),
+ [(wordsize(64), 154, 15),
  # 2015-10-22 76
  # 2016-04-14 101 final demand analyzer run
  # 2016-08-08 121 see above
  # 2017-01-18 146 Allow top-level string literals in Core