Commit a48de37d authored by Sergei Trofimovich's avatar Sergei Trofimovich Committed by Sergei Trofimovich
Browse files

restore -fmax-worker-args handling (Trac #11565)

maxWorkerArgs handling was accidentally lost 3 years ago
in a major update of demand analysis
    commit 0831a12e



Old regression is noticeable as:
- code bloat (requires stack reshuffling)
- compilation slowdown (more code to optimise/generate)
- and increased heap usage (DynFlags unboxing/reboxing?)

On a simple compile benchmark this change causes heap
allocation drop from 70G don to 67G (ghc perf build).
Signed-off-by: default avatarSergei Trofimovich <siarheit@google.com>

Reviewers: simonpj, ezyang, goldfire, austin, bgamari

Reviewed By: simonpj, ezyang

Subscribers: thomie

Differential Revision: https://phabricator.haskell.org/D2503

GHC Trac Issues: #11565
parent da920f69
......@@ -143,7 +143,7 @@ mkWwBodies dflags fam_envs fun_ty demands res_info
wrapper_body = wrap_fn_args . wrap_fn_cpr . wrap_fn_str . applyToVars work_call_args . Var
worker_body = mkLams work_lam_args. work_fn_str . work_fn_cpr . work_fn_args
; if useful1 && not only_one_void_argument || useful2
; if is_small_enough work_args && (useful1 && not only_one_void_argument || useful2)
then return (Just (worker_args_dmds, wrapper_body, worker_body))
else return Nothing
}
......@@ -163,6 +163,10 @@ mkWwBodies dflags fam_envs fun_ty demands res_info
= True
| otherwise
= False
is_small_enough args = count isId args <= maxWorkerArgs dflags
-- See Note [Limit w/w arity]
-- We count only Free variables (isId) to skip Type, Kind
-- variables which have no runtime representation.
{-
Note [Always do CPR w/w]
......@@ -177,6 +181,30 @@ a disaster, because then the enclosing function might say it has the CPR
property, but now doesn't and there a cascade of disaster. A good example
is Trac #5920.
Note [Limit w/w arity]
~~~~~~~~~~~~~~~~~~~~~~~~
Guard against high worker arity as it generates a lot of stack traffic.
A simplified example is Trac #11565#comment:6
Current strategy is very simple: don't perform w/w transformation at all
if the result produces a wrapper with arity higher than -fmax-worker-args=.
It is a bit all or nothing, consider
f (x,y) (a,b,c,d,e ... , z) = rhs
Currently we will remove all w/w ness entirely. But actually we could
w/w on the (x,y) pair... it's the huge product that is the problem.
Could we instead refrain from w/w on an arg-by-arg basis? Yes, that'd
solve f. But we can get a lot of args from deeply-nested products:
g (a, (b, (c, (d, ...)))) = rhs
This is harder to spot on an arg-by-arg basis. Previously mkWwStr was
given some "fuel" saying how many arguments it could add; when we ran
out of fuel it would stop w/wing.
Still not very clever because it had a left-right bias.
************************************************************************
* *
......
......@@ -428,8 +428,9 @@ test('T5631',
test('parsing001',
[compiler_stats_num_field('bytes allocated',
[(wordsize(32), 274000576, 10),
(wordsize(64), 587079016, 5)]),
(wordsize(64), 581551384, 5)]),
# expected value: 587079016 (amd64/Linux)
# 2016-09-01: 581551384 (amd64/Linux) Restore w/w limit (#11565)
only_ways(['normal']),
],
compile_fail, [''])
......@@ -767,7 +768,7 @@ test('T9872d',
test('T9961',
[ only_ways(['normal']),
compiler_stats_num_field('bytes allocated',
[(wordsize(64), 568526784, 5),
[(wordsize(64), 537297968, 5),
# 2015-01-12 807117816 Initally created
# 2015-spring 772510192 Got better
# 2015-05-22 663978160 Fix for #10370 improves it more
......@@ -775,6 +776,7 @@ test('T9961',
# 2015-12-17 745044392 x86_64/Darwin Creep upwards
# 2016-03-20 519436672 x64_64/Linux Don't use build desugaring for large lists (#11707)
# 2016-03-24 568526784 x64_64/Linux Add eqInt* variants (#11688)
# 2016-09-01 537297968 x64_64/Linux Restore w/w limit (#11565)
(wordsize(32), 275264188, 5)
# was 375647160
# 2016-04-06 275264188 x86/Linux
......
......@@ -58,17 +58,19 @@ test('T4018',
test('T4029',
[stats_num_field('peak_megabytes_allocated',
[(wordsize(64), 82, 10)]),
[(wordsize(64), 71, 10)]),
# 2016-02-26: 66 (amd64/Linux) INITIAL
# 2016-05-23: 82 (amd64/Linux) Use -G1
# 2016-07-13: 92 (amd64/Linux) Changes to tidyType
# 2016-09-01: 71 (amd64/Linux) Restore w/w limit (#11565)
stats_num_field('max_bytes_used',
[(wordsize(64), 22920616, 5)]),
[(wordsize(64), 21648488, 5)]),
# 2016-02-26: 24071720 (amd64/Linux) INITIAL
# 2016-04-21: 25542832 (amd64/Linux)
# 2016-05-23: 25247216 (amd64/Linux) Use -G1
# 2016-07-13: 27575416 (amd64/Linux) Changes to tidyType
# 2016-07-20: 22920616 (amd64/Linux) Fix laziness of instance matching
# 2016-09-01: 21648488 (amd64/Linux) Restore w/w limit (#11565)
extra_hc_opts('+RTS -G1 -RTS' ),
],
ghci_script,
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment