simplifier: Kill off ufKeenessFactor
We used to have another factor, ufKeenessFactor, which would scale the
discounts before they were subtracted from the size. This was justified with
the following comment:
-- We multiple the raw discounts (args_discount and result_discount)
-- ty opt_UnfoldingKeenessFactor because the former have to do with
-- *size* whereas the discounts imply that there's some extra
-- *efficiency* to be gained (e.g. beta reductions, case reductions)
-- by inlining.
However, this is highly suspect since it means that we subtract a scaled size
from an absolute size, resulting in crazy (e.g. negative) scores in some cases
(#15304). We consequently killed off ufKeenessFactor and bumped up the
ufUseThreshold to compensate.
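The arithmetic described above can be sketched as follows. This is a deliberately simplified model, not GHC's actual code: the function names are hypothetical, the keenness factor of 1.5 and the thresholds of 60 and 80 are illustrative assumptions, and the real heuristic involves more terms than a single size and a single raw discount.

```python
# Simplified, hypothetical model of the inlining score; not GHC's implementation.

def score_old(size: int, raw_discount: int, keenness: float) -> int:
    """Old rule: scale the raw discount by the keenness factor, then subtract."""
    return size - round(keenness * raw_discount)


def score_new(size: int, raw_discount: int) -> int:
    """New rule: subtract the raw discount from the size without scaling."""
    return size - raw_discount


def small_enough(score: int, threshold: int) -> bool:
    """An unfolding is considered for inlining when its score is within the threshold."""
    return score <= threshold


# Illustrative numbers only (assumptions, not measurements).
KEENNESS = 1.5        # assumed value of the removed factor
OLD_THRESHOLD = 60    # assumed previous threshold
NEW_THRESHOLD = 80    # value chosen in this change

# A large discount scaled by the factor can exceed the size itself.
negative_score = score_old(100, 80, KEENNESS)   # 100 - 120
unscaled_score = score_new(100, 80)             # 100 - 80

# A borderline case: accepted under the old rule, rejected by the new rule
# at the old threshold, and accepted again once the threshold is raised.
accepted_old = small_enough(score_old(120, 40, KEENNESS), OLD_THRESHOLD)
rejected_new_old_thresh = small_enough(score_new(120, 40), OLD_THRESHOLD)
accepted_new = small_enough(score_new(120, 40), NEW_THRESHOLD)
```

In this toy model the first case yields a negative score under the old rule, and the second case shows why dropping the scaling without raising the threshold would make inlining less eager.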
Measurements
Since this removes a discount from our inlining heuristic, I revisited our
default choice of -funfolding-use-threshold to minimize the change in
overall inlining behavior. Specifically, I measured runtime allocations and
executable size of nofib and the testsuite performance tests built using
compilers (and core libraries) built with several values of
-funfolding-use-threshold.
For each of these configurations I show a histogram of relative changes for each
test as well as a table of summary statistics (geometric mean, minimum, maximum,
and median of the change relative to the baseline). The value of
-funfolding-use-threshold for each histogram can be found in its title.
In light of these measurements I settled on a new value of 80.
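For readers who want to reproduce this kind of summary, the sketch below shows one way to compute the four statistics from per-test metrics. It is an assumption about the method, not the script used for these measurements: the function name, the dictionary layout, and the sample numbers are all hypothetical.

```python
from statistics import geometric_mean, median


def summarize(baseline: dict, candidate: dict) -> dict:
    """Summarize per-test ratios candidate/baseline for tests present in both runs."""
    ratios = [candidate[name] / baseline[name]
              for name in baseline if name in candidate]
    return {
        "gmean": geometric_mean(ratios),
        "min": min(ratios),
        "max": max(ratios),
        "median": median(ratios),
    }


# Made-up metric values (e.g. bytes allocated) for three hypothetical tests.
baseline_run = {"test_a": 100.0, "test_b": 200.0, "test_c": 400.0}
candidate_run = {"test_a": 50.0, "test_b": 200.0, "test_c": 800.0}

summary = summarize(baseline_run, candidate_run)
```

A ratio below 1 means the candidate configuration used less of the measured resource than the baseline. The geometric mean is the usual choice for averaging ratios, since a halving and a doubling cancel out, as they do in this sample data.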
Nofib
Compile-time allocations
thresh | gmean | min | max | median |
---|---|---|---|---|
50 | 1.009859 | 0.535136 | 1.063890 | 1.016143 |
60 | 1.011910 | 0.538791 | 1.059250 | 1.017167 |
70 | 1.003993 | 0.532537 | 1.115937 | 1.008415 |
80 | 1.000903 | 0.852942 | 1.108481 | 1.000142 |
90 | 1.002691 | 0.904027 | 1.239811 | 1.000483 |
100 | 0.984152 | 0.891669 | 1.244622 | 0.982516 |
baseline | 1.000000 | 1.000000 | 1.000000 | 1.000000 |
Runtime allocations
thresh | gmean | min | max | median |
---|---|---|---|---|
50 | 1.001580 | 0.998387 | 1.044254 | 1.000001 |
60 | 1.000943 | 0.999963 | 1.030995 | 1.000001 |
70 | 1.000398 | 0.999110 | 1.022908 | 1.000000 |
80 | 1.000167 | 0.988344 | 1.022907 | 1.000000 |
90 | 0.999800 | 0.974135 | 1.018649 | 1.000000 |
100 | 1.000048 | 0.974126 | 1.075062 | 1.000000 |
baseline | 1.000000 | 1.000000 | 1.000000 | 1.000000 |
Executable size
thresh | gmean | min | max | median |
---|---|---|---|---|
50 | 0.936441 | 0.929623 | 0.951843 | 0.935396 |
60 | 0.949585 | 0.940280 | 0.961669 | 0.948934 |
70 | 0.970964 | 0.961623 | 0.976944 | 0.970822 |
80 | 0.992470 | 0.978397 | 0.995790 | 0.992652 |
90 | 1.002443 | 0.995920 | 1.005479 | 1.002560 |
100 | 1.013584 | 1.007997 | 1.016386 | 1.013737 |
baseline | 1.000000 | 1.000000 | 1.000000 | 1.000000 |
Testsuite
Compile-time allocations
thresh | gmean | min | max | median |
---|---|---|---|---|
50 | 1.061249 | 0.918177 | 1.673109 | 1.029108 |
60 | 1.038988 | 0.916109 | 1.438926 | 1.028437 |
70 | 1.027486 | 0.933324 | 1.457676 | 1.015059 |
80 | 1.017348 | 0.940708 | 1.084090 | 1.012769 |
90 | 1.016218 | 0.929523 | 1.082438 | 1.011991 |
100 | 0.995357 | 0.890169 | 1.046244 | 0.999866 |
baseline | 1.000000 | 1.000000 | 1.000000 | 1.000000 |
Runtime allocations
thresh | gmean | min | max | median |
---|---|---|---|---|
50 | 1.172722 | 1.000000 | 5276.500195 | 1.000147 |
60 | 1.170927 | 1.000000 | 5276.495115 | 1.000000 |
70 | 1.170634 | 1.000000 | 5276.495115 | 1.000000 |
80 | 1.170648 | 1.000000 | 5276.495115 | 1.000000 |
90 | 1.166662 | 0.930010 | 4690.329230 | 1.000000 |
100 | 0.994455 | 0.857952 | 1.002668 | 1.000000 |
baseline | 1.000000 | 1.000000 | 1.000000 | 1.000000 |