
Commit 0a82ae0d, authored 3 years ago by Simon Peyton Jones; committed by Marge Bot 3 years ago

More accurate unboxing

This patch implements a fix for #20817.  It does two things:

* It ensures that the final strictness signature for a function
  accurately reflects the unboxing done by the wrapper.
  See Note [Finalising boxity for demand signatures]
  and Note [Finalising boxity for let-bound Ids]

* It adds a much better "layer-at-a-time" implementation of the
  budget for how many worker arguments we can have.
  See Note [Worker argument budget]

  Generally this leads to a bit more worker/wrapper generation,
  because instead of aborting entirely if the budget is exceeded
  (and then lying about boxity), we unbox a bit.
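
To illustrate the difference between aborting entirely and unboxing
"a bit", here is a minimal toy model.  It is only a sketch: the names
Arg, unboxLayers, workerArgs, allOrNothing and layerAtATime are
hypothetical and are not GHC's implementation, and it assumes a whole
layer of products is either unboxed or left boxed, which is likely
coarser than what the patch really does.

```haskell
-- Toy model only: every name here is hypothetical, not taken from GHC.

-- | Shape of a function argument.  In an input, 'Prod' means "a product
-- that could be unboxed"; in a result, it means "this product is unboxed".
data Arg
  = Boxed        -- passed as a single worker argument
  | Prod [Arg]   -- a product whose fields are passed separately
  deriving (Eq, Show)

-- | Unbox products down to the given number of layers; anything deeper
-- is left boxed.
unboxLayers :: Int -> Arg -> Arg
unboxLayers _ Boxed = Boxed
unboxLayers n (Prod fs)
  | n <= 0    = Boxed
  | otherwise = Prod (map (unboxLayers (n - 1)) fs)

-- | Number of worker arguments needed for one argument shape.
workerArgs :: Arg -> Int
workerArgs Boxed     = 1
workerArgs (Prod fs) = sum (map workerArgs fs)

-- | Nesting depth of unboxable products.
depth :: Arg -> Int
depth Boxed     = 0
depth (Prod fs) = 1 + maximum (0 : map depth fs)

-- | Abort entirely: unbox everything, or nothing if over budget.
allOrNothing :: Int -> [Arg] -> [Arg]
allOrNothing budget args
  | sum (map workerArgs args) <= budget = args
  | otherwise                           = map (const Boxed) args

-- | Layer at a time: unbox as many complete layers as fit in the budget.
layerAtATime :: Int -> [Arg] -> [Arg]
layerAtATime budget args = last (allBoxed : takeWhile fits candidates)
  where
    allBoxed   = map (const Boxed) args
    maxDepth   = maximum (0 : map depth args)
    candidates = [ map (unboxLayers n) args | n <- [1 .. maxDepth] ]
    fits as    = sum (map workerArgs as) <= budget

-- | Two arguments: a pair whose second field is a triple, and a pair.
exampleArgs :: [Arg]
exampleArgs =
  [ Prod [Boxed, Prod [Boxed, Boxed, Boxed]]
  , Prod [Boxed, Boxed]
  ]
```

With a budget of 4, full unboxing of exampleArgs would need 6 worker
arguments, so allOrNothing leaves both arguments boxed, whereas
layerAtATime still unboxes the outer layer of each and produces a
worker with 4 arguments.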

Binary sizes increase slightly (around 1.8%) because of the increase
in worker/wrapper generation.  The biggest effects are in GHC.Ix,
GHC.Show, GHC.IO.Handle.Internals. If we did a better job of dropping
dead code, this effect might go away.

Some nofib perf improvements:

        Program           Size    Allocs   Runtime   Elapsed  TotalMem
--------------------------------------------------------------------------------
            VSD          +1.8%     -0.5%     0.017     0.017      0.0%
         awards          +1.8%     -0.1%     +2.3%     +2.3%      0.0%
         banner          +1.7%     -0.2%     +0.3%     +0.3%      0.0%
           bspt          +1.8%     -0.1%     +3.1%     +3.1%      0.0%
          eliza          +1.8%     -0.1%     +1.2%     +1.2%      0.0%
         expert          +1.7%     -0.1%     +9.6%     +9.6%      0.0%
 fannkuch-redux          +1.8%     -0.4%     -9.3%     -9.3%      0.0%
          kahan          +1.8%     -0.1%    +22.7%    +22.7%      0.0%
       maillist          +1.8%     -0.9%    +21.2%    +21.6%      0.0%
       nucleic2          +1.7%     -5.1%     +7.5%     +7.6%      0.0%
         pretty          +1.8%     -0.2%     0.000     0.000      0.0%
reverse-complem          +1.8%     -2.5%    +12.2%    +12.2%      0.0%
           rfib          +1.8%     -0.2%     +2.5%     +2.5%      0.0%
            scc          +1.8%     -0.4%     0.000     0.000      0.0%
         simple          +1.7%     -1.3%    +17.0%    +17.0%     +7.4%
  spectral-norm          +1.8%     -0.1%     +6.8%     +6.7%      0.0%
         sphere          +1.7%     -2.0%    +13.3%    +13.3%      0.0%
            tak          +1.8%     -0.2%     +3.3%     +3.3%      0.0%
           x2n1          +1.8%     -0.4%     +8.1%     +8.1%      0.0%
--------------------------------------------------------------------------------
            Min          +1.1%     -5.1%    -23.6%    -23.6%      0.0%
            Max          +1.8%     +0.0%    +36.2%    +36.2%     +7.4%
 Geometric Mean          +1.7%     -0.1%     +6.8%     +6.8%     +0.1%

Compiler allocations in CI have a geometric mean of +0.1%; many small
decreases but there are three bigger increases (7%), all because we do
more worker/wrapper than before, so there is simply more code to
compile.  That's OK.

Perf benchmarks in perf/should_run improve in allocation by a geometric
mean of -0.2%, which is good.  None get worse.  T12996 improves by -5.8%.

Metric Decrease:
    T12996
Metric Increase:
    T18282
    T18923
    T9630

parent fbc77d3a

Showing with 823 additions and 384 deletions
