    Refactor the story around switches (#10137) · de1160be
    Joachim Breitner authored
    This re-implements the code generation for case expressions at the Stg →
    Cmm level, both for data type cases as well as for integral literal
    cases. (Cases on floats are still treated as before.)
    The goal is to allow for fancier strategies in implementing them, for a
    cleaner separation of the strategy from the gritty details of Cmm, and
    to run this later than the Common Block Optimization, allowing for one
    way to attack #10124. The new module CmmSwitch contains a number of
    notes explaining these changes. For example, the new code creates larger
    consecutive jump tables than the previous code, if possible.
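
    To illustrate what separating the strategy from the Cmm details can
    look like, here is a minimal sketch of one possible strategy: group
    the sorted case labels into dense runs, emit a jump table for each
    run that is large enough, and fall back to plain comparisons for the
    rest. This is not the code in CmmSwitch; the names Plan, maxGap,
    minTableSize, runs and plan, and both threshold values, are
    assumptions made up for this example.

```haskell
import Data.List (sort)

-- | How one group of case labels might be implemented (hypothetical type).
data Plan
  = JumpTable Integer Integer  -- ^ one table covering lowest..highest label
  | Single Integer             -- ^ an isolated label, tested by comparison
  deriving (Eq, Show)

-- Illustrative thresholds, chosen for this sketch only.
maxGap :: Integer
maxGap = 7        -- largest hole tolerated between neighbouring labels

minTableSize :: Int
minTableSize = 5  -- fewest labels for which a table is considered worthwhile

-- | Split ascending labels into runs whose neighbours differ by at most maxGap.
runs :: [Integer] -> [[Integer]]
runs = foldr step []
  where
    step x ((y:ys):rest)
      | y - x <= maxGap = (x:y:ys) : rest  -- close enough: extend the run
    step x rest = [x] : rest               -- otherwise start a new run

-- | Choose an implementation for every run of (distinct) labels.
plan :: [Integer] -> [Plan]
plan labels = concatMap toPlan (runs (sort labels))
  where
    toPlan run@(lo:_)
      | length run >= minTableSize = [JumpTable lo (last run)]
    toPlan run = map Single run
```

    With these thresholds, plan [1,2,3,4,5,100] yields one table for
    labels 1 to 5 and a single comparison for 100. Because the grouping
    is a pure function over the labels, it can be changed or tested
    without touching the code that emits the actual branches.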
    nofib shows only a small overall improvement in runtime. The
    rather large wobbling comes from changes in the code block order
    (see #8082, not much we can do about it). But the decrease in code size
    alone makes this worthwhile.
            Program           Size    Allocs   Runtime   Elapsed  TotalMem
                Min          -1.8%      0.0%     -6.1%     -6.1%     -2.9%
                Max          -0.7%     +0.0%     +5.6%     +5.7%     +7.8%
     Geometric Mean          -1.4%     -0.0%     -0.3%     -0.3%     +0.0%
    Compilation time increases slightly:
            -1 s.d.                -----            -2.0%
            +1 s.d.                -----            +2.5%
            Average                -----            +0.3%
    The test case T783 regresses a lot, but it is the only one exhibiting
    any regression. The cause is the changed order of branches in an
    if-then-else tree, which makes the Hoopl data flow analysis traverse
    the blocks in a suboptimal order. Reverting that gets rid of this
    regression, but has a consistent, if only very small (+0.2%), negative
    effect on runtime. So I conclude that this test is an extreme outlier
    and no reason to change the code.
    Differential Revision: https://phabricator.haskell.org/D720