Consider CmmFloat - Code motion to move shared subexpressions into the common path.

Consider this motivating code snippet:

maybeFlipCond :: Cond -> Maybe Cond
maybeFlipCond cond  = case cond of
        EQQ   -> Just EQQ
        NE    -> Just NE
        ...
        LE    -> Just GE
        GE    -> Just LE
        _other -> Nothing

This results in the following Cmm code, with some unrelated parts removed:

       c2uc: // global
           I64[Sp - 8] = c2tU;
           R1 = R2;
           Sp = Sp - 8;
           if (R1 & 7 != 0) goto c2tU; else goto c2tV;
       c2tV: // global
           call (I64[R1])(R1) returns to c2tU, args: 8, res: 8, upd: 8;
       c2tU: // global
           _c2u9::I64 = %MO_UU_Conv_W32_W64(I32[I64[R1 - 1] - 4]);
           if (_c2u9::I64 >= 11) goto c2tY; else goto u2uK;
       u2uK: // global
           if (_c2u9::I64 < 1) goto c2tY; else goto u2uL;
       c2tY: // global
           R1 = Nothing_closure+1;
           Sp = Sp + 8;
           call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
       u2uL: // global
           switch [1 .. 10] _c2u9::I64 {
               case 1 : goto c2tZ;
               case 2 : goto c2u0;
               case 3 : goto c2u1;
               case 4 : goto c2u2;
               case 5 : goto c2u3;
               case 6 : goto c2u4;
               case 7 : goto c2u5;
               case 8 : goto c2u6;
               case 9 : goto c2u7;
               case 10 : goto c2u8;
           }
       c2u8: // global
           R1 = maybeFlipCond1_closure+2;
           Sp = Sp + 8;
           call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
       c2u7: // global
           R1 = maybeFlipCond2_closure+2;
           Sp = Sp + 8;
           call (P64[Sp])(R1) args: 8, res: 0, upd: 8;


       ...: // global
           repeat for maybeFlipCond [3..9] _closure

           
       c2tZ: // global
           R1 = maybeFlipCond10_closure+2;
           Sp = Sp + 8;
           call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
     }

Every block that returns performs the same Sp = Sp + 8 adjustment, so there is no reason why the Sp modification couldn't be pulled out into the common path.

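This would have the following benefits:

- Reduced code size, since the adjustment is emitted once instead of once per branch.
- Possibly better performance: with more distance between the Sp modification and the indirect jump through Sp (the call), the latency between the two might improve.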
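A minimal sketch of the idea on a toy IR, to make the side conditions explicit. Everything here (`Stmt`, `Block`, `Graph`, `independent`, `hoistFrom`) is hypothetical and is not GHC's Cmm representation or an existing pass. It assumes the branch at the end of the predecessor does not depend on the hoisted statement, and it only hoists when every successor is reached from that one predecessor.

```haskell
import Control.Monad (guard)
import Data.Maybe (fromMaybe)

-- Hypothetical toy IR; these are NOT GHC's Cmm types.
type Label = String

data Stmt
  = AdjustSp Int     -- models: Sp = Sp + n
  | SetR1 String     -- models: R1 = <closure>
  deriving (Eq, Show)

data Block = Block
  { blockStmts :: [Stmt]
  , blockSuccs :: [Label]  -- branch targets; [] means the block returns
  } deriving (Eq, Show)

type Graph = [(Label, Block)]

-- Assumption: in this toy IR only an Sp adjustment and an R1 assignment
-- can be reordered with respect to each other.
independent :: Stmt -> Stmt -> Bool
independent (AdjustSp _) (SetR1 _) = True
independent (SetR1 _) (AdjustSp _) = True
independent _ _ = False

predecessors :: Graph -> Label -> [Label]
predecessors g l = [p | (p, b) <- g, l `elem` blockSuccs b]

-- Remove the first occurrence of s, provided everything before it is
-- independent of s (so s could have been executed first).
extract :: Stmt -> [Stmt] -> Maybe [Stmt]
extract s stmts = case break (== s) stmts of
  (before, _ : after) | all (independent s) before -> Just (before ++ after)
  _ -> Nothing

-- Move statement s out of every successor of block p and append it to p.
-- Fails (Nothing) unless all successors contain s and have p as their
-- only predecessor.
hoistFrom :: Label -> Stmt -> Graph -> Maybe Graph
hoistFrom p s g = do
  Block ps succs <- lookup p g
  guard (not (null succs) && p `notElem` succs)
  let strip l = do
        b <- lookup l g
        guard (predecessors g l == [p])
        rest <- extract s (blockStmts b)
        pure (l, b { blockStmts = rest })
  stripped <- traverse strip succs
  let rewrite (l, b)
        | l == p = (l, Block (ps ++ [s]) succs)
        | otherwise = (l, fromMaybe b (lookup l stripped))
  pure (map rewrite g)
```

Applied to a switch block whose arms each end in `AdjustSp 8`, `hoistFrom` leaves a single `AdjustSp 8` in the switch block and only the `SetR1` assignment in each arm. A block like c2tY above, which has two predecessors, would be rejected by the single-predecessor check in this sketch; a real pass would need a less conservative condition to cover it.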

As a knock-on effect, once the branches differ only in the constant assigned to R1, we could also transform the jump table into a lookup table, improving performance further. (See #17238)
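The jump-table-to-lookup-table step can be sketched in the same spirit. This is an illustration only, assuming each switch arm has been reduced to a single constant assignment to R1; the names `Arm`, `LookupTable`, `toLookupTable` and `selectR1` are made up for this example and do not describe what #17238 implements.

```haskell
import Data.List (sortOn)

-- Hypothetical model: after hoisting, each arm is just (tag, closure name).
type Arm = (Int, String)

data LookupTable = LookupTable
  { tableBase    :: Int       -- smallest tag covered by the table
  , tableEntries :: [String]  -- closure for tag (tableBase + index)
  } deriving (Eq, Show)

-- Build a table only when the tags form a dense, contiguous range;
-- otherwise keep the switch (modelled here as Nothing).
toLookupTable :: [Arm] -> Maybe LookupTable
toLookupTable arms = case sortOn fst arms of
  [] -> Nothing
  sorted@((lo, _) : _)
    | map fst sorted == [lo .. lo + length sorted - 1] ->
        Just (LookupTable lo (map snd sorted))
    | otherwise -> Nothing

-- Replaces the switch: one bounds check plus an indexed load.
-- Nothing corresponds to taking the default branch.
selectR1 :: LookupTable -> Int -> Maybe String
selectR1 (LookupTable lo entries) tag
  | i >= 0 && i < length entries = Just (entries !! i)
  | otherwise = Nothing
  where
    i = tag - lo
```

In the Cmm above the range check already exists (the comparisons against 1 and 11 before the switch), so the table lookup would only replace the indirect jump through the jump table.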

The prime use case for this would be Sp modifications, as this pattern is somewhat common, but it could easily be generalized into a CmmFloat pass that does the same for arbitrary shared expressions.

Edited Sep 24, 2019 by Andreas Klebinger
