Consider CmmFloat - Code motion to move shared subexpressions into the common path.

Consider this motivating code snippet:

maybeFlipCond :: Cond -> Maybe Cond
maybeFlipCond cond  = case cond of
        EQQ   -> Just EQQ
        NE    -> Just NE
        ...
        LE    -> Just GE
        GE    -> Just LE
        _other -> Nothing

This results in the following Cmm code, with some unrelated parts removed:

       c2uc: // global
           I64[Sp - 8] = c2tU;
           R1 = R2;
           Sp = Sp - 8;
           if (R1 & 7 != 0) goto c2tU; else goto c2tV;
       c2tV: // global
           call (I64[R1])(R1) returns to c2tU, args: 8, res: 8, upd: 8;
       c2tU: // global
           _c2u9::I64 = %MO_UU_Conv_W32_W64(I32[I64[R1 - 1] - 4]);
           if (_c2u9::I64 >= 11) goto c2tY; else goto u2uK;
       u2uK: // global
           if (_c2u9::I64 < 1) goto c2tY; else goto u2uL;
       c2tY: // global
           R1 = Nothing_closure+1;
           Sp = Sp + 8;
           call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
       u2uL: // global
           switch [1 .. 10] _c2u9::I64 {
               case 1 : goto c2tZ;
               case 2 : goto c2u0;
               case 3 : goto c2u1;
               case 4 : goto c2u2;
               case 5 : goto c2u3;
               case 6 : goto c2u4;
               case 7 : goto c2u5;
               case 8 : goto c2u6;
               case 9 : goto c2u7;
               case 10 : goto c2u8;
           }
       c2u8: // global
           R1 = maybeFlipCond1_closure+2;
           Sp = Sp + 8;
           call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
       c2u7: // global
           R1 = maybeFlipCond2_closure+2;
           Sp = Sp + 8;
           call (P64[Sp])(R1) args: 8, res: 0, upd: 8;


       ...: // global
           repeat for maybeFlipCond [3..9] _closure

           
       c2tZ: // global
           R1 = maybeFlipCond10_closure+2;
           Sp = Sp + 8;
           call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
     }

In this case there is no reason why the Sp modification (Sp = Sp + 8), which is repeated in every block that returns, couldn't be pulled out into the common path.

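This would have the following benefits:

  • Reducing code size, since the adjustment is emitted once instead of once per branch.
  • It might improve performance by putting more distance, and so more time to hide latency, between the Sp modification and the indirect jump through Sp (the call).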

As a knock-on effect, the switch targets would then differ only in the closure assigned to R1, so we could also transform the jump table into a lookup table, improving performance further (see #17238).
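To illustrate the lookup-table idea, here is a minimal Haskell sketch over a toy model. The types `Arm`, `toLookupTable` and `selectClosure` are hypothetical names made up for this example and are not part of GHC. It only lists the three switch arms shown above, and a `Map` stands in for what would really be a static table in the data section.

```haskell
import qualified Data.Map.Strict as M

-- Hypothetical model of a switch arm once the Sp adjustment is out of the way.
data Arm
  = ReturnClosure String   -- R1 = <closure>; call (P64[Sp])(R1)
  | Other                  -- anything more complicated
  deriving (Eq, Show)

-- If every arm merely returns a static closure, the switch can be replaced
-- by a table from scrutinee value to closure. Otherwise give up.
toLookupTable :: [(Int, Arm)] -> Maybe (M.Map Int String)
toLookupTable arms = M.fromList <$> traverse entry arms
  where
    entry (tag, ReturnClosure c) = Just (tag, c)
    entry _                      = Nothing

-- Replacement for the jump: R1 = table[tag], with the default arm
-- (c2tY in the dump above) as fallback.
selectClosure :: String -> M.Map Int String -> Int -> String
selectClosure def table tag = M.findWithDefault def tag table

-- Only the arms visible in the dump above; the elided ones are left out.
shownArms :: [(Int, Arm)]
shownArms =
  [ (1,  ReturnClosure "maybeFlipCond10_closure+2")
  , (9,  ReturnClosure "maybeFlipCond2_closure+2")
  , (10, ReturnClosure "maybeFlipCond1_closure+2")
  ]
```

With such a table the generated code would need one load and one return instead of one block per constructor.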

The prime use case for this would be Sp modifications, as this pattern is fairly common. But it could easily be generalized into a CmmFloat pass that does the same for arbitrary shared expressions.
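A rough sketch of what such a pass could look like, written against a toy IR rather than GHC's real Cmm types. All names here (`Stmt`, `Block`, `hoistInto`, ...) are hypothetical. It assumes the simplest case only: every successor has the branching block as its sole predecessor, and a statement may move past another if neither writes a register the other touches. A real pass would need dominator information to handle blocks like c2tY, which is reachable from two places.

```haskell
import Data.List (delete, nub)
import qualified Data.Map.Strict as M

type Label = String

data Reg = Sp | R1 | Local String deriving (Eq, Show)

-- A statement is modelled only by the register it writes and those it reads.
data Stmt = Stmt { defReg :: Reg, useRegs :: [Reg], shown :: String }
  deriving (Eq, Show)

data Exit
  = Branch [Reg] [Label]   -- registers read by the condition, possible targets
  | Return                 -- stands in for: call (P64[Sp])(R1)
  deriving (Eq, Show)

data Block = Block { stmts :: [Stmt], exit :: Exit } deriving (Eq, Show)

type Graph = M.Map Label Block

-- Two statements may be swapped if neither writes what the other touches.
commute :: Stmt -> Stmt -> Bool
commute a b =
  defReg a /= defReg b
    && defReg a `notElem` useRegs b
    && defReg b `notElem` useRegs a

-- A statement can move to the top of a block if it occurs there and
-- commutes with everything in front of it.
floatable :: Stmt -> Block -> Bool
floatable s blk = case break (== s) (stmts blk) of
  (before, _ : _) -> all (commute s) before
  _               -> False

-- Number of blocks that branch to the given label.
predCount :: Graph -> Label -> Int
predCount g l = length [ () | Block _ (Branch _ ts) <- M.elems g, l `elem` ts ]

-- Move one statement shared by all successors of block l to the end of l.
hoistInto :: Label -> Graph -> Graph
hoistInto l g = case M.lookup l g of
  Just (Block ss (Branch cond ts))
    | l `notElem` ts
    , let targets = nub ts
    , Just succs@(b0 : _) <- mapM (`M.lookup` g) targets
    , all ((== 1) . predCount g) targets
    , (s : _) <- [ c | c <- stmts b0
                     , all (floatable c) succs
                     , defReg c `notElem` cond ]  -- must not change the condition
    -> let strip b = b { stmts = delete s (stmts b) }
           g'      = foldr (M.adjust strip) g targets
       in M.insert l (Block (ss ++ [s]) (Branch cond ts)) g'
  _ -> g

spAdj :: Stmt
spAdj = Stmt Sp [Sp] "Sp = Sp + 8"

setR1 :: String -> Stmt
setR1 c = Stmt R1 [] ("R1 = " ++ c)

-- The switch block and the three arms shown in the dump above.
example :: Graph
example = M.fromList
  [ ("u2uL", Block [] (Branch [Local "_c2u9"] ["c2u8", "c2u7", "c2tZ"]))
  , ("c2u8", Block [setR1 "maybeFlipCond1_closure+2", spAdj] Return)
  , ("c2u7", Block [setR1 "maybeFlipCond2_closure+2", spAdj] Return)
  , ("c2tZ", Block [setR1 "maybeFlipCond10_closure+2", spAdj] Return)
  ]
```

Evaluating `hoistInto "u2uL" example` moves the Sp adjustment into u2uL and leaves each arm with only its R1 assignment. Running the function repeatedly until nothing changes would float further shared statements.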

Edited by Andreas Klebinger