Skip to content

unnecessary indirect jump when returning a case scrutinee

I happened to be looking at the Cmm for this code (ghc 7.8.3, -O2)

f :: Int -> Int
f x = if x < 0 then x else x+1

and I noticed something a bit funny about it:

       c12e:
           if ((Sp + -8) < SpLim) goto c12z; else goto c12A;
       c12z:
           R2 = R2;
           R1 = Test.f_closure;
           call (stg_gc_fun)(R2, R1) args: 8, res: 0, upd: 8;
       c12A:
           I64[Sp - 8] = c12b;
           R1 = R2;
           Sp = Sp - 8;
           if (R1 & 7 != 0) goto c12b; else goto c12c;
       c12c:
           call (I64[R1])(R1) returns to c12b, args: 8, res: 8, upd: 8;
       c12b:
           Hp = Hp + 16;
           if (Hp > HpLim) goto c12y; else goto c12x;
       c12y:
           HpAlloc = 16;
           R1 = R1;
           call stg_gc_unpt_r1(R1) returns to c12b, args: 8, res: 8, upd: 8;
       c12x:
           _s11Q::I64 = I64[R1 + 7];
           if (%MO_S_Lt_W64(_s11Q::I64, 0)) goto c12u; else goto c12v;
       c12u:
           Hp = Hp - 16;
           R1 = R1 & (-8);                              /* <--- */
           Sp = Sp + 8;
           call (I64[R1])(R1) args: 8, res: 0, upd: 8;  /* <--- */
       c12v:
           I64[Hp - 8] = GHC.Types.I#_con_info;
           I64[Hp] = _s11Q::I64 + 1;
           R1 = Hp - 7;
           Sp = Sp + 8;
           call (P64[Sp])(R1) args: 8, res: 0, upd: 8;

On the two marked lines, we untag R1 (which is x) and enter it. However, we know at this point that x is already in WHNF so we could simply return it by replacing the two lines with call (P64[Sp])(R1), if I'm not mistaken. That will save a load and an indirect jump (which we actually know is to I#_con_info, which would just retag R1 and return to the address on the stack anyways).

I think the same optimization should be available any time we do an algebraic case and in a branch simply return the scrutinee.

I looked at what it would take to fix this. It looks almost easy: if we add a new LambdaFormInfo constructor LFUnknownCon meaning that we know the identifier is bound to a saturated application of an unknown constructor, then we could set the cg_lf of the case binder variable of an algebraic case statement to LFUnknownCon, and return ReturnIt for LFUnknownCon variables in getCallMethod. I think that would do it. Does that sound right? Is there a better way?

(In my original example we actually know the constructor has to be I#. But if the case was on a type with more than one constructor we wouldn't know statically which one we got, just that it has to be one of them.)

Trac metadata
Trac field Value
Version 7.8.3
Type Bug
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component Compiler (CodeGen)
Test case
Differential revisions
BlockedBy
Related
Blocking
CC simonmar
Operating system
Architecture
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information