Suboptimal code produced for simple Cmm expression

The following Cmm code (produced while experimenting with a new optimization for case expressions) is compiled into the assembly below:

Cmm code:

==================== Post switch plan ====================
{offset
  c1Ho: // global
      _s1FE::P64 = R2;
      if ((old + 0) - <highSp> < SpLim) (likely: False) goto c1Hp; else goto c1Hq;
  c1Hp: // global
      R2 = _s1FE::P64;
      R1 = Main.$wgggg2_closure;
      call (stg_gc_fun)(R2, R1) args: 8, res: 0, upd: 8;
  c1Hq: // global
      I64[(young<c1Hg> + 8)] = c1Hg;
      R1 = _s1FE::P64;
      if (R1 & 7 != 0) goto c1Hg; else goto c1Hh;
  c1Hh: // global
      call (I64[R1])(R1) returns to c1Hg, args: 8, res: 8, upd: 8;
  c1Hg: // global
      _s1FF::P64 = R1;
      _c1Hn::P64 = _s1FF::P64 & 7;
      _u1HA::I64 = ((528 >> _c1Hn::P64) >> _c1Hn::P64) & 3;
      if (_u1HA::I64 == 1) goto c1Hl; else goto u1HB;
  c1Hl: // global
      R1 = 33;
      call (P64[(old + 8)])(R1) args: 8, res: 0, upd: 8;
  u1HB: // global
      if (_u1HA::I64 == 2) goto c1Hm; else goto c1Hk;
  c1Hm: // global
      R1 = 440;
      call (P64[(old + 8)])(R1) args: 8, res: 0, upd: 8;
  c1Hk: // global
      R1 = 44;
      call (P64[(old + 8)])(R1) args: 8, res: 0, upd: 8;
}
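
For readers unfamiliar with the trick in block c1Hg, here is a minimal C sketch of what the expression computes. The constant 528 (binary 1000010000) acts as a packed table of 2-bit entries indexed by the pointer tag: shifting right by the tag twice is a shift by 2*tag, and the `& 3` extracts one entry. The function names `selector`, `result_for_tag` and `check_decoding` are hypothetical and only serve this illustration; the returned values are read off the Cmm branches above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mirrors the Cmm expression for _u1HA,
   ((528 >> tag) >> tag) & 3, where tag = R1 & 7. */
static uint64_t selector(uint64_t tag) {
    return ((UINT64_C(528) >> tag) >> tag) & 3;
}

/* Hypothetical helper: the value each branch of the Cmm puts in R1. */
static uint64_t result_for_tag(uint64_t tag) {
    switch (selector(tag)) {
    case 1:  return 33;   /* block c1Hl */
    case 2:  return 440;  /* block c1Hm */
    default: return 44;   /* block c1Hk */
    }
}

/* Sanity checks: only tags 2 and 4 have a non-zero table entry. */
static void check_decoding(void) {
    assert(selector(2) == 1);
    assert(selector(4) == 2);
    assert(selector(0) == 0 && selector(7) == 0);
}
```

With this encoding, tag 2 selects 33, tag 4 selects 440, and every other tag falls through to 44.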

Of interest is how the block starting at label c1Hg is compiled. We get the following code:

===========================================
movl $528,%eax
movq %rbx,%rcx   <-- (***)
shrq %cl,%rax
movq %rbx,%rcx   <-- (***)
shrq %cl,%rax
andl $3,%eax
cmpq $1,%rax
je .Lc1Hl
===========================================

At issue is the duplicated instruction marked with (***). Why is it repeated? Nothing modifies %rcx between the two shifts, so the second move looks unnecessary. Compare the assembly gcc produces for the following C program:

unsigned long __attribute__((noinline)) h(unsigned long x) {
  return ((528ul >> x) >> x) & 3ul;
}

we get:

movl	$528, %eax
shrq	%cl, %rax
shrq	%cl, %rax
andl	$3, %eax

which looks optimal.

Edited by Neo