Suboptimal code produced for simple Cmm expression

The following Cmm code (produced while experimenting with a new optimization for case expressions) is compiled into the assembly shown below.

Cmm code:

==================== Post switch plan ====================
{offset
  c1Ho: // global
      _s1FE::P64 = R2;
      if ((old + 0) - <highSp> < SpLim) (likely: False) goto c1Hp; else goto c1Hq;
  c1Hp: // global
      R2 = _s1FE::P64;
      R1 = Main.$wgggg2_closure;
      call (stg_gc_fun)(R2, R1) args: 8, res: 0, upd: 8;
  c1Hq: // global
      I64[(young<c1Hg> + 8)] = c1Hg;
      R1 = _s1FE::P64;
      if (R1 & 7 != 0) goto c1Hg; else goto c1Hh;
  c1Hh: // global
      call (I64[R1])(R1) returns to c1Hg, args: 8, res: 8, upd: 8;
  c1Hg: // global
      _s1FF::P64 = R1;
      _c1Hn::P64 = _s1FF::P64 & 7;
      _u1HA::I64 = ((528 >> _c1Hn::P64) >> _c1Hn::P64) & 3;
      if (_u1HA::I64 == 1) goto c1Hl; else goto u1HB;
  c1Hl: // global
      R1 = 33;
      call (P64[(old + 8)])(R1) args: 8, res: 0, upd: 8;
  u1HB: // global
      if (_u1HA::I64 == 2) goto c1Hm; else goto c1Hk;
  c1Hm: // global
      R1 = 440;
      call (P64[(old + 8)])(R1) args: 8, res: 0, upd: 8;
  c1Hk: // global
      R1 = 44;
      call (P64[(old + 8)])(R1) args: 8, res: 0, upd: 8;
}
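For readers who want to see what the block at c1Hg computes, here is a minimal C sketch of the same expression. The function names `branch_selector` and `result_for` are made up for illustration and are not part of GHC. It assumes the constant 528 acts as a packed table of 2-bit entries indexed by the pointer tag, which is what the arithmetic suggests: 528 = 2^9 + 2^4, so only tags 2 and 4 yield a non-zero selector.

```c
#include <stdint.h>

/* Mirrors the Cmm expression ((528 >> tag) >> tag) & 3, where tag is
 * the low three bits of the evaluated closure pointer (R1 & 7).
 * Shifting twice by tag selects the 2-bit field at bit offset 2*tag. */
static uint64_t branch_selector(uint64_t tagged_ptr) {
    uint64_t tag = tagged_ptr & 7;
    return ((UINT64_C(528) >> tag) >> tag) & 3;
}

/* Mirrors the control flow after c1Hg: selector 1 goes to c1Hl (33),
 * selector 2 goes to c1Hm (440), anything else goes to c1Hk (44). */
static uint64_t result_for(uint64_t tagged_ptr) {
    switch (branch_selector(tagged_ptr)) {
    case 1:  return 33;
    case 2:  return 440;
    default: return 44;
    }
}
```

Under this reading, a pointer with tag 2 returns 33, a pointer with tag 4 returns 440, and every other tag returns 44.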

Of interest is how the block starting at label c1Hg is compiled. We get the following code:

===========================================
movl $528,%eax
movq %rbx,%rcx   <-- (***)
shrq %cl,%rax
movq %rbx,%rcx   <---(***)
shrq %cl,%rax
andl $3,%eax
cmpq $1,%rax
je .Lc1Hl
===========================================

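The issue is the duplicated instruction marked (***). %rcx is reloaded from %rbx before the second shift even though nothing modifies it in between, so the second move looks unnecessary. For comparison, here is the assembly produced by gcc for the following C program: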

unsigned long __attribute__((noinline)) h(unsigned long x) {
  return ((528ul >> x) >> x) & 3ul;
}

we get:

movl	$528, %eax
shrq	%cl, %rax
shrq	%cl, %rax
andl	$3, %eax

which looks optimal.

Edited Feb 26, 2021 by Neo
