We should combine a 32-bit load followed by a 32-bit register move into a single 32-bit load instruction.
Currently when we compile R2 = %MO_XX_Conv_W32_W64(I32[Sp + 8]);
we get a sequence of:
movl 8(%rbp),%eax
movl %eax,%r14d ; eax not used any further
There is no reason to have the extra move there, as far as I can tell. This should just be:
movl 8(%rbp),%r14d
It might be hard to avoid producing code like this initially, but it seems like something the register allocator should be able to fix up.
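
To make the intended fix-up concrete, here is a rough sketch of the rewrite as a peephole pass. It is written in Python for illustration only and is not GHC code: the `Instr` tuple, the `coalesce_load_move` function, and the `dead_after` set (pairs of instruction index and register known to be dead after that instruction) are all hypothetical, and liveness is assumed to be supplied by the caller rather than computed here.

```python
from typing import List, NamedTuple, Set, Tuple


class Instr(NamedTuple):
    """Hypothetical two-operand instruction in AT&T order: op src,dst."""
    op: str
    src: str
    dst: str


def is_reg(operand: str) -> bool:
    # Assumption: plain registers look like "%eax"; anything with "(" is memory.
    return operand.startswith("%") and "(" not in operand


def is_mem(operand: str) -> bool:
    return "(" in operand


def coalesce_load_move(
    instrs: List[Instr], dead_after: Set[Tuple[int, str]]
) -> List[Instr]:
    """Fold `movl mem,tmp; movl tmp,dst` into `movl mem,dst`.

    The fold is applied only when tmp is dead after the second instruction,
    i.e. (index_of_second_instr, tmp) is in dead_after.
    """
    out: List[Instr] = []
    i = 0
    while i < len(instrs):
        cur = instrs[i]
        nxt = instrs[i + 1] if i + 1 < len(instrs) else None
        if (
            nxt is not None
            and cur.op == "movl" and nxt.op == "movl"
            and is_mem(cur.src) and is_reg(cur.dst)
            and nxt.src == cur.dst and is_reg(nxt.dst)
            and (i + 1, cur.dst) in dead_after
        ):
            # Load straight into the final destination and drop the temporary.
            out.append(Instr("movl", cur.src, nxt.dst))
            i += 2
        else:
            out.append(cur)
            i += 1
    return out


code = [Instr("movl", "8(%rbp)", "%eax"), Instr("movl", "%eax", "%r14d")]
for ins in coalesce_load_move(code, {(1, "%eax")}):
    print(f"{ins.op} {ins.src},{ins.dst}")
```

The liveness check is the important part: without knowing that %eax is unused afterwards, the pair has to be left alone. Both sequences leave the same value in %r14, since a 32-bit move to a register zero-extends into the upper 32 bits on x86-64.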