Missed opt.: Fold arithmetic into LEA
I noticed a few cases, where NCG could do better on x86, folding arithmetic into a single LEA instruction:
-- Unboxed makes it easier to see:
foo :: Int# -> Int#
foo x = 42# +# 2# *# x
-- Cmm: R1 = (R2 << 1) + 42;
-- ASM: shlq $1,%r14
leaq 42(%r14),%rbx
--
-- Could be: leaq 42(%r14,%r14), %rbx
-- Same but additional load from memory
foo2 :: Int -> Int
foo2 x = 42 + 2*x
-- Cmm: _sAY::I64 = (I64[R1 + 7] << 1) + 42;
-- ASM: movq 7(%rbx),%rax
-- shlq $1,%rax
-- addq $42,%rax
--
-- Could be: movq 7(%rbx),%rcx
leaq 42(%rcx,%rcx), %rax
-- LEA can do "base + scale * index + displacement":
foo3 :: Int# -> Int# -> Int#
foo3 x y = x +# y *# 4# -# 4#
-- Cmm: R1 = R2 + (R3 << 2) - 4;
-- ASM: shlq $2,%rsi
-- addq %rsi,%r14
-- leaq -4(%r14),%rbx
--
-- Could be: leaq -4(%r14,%rsi,4), %rbx
I tried it out bc. I saw a demonstration of GCC doing this.
For the first use, LEA is already used, so a simple rule for merging SHIFTL and ADD should work.
Just leaving this issue here as a reminder, when I or someone else has too much time on their hand :)