Incorrect handling of subword Clz/Ctz (or other CallishMachOp) in wasm backend
When debugging #22462 (closed), I noticed that the wasm backend may not be correctly lowering MO_Clz/MO_Ctz or other CallishMachOps for subwords. Consider the assembly of hs_clz8:
hs_clz8:                                # @hs_clz8
    .functype hs_clz8 (i32) -> (i32)
# %bb.0:
    local.get 0
    i32.const 255
    i32.and
    local.tee 0
    i32.clz
    i32.const -24
    i32.add
    i32.const 8
    local.get 0
    i32.select
                                        # fallthrough-return
    end_function
But our inline lowering only emits a single i32.clz instruction, which only works for MO_Clz on 32-bit integers. A similar problem may very well be present in other CallishMachOps. These bugs were not caught by godbolt.hs because it lacked the ability to inspect the assembly code of the CallishMachOps, which live in the ghc-prim C code.
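To make the concern concrete, here is a minimal C sketch of the difference between a bare 32-bit clz and the adjusted sequence in the hs_clz8 listing above. All function names here are hypothetical and exist only for illustration; none of them are part of ghc-prim or the wasm backend. clz32 is written as a plain loop that returns 32 for a zero input, which is what i32.clz does.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit count-leading-zeros with clz32(0) == 32, like wasm's i32.clz.
   A portable loop, so no compiler builtin is assumed. */
uint32_t clz32(uint32_t x) {
    uint32_t n = 0;
    for (uint32_t bit = UINT32_C(1) << 31; bit != 0 && (x & bit) == 0; bit >>= 1) {
        n++;
    }
    return n;
}

/* What a bare i32.clz computes when handed an 8-bit operand:
   the 24 upper bits are counted as leading zeros too. */
uint32_t clz8_naive(uint32_t x) {
    return clz32(x);
}

/* Mirrors the hs_clz8 listing: mask to 8 bits, clz32, add -24,
   and select 8 when the masked operand is zero. */
uint32_t clz8_adjusted(uint32_t x) {
    uint32_t masked = x & 0xFF;
    return masked != 0 ? clz32(masked) - 24 : 8;
}

/* Independent reference: scan the 8 bits from the top. */
uint32_t clz8_reference(uint8_t x) {
    uint32_t n = 0;
    for (uint32_t bit = 0x80; bit != 0 && (x & bit) == 0; bit >>= 1) {
        n++;
    }
    return n;
}

/* Exhaustive check over all 8-bit inputs: the adjusted version agrees
   with the reference, the naive one is off by exactly 24. */
void check_clz8(void) {
    for (uint32_t x = 0; x <= 0xFF; x++) {
        assert(clz8_adjusted(x) == clz8_reference((uint8_t)x));
        assert(clz8_naive(x) == clz8_reference((uint8_t)x) + 24);
    }
}
```

For example, clz8_naive(0x01) gives 31, whereas the 8-bit answer is 7. The naive version is also only that predictable when the upper 24 bits of the operand are already zero; with garbage in the upper bits the result can be anything, which is why the listing masks with 255 first.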
It's necessary to manually go over all the assembly output of the ghc-prim cbits, and change our inline lowering logic to ccalls where appropriate.