AArch64 backend should handle code generation for bitmasks more efficiently
Currently, if one attempts to compile the program

```c
#include "Cmm.h"

hi() {
    R1 = UNTAG(R1);
    jump cont();
}
```

with the ARMv8 NCG, you get the following code:
```asm
hi:
        mov  x18, #65528
        movk x18, #65535, lsl #16
        movk x18, #65535, lsl #32
        movk x18, #65535, lsl #48
        and  x22, x22, x18
        b    cont
```
This is horrible: `UNTAG` merely clears the low three tag bits (an `and` with the constant `0xfffffffffffffff8`, i.e. `~7`), yet we spend four move instructions building that mask before the `and`, where a single `and` would do. For instance, the LLVM backend produces the following:
```
Disassembly of section .text:

0000000000000000 <hi>:
   0:   927df2d6    and   x22, x22, #0xfffffffffffffff8
   4:   94000000    bl    0 <cont>
   8:   d65f03c0    ret
```
This is possible because AArch64's logical instructions encode their immediates as bitmask patterns: any rotated run of contiguous ones, replicated across the register at a power-of-two element size, can be encoded directly, and `0xfffffffffffffff8` (61 ones followed by three zeros) is such a pattern.
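For the NCG to exploit this, it first needs a predicate for "is this constant encodable as a logical immediate?". Below is a minimal Haskell sketch of such a check; the name `isLogicalImm` is my own, not anything that exists in GHC today:

```haskell
import Data.Bits
import Data.Word

-- Is w encodable as an AArch64 logical-instruction immediate?
-- Encodable values are a run of contiguous ones, rotated, and
-- replicated across the register at a power-of-two element size;
-- all-zeros and all-ones are never encodable.
isLogicalImm :: Word64 -> Bool
isLogicalImm w
  | w == 0 || w == complement 0 = False
  | otherwise = rotatedRun (smallest 64 w)
  where
    -- shrink to the smallest power-of-two element that repeats
    smallest :: Int -> Word64 -> (Int, Word64)
    smallest size v
      | size > 2, lo == hi = smallest half lo
      | otherwise          = (size, v)
      where
        half = size `div` 2
        m    = bit half - 1
        lo   = v .&. m
        hi   = (v `shiftR` half) .&. m

    -- within an n-bit element, a rotated run of ones is either a
    -- plain shifted mask, or its complement is (the wrap-around case)
    rotatedRun :: (Int, Word64) -> Bool
    rotatedRun (n, v) = shiftedMask v || shiftedMask (complement v .&. m)
      where m = if n == 64 then complement 0 else bit n - 1

    -- 0..0 1..1 0..0 : the ones are contiguous
    shiftedMask :: Word64 -> Bool
    shiftedMask x = x /= 0 && isMask ((x - 1) .|. x)

    -- 0..0 1..1 : the ones are contiguous from bit 0
    isMask :: Word64 -> Bool
    isMask v = v /= 0 && (v + 1) .&. v == 0
```

On the mask above this yields `True`: `0xfffffffffffffff8` has no smaller repeating element, and within the full 64-bit element its ones are contiguous.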
However, even using `movn` and `and` (here, `movn x18, #7` followed by `and x22, x22, x18`, two instructions instead of five) would be a considerable improvement over the status quo; see the sketch below.