AArch64 ncg: `signExtendReg` and `truncateReg` seem suspicious.

This is half a note to myself, half a call to fix or improve the situation around these two functions. I haven't (yet) found anything actually going wrong with them, but if nothing else they are confusing.

`signExtendReg` looks like this:

-- | Instructions to sign-extend the value in the given register from width @w@
-- up to width @w'@.
signExtendReg :: Width -> Width -> Reg -> NatM (Reg, OrdList Instr)
signExtendReg w w' r =
    case w of
      W64 -> noop
      W32
        | w' == W32 -> noop
        | otherwise -> extend SXTH
      W16           -> extend SXTH
      W8            -> extend SXTB
      _             -> panic "intOp"
  where
    noop = return (r, nilOL)
    extend instr = do
        r' <- getNewRegNat II64
        return (r', unitOL $ instr (OpReg w' r') (OpReg w' r))

What does "up to width @w'@" really mean here? Take SXTB, for example, which is documented as:

Signed Extend Byte extracts an 8-bit value from a register, sign-extends it to the size of the register, and writes the result to the destination register.

But if we have `signExtendReg W8 W16 r` we get `sxtb w0, w0`, which sign-extends the low byte of w0 up to 32 bits, not 16. At best this is confusing; at worst it is a bug in waiting. It's probably fine to just change the docs to say:

Instructions to sign-extend the value in the given register from width @w@ up to width @opRegWidth w'@.

And maybe mention that it doesn't uphold the invariant laid out in Note [Signed arithmetic on AArch64]:

... For simplicity we maintain the invariant that a register containing a sub-word-size value always contains the zero-extended form of that value in between operations. ...
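To make the W8-to-W16 case above concrete, here is a minimal sketch that models the register contents in plain Haskell. The names `sxtbW` and `signExtend8To16` are hypothetical helpers for illustration, not part of the NCG, and the model assumes the usual AArch64 rule that writing a W register zeroes bits 63..32 of the corresponding X register.

```haskell
import Data.Int (Int8)
import Data.Word (Word16, Word32, Word64)
import Numeric (showHex)

-- Hypothetical model of `sxtb wD, wN`: take the low byte, sign-extend it to
-- 32 bits, then zero-extend into the 64-bit register (W-register write).
sxtbW :: Word64 -> Word64
sxtbW x = fromIntegral (fromIntegral (fromIntegral x :: Int8) :: Word32)

-- Hypothetical model of what "sign-extend from W8 up to W16" would leave in
-- the register if the result also had to be zero-extended above bit 15.
signExtend8To16 :: Word64 -> Word64
signExtend8To16 x = fromIntegral (fromIntegral (fromIntegral x :: Int8) :: Word16)

main :: IO ()
main = do
  let r = 0xFF :: Word64  -- the byte -1, stored zero-extended
  putStrLn ("sxtb w0, w0          -> 0x" ++ showHex (sxtbW r) "")
  putStrLn ("16 bits, zero-extended -> 0x" ++ showHex (signExtend8To16 r) "")
```

The two results differ in bits 31..16, which is exactly the gap between "width @w'@" and "width @opRegWidth w'@".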

`truncateReg` has a different and more correctness-relevant issue.


-- | Instructions to truncate the value in the given register from width @w@
-- down to width @w'@.
truncateReg :: Width -> Width -> Reg -> OrdList Instr
truncateReg w w' r =
    case w of
      W64 -> nilOL
      W32
        | w' == W32 -> nilOL
      _   -> unitOL $ UBFM (OpReg w r)
                           (OpReg w r)
                           (OpImm (ImmInt 0))
                           (OpImm $ ImmInt $ widthInBits w' - 1)

If we have x0 = 0xff...ff at 64-bit width and try to truncate to W8, upholding the invariant for sub-word values would require clearing the high bits. Instead we simply do nothing, leaving the high bits as they are.

That just seems wrong. Luckily, I believe we currently always truncate from 32 bits when dealing with sub-word values, so things still work out. But I haven't verified this, and it's definitely a bug in waiting.
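The W64 case can be sketched with a small value-level model. This is only an illustration under stated assumptions: `Width` and `widthInBits` are local stand-ins for the GHC definitions, `truncateCurrent` and `truncateExpected` are hypothetical names, and the `UBFM` with `immr = 0`, `imms = w' - 1` is modelled as masking the low `w'` bits.

```haskell
import Data.Bits (shiftL, (.&.))
import Data.Word (Word64)
import Numeric (showHex)

-- Local stand-in for GHC's Width, restricted to the widths discussed here.
data Width = W8 | W16 | W32 | W64 deriving (Eq, Show)

widthInBits :: Width -> Int
widthInBits W8  = 8
widthInBits W16 = 16
widthInBits W32 = 32
widthInBits W64 = 64

-- Mask selecting the low bits of the given width.
lowMask :: Width -> Word64
lowMask W64 = maxBound
lowMask w   = (1 `shiftL` widthInBits w) - 1

-- Model of the quoted truncateReg: no instruction when truncating from W64
-- (or from W32 to W32), otherwise keep the low w' bits.
truncateCurrent :: Width -> Width -> Word64 -> Word64
truncateCurrent W64 _   x = x
truncateCurrent W32 W32 x = x
truncateCurrent _   w'  x = x .&. lowMask w'

-- What the zero-extension invariant would require (an assumption about the
-- intended behaviour, not a proposed patch).
truncateExpected :: Width -> Word64 -> Word64
truncateExpected w' x = x .&. lowMask w'

main :: IO ()
main = do
  let x0 = maxBound :: Word64
  putStrLn ("W64 -> W8, current:  0x" ++ showHex (truncateCurrent W64 W8 x0) "")
  putStrLn ("W64 -> W8, expected: 0x" ++ showHex (truncateExpected W8 x0) "")
  putStrLn ("W32 -> W8, current:  0x" ++ showHex (truncateCurrent W32 W8 x0) "")
```

In this model the two functions agree whenever the source width is W32 or narrower and the target is narrower still, which matches the observation that things work out as long as sub-word truncation always starts from 32 bits.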

Edited Aug 07, 2023 by Andreas Klebinger
