Cmm's lack of signedness information is problematic for some platforms

Currently Cmm has a very simple model of data types:

data CmmType
  = CmmType CmmCat Width

data CmmCat                -- "Category" (not exported)
   = GcPtrCat              -- GC pointer
   | BitsCat               -- Non-pointer
   | FloatCat              -- Float
   | VecCat Length CmmCat  -- Vector

Most things are either GC pointers (corresponding to Haskell's BoxedRep) or "bits" (corresponding to the majority of the remaining RuntimeReps, including IntRep, Int32Rep, etc.). Local registers, literals, and other Cmm expressions are annotated with CmmTypes.

Machine operations (MachOps) are generally polymorphic over the width of their operands. For concreteness, here are the "types" of a few MachOps in a pseudo-Haskell syntax (see https://gitlab.haskell.org/bgamari/test-primops/-/blob/master/src/Expr.hs#L77 for a few more examples):

MO_Add     :: forall (width :: Width). CmmExpr width -> CmmExpr width -> CmmExpr width
MO_S_Mul   :: forall (width :: Width). CmmExpr width -> CmmExpr width -> CmmExpr width
MO_S_Gt    :: forall (width :: Width). CmmExpr width -> CmmExpr width -> CmmExpr WordSize

The problems come when we start thinking about signed versus unsigned arithmetic. For instance, consider this Cmm program (from #20644) running on a 64-bit machine:

test(bits64 buffer) {
  bits8 a = bits8[buffer];
  bits8 b = %quot(a, 2::bits8);
  return (b);
}

Here we load an 8-bit number, a, from a buffer. We then perform a signed division by 2 on a and return the result. Assuming the target machine has an 8-bit division operation, this program is easy to compile. However, if the target has no sub-word arithmetic (as on, e.g., AArch64) then things get much harder, as we need to start worrying about how the high bits of the register holding the value were filled: we must ensure that the value is sign-extended before we perform the division. Under a naive implementation this means that before every arithmetic operation we must sign-extend all operands, growing the program size considerably.
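To make the failure mode concrete, here is a minimal Haskell sketch that models a 64-bit register with `Word64` and performs the division at word size. All names (`Reg`, `loadZeroExt`, `loadSignExt`, `quot64`, `quot8`) are hypothetical and are not part of GHC; the input byte 0xFE (that is, -2 as a signed 8-bit value) is chosen purely for illustration.

```haskell
import Data.Int (Int8, Int64)
import Data.Word (Word8, Word64)
import Numeric (showHex)

-- Hypothetical model: a 64-bit register whose low byte holds a bits8 value.
type Reg = Word64

-- Load a byte, filling the upper 56 bits with zeros.
loadZeroExt :: Word8 -> Reg
loadZeroExt = fromIntegral

-- Load a byte, filling the upper 56 bits with copies of the sign bit.
loadSignExt :: Word8 -> Reg
loadSignExt b = fromIntegral (fromIntegral b :: Int8)

-- The only division assumed to exist on the target: signed, word-sized.
quot64 :: Reg -> Reg -> Reg
quot64 x y = fromIntegral ((fromIntegral x :: Int64) `quot` (fromIntegral y :: Int64))

-- Keep only the low 8 bits of a register.
low8 :: Reg -> Word8
low8 = fromIntegral

-- Reference semantics: signed quotient computed directly at 8 bits.
quot8 :: Word8 -> Word8 -> Word8
quot8 x y = fromIntegral ((fromIntegral x :: Int8) `quot` (fromIntegral y :: Int8))

-- Word-sized division after a zero-extending load (wrong for negative inputs).
viaZeroExt :: Word8 -> Word8
viaZeroExt a = low8 (quot64 (loadZeroExt a) 2)

-- Word-sized division after a sign-extending load (matches the reference).
viaSignExt :: Word8 -> Word8
viaSignExt a = low8 (quot64 (loadSignExt a) 2)

main :: IO ()
main = do
  let a = 0xFE :: Word8                                         -- -2 as a signed byte
  putStrLn ("reference 8-bit quot: 0x" ++ showHex (quot8 a 2) "")    -- ff (-1)
  putStrLn ("zero-extended input:  0x" ++ showHex (viaZeroExt a) "") -- 7f (254 / 2)
  putStrLn ("sign-extended input:  0x" ++ showHex (viaSignExt a) "") -- ff
```

With a zero-filled register the word-sized division sees 254 rather than -2, so the low byte of the result is 0x7f instead of 0xff. Since nothing in the Cmm type bits8 says whether a is signed, the code generator cannot tell from the type alone which extension the load should have used.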

Moreover, in some other cases we must also truncate results. For instance, consider this program:

test(bits64 buffer) {
  bits8 a = 0;
  bits8 b = %not(a);
  bits8 c = %shrl(b, 4::bits8);
  return (c);
}

This program should return 0x0f. However, if we only have a word-size complement operation, we need to take care to truncate b to 8 bits, lest the shrl inappropriately shift ones into its result (yielding the result 0xff).
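The truncation problem can be sketched the same way. The following Haskell fragment again models a 64-bit register as `Word64`; the names `not64`, `shrl64`, and `truncate8` are hypothetical stand-ins for word-sized machine operations and do not correspond to any GHC API.

```haskell
import Data.Bits (complement, shiftR, (.&.))
import Data.Word (Word8, Word64)
import Numeric (showHex)

-- Hypothetical model: a 64-bit register whose low byte holds a bits8 value.
type Reg = Word64

-- Word-sized complement: flips all 64 bits, including the 56 unused ones.
not64 :: Reg -> Reg
not64 = complement

-- Word-sized logical right shift.
shrl64 :: Reg -> Int -> Reg
shrl64 = shiftR

-- Clear everything above the low 8 bits.
truncate8 :: Reg -> Reg
truncate8 r = r .&. 0xFF

-- Naive lowering: the ones left in the high bits by not64 are shifted
-- into the low byte.
naive :: Word8
naive = fromIntegral (shrl64 (not64 0) 4)

-- Careful lowering: truncate b to 8 bits before shifting.
careful :: Word8
careful = fromIntegral (shrl64 (truncate8 (not64 0)) 4)

-- Reference semantics: both operations performed directly at 8 bits.
reference :: Word8
reference = complement 0 `shiftR` 4

main :: IO ()
main = do
  putStrLn ("reference: 0x" ++ showHex reference "")  -- f
  putStrLn ("naive:     0x" ++ showHex naive "")      -- ff
  putStrLn ("careful:   0x" ++ showHex careful "")    -- f
```

In this model the truncation is needed only because a later operation (the right shift) moves high bits downward; operations such as addition would produce a correct low byte regardless of what the high bits hold.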

Edited Nov 12, 2021 by Ben Gamari
