Define sizeOf Char# to always be 4 bytes.
Motivation
Smaller code, which, all else being equal, is better code.
For example, consider this function:
elemCString# :: Char# -> Addr# -> Bool
elemCString# c addr =
    check 0#
  where
    check :: Int# -> Bool
    check nh
      | isTrue# (ch `eqChar#` '\0'#) = False
      | isTrue# (ch `eqChar#` c    ) = True
      | True                         = check (nh +# 1#)
      where
        !ch = indexCharOffAddr# addr nh
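The loop relies on indexCharOffAddr#, which reads a single 8-bit character at a byte offset and returns it as a Char#. A minimal sketch of that behaviour follows, assuming MagicHash is enabled; byteAt is a hypothetical helper introduced only for illustration and is not part of the proposal.

```haskell
{-# LANGUAGE MagicHash #-}

import GHC.Exts

-- | Hypothetical helper: read the byte at offset i from addr and box it.
-- No bounds check is performed, so the offset must stay within the
-- literal (including its terminating NUL byte).
byteAt :: Addr# -> Int -> Char
byteAt addr (I# i) = C# (indexCharOffAddr# addr i)
```

For instance, byteAt "abc"# 1 yields 'b', and byteAt "abc"# 3 yields the terminating '\0' that the first guard of check tests for.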
elemCString# compiles, as expected, to a tight loop:
[M.elemCString2#_entry() { // [R3, R2]
         { info_tbls: [(c1Ra,
                        label: M.elemCString2#_info
                        rep: HeapRep static { Fun {arity: 2 fun_type: ArgSpec 12} }
                        srt: Nothing)]
           stack_info: arg_space: 8 updfr_space: Just 8
         }
     {offset
       c1Ra: // global
           _s1Q6::I64 = R3;
           _s1Q5::I64 = R2;
           _s1Q8::I64 = 0;
           goto c1Re;
       c1Re: // global
           _s1Q9::I64 = %MO_UU_Conv_W8_W64(I8[_s1Q6::I64 + _s1Q8::I64]);
           if (_s1Q9::I64 != 0) goto c1Rm; else goto c1Rx;
       c1Rm: // global
           if (_s1Q9::I64 != _s1Q5::I64) goto c1Rt; else goto c1Ru;
       c1Rt: // global
           _s1Q8::I64 = _s1Q8::I64 + 1;
           goto c1Re;
       c1Ru: // global
           R1 = GHC.Types.True_closure+2;
           call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
       c1Rx: // global
           R1 = GHC.Types.False_closure+1;
           call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
     }
 },
However, we expand all Char# values to 64 bits, which gives us this assembly:
  a0:   31 c0                   xor    %eax,%eax
  a2:   eb 03                   jmp    a7 <M_elemCString2zh_info+0x7>
  a4:   48 ff c0                inc    %rax
  a7:   0f b6 1c 06             movzbl (%rsi,%rax,1),%ebx
  ab:   48 85 db                test   %rbx,%rbx
  ae:   74 0d                   je     bd <M_elemCString2zh_info+0x1d>
  b0:   4c 39 f3                cmp    %r14,%rbx
  b3:   75 ef                   jne    a4 <M_elemCString2zh_info+0x4>
  b5:   bb 02 00 00 00          mov    $0x2,%ebx
  ba:   ff 65 00                jmpq   *0x0(%rbp)
  bd:   bb 01 00 00 00          mov    $0x1,%ebx
  c2:   ff 65 00                jmpq   *0x0(%rbp)
Three instructions currently require the 64-bit prefix: mov, test and cmp.
If we interpret Char# as 32 bits, the code is 3 bytes smaller, which means it would fit perfectly in 32 bytes.
Proposal
Define the size of Char# as min (4 bytes, wordSize).
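As a rough illustration of the rule above, the following sketch computes the proposed size for a given word size. The names proposedCharSize and hostWordSize are hypothetical and exist only for this example; they are not GHC APIs.

```haskell
import Foreign.Storable (sizeOf)

-- | Proposed size of Char# in bytes for a given word size in bytes
-- (hypothetical helper mirroring min (4 bytes, wordSize)).
proposedCharSize :: Int -> Int
proposedCharSize wordSize = min 4 wordSize

-- | Word size, in bytes, of the platform this program is running on.
hostWordSize :: Int
hostWordSize = sizeOf (0 :: Word)
```

Under this rule both 64-bit and 32-bit targets get a 4-byte Char#; only a target whose word is narrower than 4 bytes would get a smaller one.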