Use appropriately sized comparison instruction for small values.
GHC currently defaults all comparisons originating from Cmm switch statements to 64 bits on x86-64.
This incurs a small overhead in instruction size. Fixing this manually gave a speedup of ~1.5% in microbenchmarks.
In detail, we generate Cmm of the form:
_s8Dg::P64 = R1;
_c8EF::P64 = _s8Dg::P64 & 7;
switch [1 .. 2] _c8EF::P64 {
case 1 : goto c8Ey;
case 2 : goto c8EC;
}
This results in assembly like:
andl $7,%ebx
cmpq $1,%rbx
The value in _c8EF clearly fits into a byte, yet the comparison is sized up to 64 bits. Changing this would let us use cmpl instead of cmpq, saving one byte (the REX.W prefix) on each comparison.
While this isn't major, in my microbenchmarks it resulted in a speedup of ~1.5% for such constructs in inner loops.
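To make the size difference concrete, here is a minimal Python sketch that hand-assembles both forms of the comparison from the example above. It is an illustration only, not GHC's code generator: the helper name `encode_cmp_imm8` and the register table are hypothetical, and only the eight legacy registers are handled. It uses the standard `83 /7 ib` encoding of CMP with a sign-extended 8-bit immediate. The narrower form is safe here because the preceding `andl` writes a 32-bit register, which zero-extends into the upper half of `%rbx`, and the masked value lies in 0..7.

```python
# Hand-assembled x86-64 encodings for "cmp $imm8, %reg" (opcode 0x83 /7 ib).
# Hypothetical helper for illustration; only the legacy registers are handled,
# so no REX.B extension bit is ever needed.

REG_NUM = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}


def encode_cmp_imm8(reg: str, imm: int, wide: bool) -> bytes:
    """Encode cmpq (wide=True) or cmpl (wide=False) of an imm8 against a register."""
    if not -128 <= imm <= 127:
        raise ValueError("immediate does not fit the imm8 form")
    # ModRM: mod=11 (register direct), reg field=7 (selects CMP), rm=register number
    modrm = 0xC0 | (7 << 3) | REG_NUM[reg]
    # A REX.W prefix (0x48) selects the 64-bit operand size; 32-bit needs no prefix
    prefix = b"\x48" if wide else b""
    return prefix + bytes([0x83, modrm, imm & 0xFF])


cmpq = encode_cmp_imm8("bx", 1, wide=True)   # cmpq $1,%rbx
cmpl = encode_cmp_imm8("bx", 1, wide=False)  # cmpl $1,%ebx

print(cmpq.hex(" "), "->", len(cmpq), "bytes")  # 48 83 fb 01 -> 4 bytes
print(cmpl.hex(" "), "->", len(cmpl), "bytes")  # 83 fb 01 -> 3 bytes
```

The one-byte saving applies to every arm of the switch that is lowered to a compare, which is why it can add up in a tight inner loop.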
Trac metadata
| Trac field | Value |
|---|---|
| Version | |
| Type | Task |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Compiler (CodeGen) |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |