Generate lookup tables instead of jump tables where possible.
Motivation
Sometimes we end up with Cmm code of this kind:
uXSY: // global
switch [0 .. 22] _cXRy::I64 {
case 0 : goto cXM8;
case 1 : goto cXMc;
...
case 17, 18, 19, 20, 21 : goto cXMQ;
case 22 : goto cXNy;
}
cXNy: // global
_sOUk::P64 = lvl1345_rNxA_closure+1;
goto sOUi;
cXMQ: // global
_sOUk::P64 = lvl1344_rNxz_closure+1;
goto sOUi;
... a few more of this kind
cXM8: // global
_sOUk::P64 = lvl1339_rNxu_closure+1;
goto sOUi;
cXM4: // global
_sOUk::P64 = lvl1338_rNxt_closure+1;
goto sOUi;
(Example is from GHC's lexer)
The issue with the current code
Currently this is compiled into the following essential pieces:
#A indirect jump, branching on the value
_uXSY:
jmp *_nXUl(,%rbx,8)
# The jump table
.section .rdata
_nXUl:
.quad _cXM8
.quad _cXMc
...
.quad _cXMQ
.quad _cXNy
#And the jump target blocks
_cXMQ:
movl $lvl1344_rNxz_closure+1,%ebx
_nXUs:
movq %rax,%r14
movq %rbx,%rax
jmp _sOUi
_cXNy:
movl $lvl1345_rNxA_closure+1,%ebx
_nXUr:
movq %rax,%r14
movq %rbx,%rax
jmp _sOUi
#and here would be 6 more.
I like how this example is made even better since we accidentally duplicate the register shuffling in each of those blocks.
Proposal
However instead the code could be simply replaced with this
uXSY: // global
_sOUk::P64 = P64[table+_cXRy];
goto sOUi;
// + The lookup table
Which is just so much better in all kinds of ways.