More dense jump table encoding

While looking at the code generation changes necessary for #21019 (closed) I noticed that gcc and clang both exploit an optimisation in their treatment of PIC jump tables that we currently don't. Specifically, consider a program like:

int extract_b(int a, int b, int c) {
    switch (a) {
    case 0: return b*c;
    case 1: return b+c;
    case 2: return b*c;
    case 3: return a+b;
    case 4: return a+2;
    case 5: return b;
    }
}

gcc will produce a jump table consisting of .long offsets:

        .text
        .p2align 4
        .globl  extract_b
        .type   extract_b, @function
extract_b:
        .cfi_startproc
        cmpl    $5, %edi
        ja      .L2
        leaq    .L4(%rip), %rcx
        movl    %edi, %edi
        movslq  (%rcx,%rdi,4), %rax
        addq    %rcx, %rax
        jmp     *%rax
        .section        .rodata
        .align 4
        .align 4
.L4:
        .long   .L7-.L4
        .long   .L8-.L4
        .long   .L7-.L4
        .long   .L6-.L4
        .long   .L5-.L4
        .long   .L9-.L4
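
The dispatch sequence above (the `movslq` load, the `addq`, then the indirect `jmp`) can be sketched in C as plain address arithmetic. This is only an illustration: `jump_target` and its parameters are hypothetical names, and the function computes the address that would be jumped to rather than performing the jump.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from gcc or GHC: computes the address that
 * the indirect jump would target. Each table entry holds the signed 32-bit
 * distance from the start of the table (.L4) to a case label. */
static uintptr_t jump_target(uintptr_t table_base,
                             const int32_t *table,
                             uint32_t index)
{
    /* movslq (%rcx,%rdi,4), %rax : load the entry and sign-extend it */
    int64_t offset = table[index];

    /* addq %rcx, %rax : add the table's own address back in.
     * Unsigned wrap-around makes negative offsets work as expected. */
    return table_base + (uintptr_t)offset;
}
```

Because the entries are differences between two labels in the same object, they need no load-time relocation, which is what makes this encoding suitable for PIC.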

By contrast, GHC produces a jump table of .quads, so each entry takes 8 bytes rather than 4. The .long encoding is slightly more cache-efficient and may therefore be slightly faster in tight loops.
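
To put rough numbers on the size difference, here is a small sketch comparing the two encodings. The helper names are hypothetical, and the 64-byte cache line is an assumption (typical for x86-64, not guaranteed); the table is also assumed to start on a cache line boundary.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size; typical for x86-64 but not guaranteed. */
enum { CACHE_LINE_BYTES = 64 };

/* Bytes occupied by a table of n 32-bit relative offsets (.long entries). */
static size_t long_table_bytes(size_t n) { return n * sizeof(int32_t); }

/* Bytes occupied by a table of n 64-bit entries (.quad entries). */
static size_t quad_table_bytes(size_t n) { return n * sizeof(uint64_t); }

/* Cache lines spanned by a table of the given size, assuming the table
 * starts on a cache line boundary. */
static size_t cache_lines(size_t bytes)
{
    return (bytes + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES;
}
```

For the six-entry table in the example this gives 24 bytes against 48, both of which fit in one assumed cache line. The difference only starts to show for larger tables: with 64 entries, the .long table spans 4 lines and the .quad table spans 8.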
