Missed optimization in record selection of ADT

Motivation

Consider an ADT like the following:

data F = A { aaaa :: Int, a1 :: String }
       | B { aaaa :: Int, b1 :: Char }
       | C { aaaa :: Int, c1 :: Double, c2 :: Int }
       | D { aaaa :: Int, d1 :: [Char] }
       | E { aaaa :: Int, e1 :: [Char] }

At issue is the code we generate for the selector

aaaa :: F -> Int

The code generated (see below) does a case statement on the tag of F (in this case a binary search) and then reads the aaaa field in each case. But since the field aaaa is at the same location in all constructors ghc could simply blindly read from that location and skip the switch statement altogether.

This wouldn't be possible if ghc choose to rearrange the fields of each case but I don't see why we would want to do that. If the fields are not rearranged, then this optimization is possible. Perhaps it is not expressible in Core but maybe in some of the lower languages it could be possible.

block_c7AF_info:
 _c7AF:
         movq %rbx,%rax
         andl $7,%eax
         cmpq $4,%rax
         jae _u7B9
 _u7B7:
         cmpq $3,%rax
         jb _u7B8
 _c7AL:
         movq 5(%rbx),%rbx
         andq $-8,%rbx
         addq $8,%rbp
         jmp *(%rbx)
 _u7B8:
         cmpq $2,%rax
         jae _c7AK
 _c7AJ:
         movq 7(%rbx),%rbx
         andq $-8,%rbx
         addq $8,%rbp
         jmp *(%rbx)
 _c7AK:
         movq 6(%rbx),%rbx
         andq $-8,%rbx
         addq $8,%rbp
         jmp *(%rbx)
 _u7B9:
         cmpq $5,%rax
         jae _c7AN
 _c7AM:
         movq 4(%rbx),%rbx
         andq $-8,%rbx
         addq $8,%rbp
         jmp *(%rbx)
 _c7AN:
         movq 3(%rbx),%rbx
         andq $-8,%rbx
         addq $8,%rbp
         jmp *(%rbx)
 _c7AQ:
         leaq X.aaaa_closure(%rip),%rbx
         jmp *-8(%r13)
         .size X.aaaa_info, .-X.aaaa_info

Edited Aug 09, 2021 by Neo

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information