Small comparisons are inefficient on AArch64

While looking at #23222 (closed) I noticed the following code in stg_BLACKHOLE_info:

   0x00000000015291b0 <+104>:   sxtw    x18, w18
   0x00000000015291b4 <+108>:   mov     w14, #0x1a                      // #26
   0x00000000015291b8 <+112>:   cmp     x18, x14
   0x00000000015291bc <+116>:   b.cc    0x15292ac <stg_BLACKHOLE_info+356>  // b.lo, b.ul, b.last
   0x00000000015291c0 <+120>:   mov     w14, #0x1c                      // #28
   0x00000000015291c4 <+124>:   cmp     x18, x14
   0x00000000015291c8 <+128>:   b.cc    0x1529298 <stg_BLACKHOLE_info+336>  // b.lo, b.ul, b.last
   0x00000000015291cc <+132>:   mov     w14, #0x1d                      // #29
   0x00000000015291d0 <+136>:   cmp     x18, x14
   0x00000000015291d4 <+140>:   b.cc    0x15292a4 <stg_BLACKHOLE_info+348>  // b.lo, b.ul, b.last
   0x00000000015291d8 <+144>:   mov     w14, #0x40                      // #64
   0x00000000015291dc <+148>:   cmp     x18, x14
   0x00000000015291e0 <+152>:   b.ne    0x1529288 <stg_BLACKHOLE_info+320>  // b.any

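Each of these mov/cmp pairs should instead be a single cmp with an immediate operand: the constants involved (26, 28, 29 and 64) all fit in the 12-bit immediate field of cmp, so materialising them in w14 first costs an extra instruction and a scratch register per comparison.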
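
As a rough illustration of when the immediate form is available, the check below is a minimal sketch of the encoding rule for the AArch64 cmp immediate (an unsigned 12-bit value, optionally shifted left by 12 bits). The helper name `fits_cmp_immediate` is hypothetical and is not taken from GHC's code generator; negative constants, which an assembler may handle through the cmn alias, are not covered.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not taken from GHC: returns true when `value` can be
 * encoded directly as the immediate operand of an AArch64 `cmp`.
 * The instruction accepts an unsigned 12-bit immediate, optionally
 * shifted left by 12 bits. */
bool fits_cmp_immediate(uint64_t value)
{
    /* Plain 12-bit immediate: 0 .. 4095. */
    if (value <= 0xFFF)
        return true;

    /* Shifted form: low 12 bits must be zero, and the remaining
     * bits must themselves fit in 12 bits. */
    return (value & 0xFFF) == 0 && (value >> 12) <= 0xFFF;
}
```

All four constants in the listing above (26, 28, 29 and 64) pass the first test, so none of them needs a separate mov.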

For instance, given similar code, clang will produce:

        cmp     w0, #26
        b.eq    .LBB0_4
        cmp     w0, #64
        b.eq    .LBB0_5
        cmp     w0, #28
        b.ne    .LBB0_6
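
The C source behind the clang output is not shown here. A small switch over a few constants, such as the hypothetical sketch below, is the kind of input that would plausibly lead to compare-with-immediate sequences of this shape. The function name `classify` and its return values are made up for illustration, and the exact output will depend on the clang version and optimisation level.

```c
/* Hypothetical source, not part of the original report: dispatches on a
 * small integer tag using the same constants that appear in the clang
 * output (26, 64 and 28). The return values are arbitrary. */
int classify(unsigned int tag)
{
    switch (tag) {
    case 26:
        return 1;
    case 64:
        return 2;
    case 28:
        return 3;
    default:
        return 0;
    }
}
```

Compiling a function like this for an AArch64 target with optimisation enabled (for example `clang --target=aarch64-linux-gnu -O2 -S`) is one way to inspect the comparisons clang chooses.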