Bad register allocation for inner loop
Summary
This definition compiles suboptimally:
{-# language MagicHash #-}
{-# language BangPatterns #-}
{-# language ScopedTypeVariables #-}
-- {-# options -fregs-graph #-}
module M where
import GHC.CString
import GHC.Exts
check :: Char# -> Char# -> Addr# -> Bool
check c ch addr
| isTrue# (ch `eqChar#` '\0'#) = False
| isTrue# (ch `eqChar#` c ) = True
| True =
let !addr' = (addr `plusAddr#` 1#)
!ch' = indexCharOffAddr# addr' 0#
in check c ch' addr'
We get this assembly:
check_ryG_info:
_cJC:
jmp _cJv
_cJJ:
leaq 1(%rdi),%rax
movq %rax,%rbx
movzbl (%rax),%eax
_nJS:
movq %rbx,%rdi
movq %rax,%rsi
_cJv:
testq %rsi,%rsi
je _cJB
_cJA:
cmpq %r14,%rsi
jne _cJJ
_cJK:
movl $GHC.Types.True_closure+2,%ebx
jmp *(%rbp)
_cJB:
movl $GHC.Types.False_closure+1,%ebx
jmp *(%rbp)
This code looks fine except for the register fixup block _nJS
inserted by the register allocator.
I assume this happens because the register allocator follows the branches in this order:
Steps to reproduce
Compiling this with -O2
Expected behavior
GHC should prioritize loops and try to allocate register for the looping control flow first.
This requires some understanding of loops. While this isn't trivial the code to detect loops already exists inside GHC (mkGlobalWeights
).
So it's "only" a matter of applying it to the code before we do register allocation instead of after.
Environment
- GHC version used: HEAD
Optional:
- Operating System:
- System Architecture: