Skip to content

Bad register allocation for inner loop

Summary

This definition compiles suboptimally:


{-# language MagicHash #-}
{-# language BangPatterns #-}
{-# language ScopedTypeVariables #-}
-- {-# options -fregs-graph #-}


module M where

import GHC.CString
import GHC.Exts

    check :: Char# -> Char# -> Addr# -> Bool
    check c ch addr
      | isTrue# (ch `eqChar#` '\0'#) = False
      | isTrue# (ch `eqChar#` c    ) = True
      | True                         =
          let !addr' = (addr `plusAddr#` 1#)
              !ch' = indexCharOffAddr# addr' 0#
          in check c ch' addr'

We get this assembly:

check_ryG_info:
_cJC:
	jmp _cJv
_cJJ:
	leaq 1(%rdi),%rax
	movq %rax,%rbx
	movzbl (%rax),%eax
_nJS:
	movq %rbx,%rdi
	movq %rax,%rsi
_cJv:
	testq %rsi,%rsi
	je _cJB
_cJA:
	cmpq %r14,%rsi
	jne _cJJ
_cJK:
	movl $GHC.Types.True_closure+2,%ebx
	jmp *(%rbp)
_cJB:
	movl $GHC.Types.False_closure+1,%ebx
	jmp *(%rbp)

This code looks fine except for the register fixup block _nJS inserted by the register allocator.

I assume this happens because the register allocator follows the branches in this order:

image

Steps to reproduce

Compiling this with -O2

Expected behavior

GHC should prioritize loops and try to allocate register for the looping control flow first.

This requires some understanding of loops. While this isn't trivial the code to detect loops already exists inside GHC (mkGlobalWeights).

So it's "only" a matter of applying it to the code before we do register allocation instead of after.

Environment

  • GHC version used: HEAD

Optional:

  • Operating System:
  • System Architecture:
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information