Skip to content

GitLab

  • Projects
  • Groups
  • Snippets
  • Help
    • Loading...
  • Help
    • Help
    • Support
    • Community forum
    • Submit feedback
  • Sign in / Register
GHC
GHC
  • Project overview
    • Project overview
    • Details
    • Activity
    • Releases
  • Repository
    • Repository
    • Files
    • Commits
    • Branches
    • Tags
    • Contributors
    • Graph
    • Compare
    • Locked Files
  • Issues 4,393
    • Issues 4,393
    • List
    • Boards
    • Labels
    • Service Desk
    • Milestones
    • Iterations
  • Merge Requests 374
    • Merge Requests 374
  • Requirements
    • Requirements
    • List
  • CI / CD
    • CI / CD
    • Pipelines
    • Jobs
    • Schedules
    • Test Cases
  • Operations
    • Operations
    • Incidents
    • Environments
  • Analytics
    • Analytics
    • CI / CD
    • Code Review
    • Insights
    • Issue
    • Repository
    • Value Stream
  • Wiki
    • Wiki
  • Snippets
    • Snippets
  • Members
    • Members
  • Activity
  • Graph
  • Create a new issue
  • Jobs
  • Commits
  • Issue Boards
Collapse sidebar
  • Glasgow Haskell Compiler
  • GHCGHC
  • Issues
  • #19153

Closed
Open
Opened Jan 01, 2021 by Neo@nmichael44

Optimization idea for case expression dispatch

Motivation

Some C compilers implement an optimization that ghc could benefit from. A case statement like the following (where we have many different cases returning the same expression), is currently implemented in ghc as a series of compare and jump statements following the balanced binary tree based on the ints in the case statement (depending on how dense the range is ghc is smart enough to combine the binary tree with an array lookup part which is great). (In the example below the ints in question are not considered dense so we get a pure tree.)

opt :: Int -> Int
opt 0 = 10
opt 1 = 10
opt 2 = 10
opt 3 = 10  
opt 23 = 10
opt 44 = 10 
opt 60 = 10
opt 61 = 10
opt _ =  100

the ghc generated assembly is basically the tree search:

Opt.$wopt_info:
_c7is:
	cmpq $45,%r14
	jge _u7iB
_u7iv:
	cmpq $4,%r14
	jge _u7iz
_u7iw:
	cmpq $3,%r14
	jge _c7ik
_u7ix:
	testq %r14,%r14
	jge _c7ik
_c7ij:
	movl $100,%ebx
	jmp *(%rbp)
_u7iz:
	cmpq $44,%r14
	jge _c7ik
_u7iA:
	cmpq $23,%r14
	jne _c7ij
_c7ik:
	movl $10,%ebx
	jmp *(%rbp)
_u7iB:
	cmpq $61,%r14
	jge _u7iD
_u7iC:
	cmpq $60,%r14
	jl _c7ij
	jmp _c7ik
_u7iD:
	cmpq $62,%r14
	jl _c7ik
	jmp _c7ij
	.size Opt.$wopt_info, .-Opt.$wopt_info

Proposal

Here is the same code in C:

int opt(long c) {
  switch (c) {
    case 0: return 10;
    case 1: return 10;
    case 2: return 10;
    case 3: return 10;  
    case 23: return 10;
    case 44: return 10;
    case 60: return 10;
    case 61: return 10;
    default: return 100;
  }
}

and here is how a clever C compiler implements it:

static const long magicConstant = 0b1111000000000000000000010000000000000000000010000000000000001100;

int opt2(long c) {

  if (((unsigned long) c) > 61) // negative numbers also taken care of here
    return 100;                 // by the cast to unsigned (as unsigned they are big!)

  if (((magicConstant << c) < 0))
    return 10;
  else
    return 100;
}

This results in the much shorter assembly below:

opt2:
.LFB24:
	.cfi_startproc
	endbr64
	movl	$100, %eax
	cmpq	$61, %rdi
	ja	.L6
	movabsq	$-1152920405094694900, %rax
	movl	%edi, %ecx
	salq	%cl, %rax
	sarq	$63, %rax
	andl	$-90, %eax
	addl	$100, %eax
.L6:
	ret

The number of instructions executed is slightly less here (and always constant independent of the number of cases), but more importantly this is straight line code and would easily beat the binary tree jumpy code.

The idea is to construct a magic constant that when shifted left by c (our input) will cause the result to go negative (the most significant bit becomes 1) if and only if c equals one of the integers in the case statement. This optimization can only be done if the integers we are comparing against are small (in the region [0,63] -- the number of bits in one machine word).

This optimization may be applicable to case statements on ADT/GADTs since we rarely have ADTs with more than 64 cases.

Edited Jan 12, 2021 by Neo
Assignee
Assign to
None
Milestone
None
Assign milestone
Time tracking
None
Due date
None
Reference: ghc/ghc#19153