Skip to content
GitLab
Projects Groups Snippets
  • /
  • Help
    • Help
    • Support
    • Community forum
    • Submit feedback
  • Sign in / Register
  • GHC GHC
  • Project information
    • Project information
    • Activity
    • Labels
    • Members
  • Repository
    • Repository
    • Files
    • Commits
    • Branches
    • Tags
    • Contributors
    • Graph
    • Compare
    • Locked Files
  • Issues 5,249
    • Issues 5,249
    • List
    • Boards
    • Service Desk
    • Milestones
    • Iterations
  • Merge requests 579
    • Merge requests 579
  • CI/CD
    • CI/CD
    • Pipelines
    • Jobs
    • Schedules
    • Test Cases
  • Deployments
    • Deployments
    • Releases
  • Analytics
    • Analytics
    • Value stream
    • CI/CD
    • Code review
    • Insights
    • Issue
    • Repository
  • Wiki
    • Wiki
  • Snippets
    • Snippets
  • Activity
  • Graph
  • Create a new issue
  • Jobs
  • Commits
  • Issue Boards
Collapse sidebar
  • Glasgow Haskell CompilerGlasgow Haskell Compiler
  • GHCGHC
  • Issues
  • #15124
Closed
Open
Issue created May 05, 2018 by Andreas Klebinger@AndreasKDeveloper

Improve block layout for the NCG

The current codegen tries to place two blocks A and B next to each other if they are "close".

Drawbacks of the current algorithm

Blocks are only considered close if one jumps to the other via a unconditional jump instructions. This is a fast and reasonable approximation. But we might be able to do better.

Calls are not considered even if they always return to the same block.

For example it's clear that if foo == bar we jump to C here:

A:
	movq $block_C_info,-16(%rbp)
        cmp foo,bar
	jne D
B:
	jmp *(%rbx) #Returns to C

        ...

.align 8
	.infotable
block_C_info:
C:
        doThings

Conditional jumps are never considered.

The following block either jumps directly to c3mQ or returns there from a call. Yet that is completely ignored by the current algorithm.

_c3mG:
	movq $block_c3mQ_info,-8(%rbp)
        ...
	jne _c3mQ
	jmp *(%rbx) #returns to c3mQ

Likelyhood of branches is not considered.

We track information about branches which are rarely taken at the cmm level. However this information is currently lost during the Cmm -> Asm transformation which happens before we do code layout.

This means for Cmm code where one branch is never executed the branch that's never executed might be the only one considered relevant.

This happens for example for this stack check:

    if ((Sp + -40) >= SpLim) (likely: True) goto c5PS; else goto c5Q3;

Potential gains

If nothing else this would allow some Cmm optimizations which are made useless by the current code layout algorithm. I have hit cases where the block layout algorithm turned optimizations into pessimizations of 2-5% by pulling obvious loops apart.

Given that the scenario that leads to loops being pulled apart doesn't seem that unlikely I assume some loops are also pulled apart like this in HEAD.

Cleaning this up ideally would also make it possible to move blocks which are not part of loops out of them.

Trac metadata
Trac field Value
Version 8.2.2
Type Task
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component Compiler (NCG)
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information
Assignee
Assign to
Time tracking