
Investigate the performance impact of code alignment

GHC's performance may also vary for reasons like the ones described in this article:

https://dendibakh.github.io/blog/2018/01/18/Code_alignment_issues

The gist of the article is that tight loops can perform significantly differently depending on whether the loop's machine instructions themselves cross a cache line boundary.
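
To make "crossing a cache line" concrete, here is a minimal sketch that checks whether a range of code bytes straddles a line boundary. The 64-byte line size is an assumption (typical for current x86-64 parts), and the function names and addresses are hypothetical, not taken from the article or from GHC.

```python
CACHE_LINE_BYTES = 64  # assumption: typical x86-64 cache line size


def cache_lines_touched(start: int, size: int, line: int = CACHE_LINE_BYTES) -> int:
    """Return how many cache lines the byte range [start, start + size) covers."""
    if size <= 0:
        return 0
    first = start // line               # index of the line holding the first byte
    last = (start + size - 1) // line   # index of the line holding the last byte
    return last - first + 1


def crosses_cache_line(start: int, size: int, line: int = CACHE_LINE_BYTES) -> bool:
    """True if the code range spans more than one cache line."""
    return cache_lines_touched(start, size, line) > 1


# Hypothetical 24-byte loop body placed at two different addresses.
aligned_loop = crosses_cache_line(0x1000, 24)    # starts on a line boundary
straddling_loop = crosses_cache_line(0x1030, 24)  # starts 48 bytes into a line
```

The same 24 bytes of loop body fit in one line at the first address and spill into a second line at the other, which is the situation the article measures.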

I would not have expected this to make double-digit percentage differences.

From #ghc:

bgamari:       nh2[m], there are nofib tests where this is very likely the cause of a good portion of the variance
angerman:      nh2[m]: that linked LLVM talk from 2016 makes me not want to have to deal with that...
AndreasK:      nh2[m]: It's a real issue. But atm I think ghc, at least in the native codegen, makes no real attempt to optimize for this
thoughtpolice: GHC does not carry knowledge of alignment or anything, no. I’m not sure how difficult this is to suss out, but at least making sure every branch target does not cross a cache line is probably a good start
thoughtpolice: Well, far jump, e.g. a call to a function. not sure how TNTC fits into this story, tbqh
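
One possible reading of the "branch target does not cross a cache line" suggestion, sketched below: before emitting a block, insert just enough padding that the block does not straddle a line when it could fit inside one. This is only an illustration under assumptions (64-byte lines, hypothetical block sizes and function names); it does not describe how GHC's native code generator lays out code.

```python
def padding_before_block(offset: int, size: int, line: int = 64) -> int:
    """Padding bytes to insert at `offset` so a block of `size` bytes
    does not straddle a cache line (when it can fit in one at all)."""
    used = offset % line
    if used + size <= line:
        return 0                      # block already fits in the current line
    return (line - used) % line       # otherwise start at the next line boundary


def layout(block_sizes, line: int = 64):
    """Assign a start offset to each block, padding where needed."""
    offsets = []
    cursor = 0
    for size in block_sizes:
        cursor += padding_before_block(cursor, size, line)
        offsets.append(cursor)
        cursor += size
    return offsets


# Hypothetical blocks of 40, 30 and 10 bytes.
offsets = layout([40, 30, 10])
```

In this example the second block is pushed from offset 40 to offset 64, costing 24 bytes of padding. That code-size cost is the trade-off any such scheme would have to weigh.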

Trac metadata

Trac field: Value
Version: 8.2.2
Type: Task
TypeOfFailure: OtherFailure
Priority: normal
Resolution: Unresolved
Component: Compiler (CodeGen)
Test case:
Differential revisions:
BlockedBy:
Related:
Blocking:
CC: nh2
Operating system:
Architecture: