Investigate the performance impact of code alignment
Maybe the performance of GHC-generated code also varies for reasons like the ones described here:
https://dendibakh.github.io/blog/2018/01/18/Code_alignment_issues
The gist of the article is that tight loops can have significantly different performance depending on whether the loop's machine instructions straddle a cache-line boundary.
I would not have expected this to make double-digit percentage differences.
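To make the condition concrete, here is a minimal sketch of the check the article is describing: given the start address and byte length of a loop body, does it span more than one cache line? The 64-byte line size and the function names are assumptions for illustration, not anything GHC or the article defines.

```python
# Assumption: 64-byte cache lines, as on most current x86-64 parts.
CACHE_LINE_BYTES = 64


def cache_lines_touched(start: int, size: int, line: int = CACHE_LINE_BYTES) -> int:
    """Return how many cache lines `size` bytes of code starting at `start` cover."""
    if size <= 0:
        return 0
    first = start // line                # index of the line holding the first byte
    last = (start + size - 1) // line    # index of the line holding the last byte
    return last - first + 1


def crosses_cache_line(start: int, size: int, line: int = CACHE_LINE_BYTES) -> bool:
    """True if the code span straddles at least one cache-line boundary."""
    return cache_lines_touched(start, size, line) > 1
```

For example, a 32-byte loop at `0x1000` sits inside one line, while the same loop at `0x1030` spills into the next one, so an unrelated change that shifts code by a few bytes can flip the result.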
From #ghc:
bgamari: nh2[m], there are nofib tests where this is very likely the cause of a good portion of the variance
angerman: nh2[m]: that linked LLVM talk from 2016 makes me not want to have to deal with that...
AndreasK: nh2[m]: It's a real issue. But atm I think ghc, at least in the native codegen, makes no real attempt to optimize for this
thoughtpolice: GHC does not carry knowledge of alignment or anything, no. I’m not sure how difficult this is to suss out, but at least making sure every branch target does not cross a cache line is probably a good start
thoughtpolice: Well, far jump, e.g. a call to a function. not sure how TNTC fits into this story, tbqh
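One possible reading of the "good start" suggested above, sketched under assumptions: before emitting a block that is a jump or call target, insert just enough padding that the block does not straddle a cache line. The function name, the 64-byte line size, and the policy for blocks larger than one line are all hypothetical. This is not how GHC's native code generator works today, and it says nothing about how TNTC info tables would interact with the padding.

```python
def padding_to_avoid_straddle(start: int, size: int, line: int = 64) -> int:
    """Bytes of padding to emit before a block at `start` so it does not
    straddle a cache-line boundary (assumed line size: 64 bytes)."""
    if size > line:
        # The block cannot fit in one line; align it to a boundary so it
        # touches as few lines as possible (an assumed policy).
        return (-start) % line
    offset = start % line
    if offset + size <= line:
        return 0                 # already fits in the current line
    return line - offset         # push the block to the next boundary
```

The cost is code size: a 32-byte block at `0x1030` needs 16 bytes of padding, which is why compilers usually apply this kind of alignment selectively, for example only to hot loop headers.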
Trac metadata
Trac field | Value |
---|---|
Version | 8.2.2 |
Type | Task |
TypeOfFailure | OtherFailure |
Priority | normal |
Resolution | Unresolved |
Component | Compiler (CodeGen) |
Test case | |
Differential revisions | |
BlockedBy | |
Related | |
Blocking | |
CC | nh2 |
Operating system | |
Architecture | |