GHC issues
https://gitlab.haskell.org/ghc/ghc/-/issues
Updated: 2019-07-07T18:31:04Z

T9430 fails on ARM
https://gitlab.haskell.org/ghc/ghc/-/issues/11294
Updated: 2019-07-07T18:31:04Z
Author: Ben Gamari

The primops testcase for `timesWord2#` in `T9430` fails on ARM.
```
cd ./primops/should_run && ./T9430 </dev/null > T9430.run.stdout 2> T9430.run.stderr
Wrong exit code (expected 0 , actual 1 )
Stdout:
Stderr:
T9430: Error for timesWord2# : Expected 1 and 0 but got 0 and 0
CallStack (from ImplicitParams):
error, called at T9430.hs:55:22 in main:Main
```
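For reference, `timesWord2#` multiplies two machine words and returns the high and low words of the double-width product. The sketch below models that for a 32-bit word. The name `times_word2` and the sample inputs are illustrative assumptions; they are not taken from `T9430.hs`.

```python
WORD_BITS = 32  # assumed word size for a 32-bit ARM target


def times_word2(a, b, bits=WORD_BITS):
    """Model of a widening unsigned multiply: return (high, low) words."""
    mask = (1 << bits) - 1
    product = (a & mask) * (b & mask)  # Python integers never overflow
    return product >> bits, product & mask


# 2^16 * 2^16 = 2^32, which needs exactly one bit above the low word.
print(times_word2(1 << 16, 1 << 16))  # (1, 0)
```

A result of 0 and 0 for inputs like these would be consistent with the high word being dropped, although the ticket does not say which inputs failed.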
It looks like this is probably a 32-bit LLVM code generator bug, although I have yet to confirm this.

Milestone: 8.0.1
Assignee: Ben Gamari

Switch to LLVM 3.7
https://gitlab.haskell.org/ghc/ghc/-/issues/10953
Updated: 2019-07-07T18:32:55Z
Author: erikd

LLVM 3.6 is broken on AArch64/Arm64 and LLVM 3.7 was released in August.
I already have git master building with LLVM-3.7 on x86_64/linux and can test on numerous others.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | --------------- |
| Version | 7.11 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Compiler (LLVM) |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | bgamari |
| Operating system | |
| Architecture | |
</details>

Milestone: 8.0.1
Assignee: erikd

LLVM mangler doesn't mangle AVX instructions
https://gitlab.haskell.org/ghc/ghc/-/issues/10394
Updated: 2019-07-07T18:36:11Z
Author: dobenour

The LLVM mangler does not currently transform AVX instructions on x86-64 platforms, due to a missing `#include`. Also, it is significantly more complicated than necessary, because it splits the file into sections (no longer needed), and it is sensitive to the details of the whitespace in the assembly.
I have attached a modified mangler that I believe to be simpler and more robust. I have not tested it, though, as I do not have a recent enough version of LLVM on my machine.
I am marking this as "Runtime crash" because that is what would happen if the unchanged AVX instructions made their way into the executable.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | --------------- |
| Version | 7.11 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Compiler (LLVM) |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>

Milestone: 8.0.1

armhf : Validate fails during bindisttest configure
https://gitlab.haskell.org/ghc/ghc/-/issues/10234
Updated: 2019-07-07T18:36:58Z
Author: erikd

The failure is:
```
/usr/bin/install -c -m 644 libraries/prologue.txt "/home/erikd/ghc-git/bindisttest/install dir/share/doc/ghc/html/libraries/"
/usr/bin/install -c -m 755 libraries/gen_contents_index "/home/erikd/ghc-git/bindisttest/install dir/share/doc/ghc/html/libraries/"
bindisttest/"install dir"/bin/runghc bindisttest/HelloWorld > bindisttest/output
diff -U 1 bindisttest/output bindisttest/expected_output
bindisttest/"install dir"/bin/ghc --make bindisttest/HelloWorld
[1 of 1] Compiling Main ( bindisttest/HelloWorld.lhs, bindisttest/HelloWorld.o )
<no location info>:
Warning: Couldn't figure out LLVM version!
Make sure you have installed LLVM
ghc: could not execute: opt
```
Since this is armhf, GHC was built with LLVM, and the configure process found the correct versions of the LLVM tools, i.e. `/usr/bin/llc-3.6` and `/usr/bin/opt-3.6`. Furthermore, the file `inplace/lib/settings` specifies the correct versions.
However, the file `bindisttest/install dir/lib/ghc-7.11.20150402/settings` just uses `llc` and `opt`, which don't actually exist.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | ------------ |
| Version | 7.11 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Build System |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>

Milestone: 8.0.1

Find versioned versions of LLVM tools
https://gitlab.haskell.org/ghc/ghc/-/issues/10170
Updated: 2019-07-07T18:37:15Z
Author: erikd

Since it has become more widely known that the LLVM developers often make breaking changes to the IR language between releases, some Linux distributions (e.g. Debian and hence Ubuntu) have started doing multi-version LLVM installs.
For instance on Debian, the llvm-3.6 programs are installed in `/usr/lib/llvm-3.6/bin` as `llc` and `opt`.
I propose that the `configure.ac` and `aclocal.m4` configuration goop be modified to look for `llc` and `opt` in these versioned `/usr/lib/llvm-X.Y` locations. I'm also willing to do the hard yards on implementing this if people think it's a good idea.
This can be seen as a step along the way to more stringent LLVM versioning requirements.
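The proposed lookup can be sketched outside of autoconf. The following is only an illustration in Python: the function name `find_llvm_tool`, its parameters, and the search order are assumptions made for this sketch, not the behaviour of GHC's `configure`.

```python
import os
import shutil


def find_llvm_tool(name, version, lib_root="/usr/lib", path=None):
    """Return a path to `name` for LLVM `version`, or None if nothing is found.

    Assumed search order (hypothetical, for illustration only):
      1. <lib_root>/llvm-<version>/bin/<name>  (Debian-style versioned directory)
      2. <name>-<version> on PATH              (for example llc-3.6)
      3. <name> on PATH                        (unversioned fallback)
    """
    candidate = os.path.join(lib_root, f"llvm-{version}", "bin", name)
    if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
        return candidate
    return (shutil.which(f"{name}-{version}", path=path)
            or shutil.which(name, path=path))


for tool in ("llc", "opt"):
    # The result depends on what is installed on the machine running this.
    print(tool, "->", find_llvm_tool(tool, "3.6"))
```

Preferring the versioned directory over the bare name means an unrelated `llc` earlier on `PATH` does not shadow the release that was asked for.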
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | --------------- |
| Version | 7.11 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Compiler (LLVM) |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>

Milestone: 8.0.1
Assignee: erikd

implement more arithmetic operations natively in the LLVM backend
https://gitlab.haskell.org/ghc/ghc/-/issues/9430
Updated: 2019-07-07T18:40:26Z
Author: rwbarton

There are a number of arithmetic operations that have native implementations on x86 but use the generic fallback on LLVM. Implementing these with LLVM intrinsics could improve small Integer performance on ARM substantially!
```
MO_Add2 @llvm.uadd.with.overflow.*
MO_AddIntC @llvm.sadd.with.overflow.*
MO_SubIntC @llvm.ssub.with.overflow.*
MO_U_Mul2 mul i64/i128?
MO_U_QuotRem2 udiv i64/i128?
```
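As a reminder of what these operations compute, here is a minimal model in Python, assuming the usual semantics: `MO_Add2` yields a carry and a sum, `MO_AddIntC` and `MO_SubIntC` yield a wrapped result plus an overflow flag, and `MO_U_QuotRem2` divides a two-word dividend by a one-word divisor. The function names are hypothetical, and the signed helpers assume their inputs already fit in a signed word.

```python
def add2(a, b, bits=64):
    """Unsigned add returning (carry, sum)."""
    mask = (1 << bits) - 1
    total = (a & mask) + (b & mask)
    return total >> bits, total & mask


def _to_signed(x, bits):
    """Reinterpret the low `bits` bits of x as a two's-complement integer."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x


def add_int_c(a, b, bits=64):
    """Signed add returning (wrapped result, overflow flag)."""
    exact = a + b
    wrapped = _to_signed(exact, bits)
    return wrapped, int(wrapped != exact)


def sub_int_c(a, b, bits=64):
    """Signed subtract returning (wrapped result, overflow flag)."""
    exact = a - b
    wrapped = _to_signed(exact, bits)
    return wrapped, int(wrapped != exact)


def quot_rem2(hi, lo, d, bits=64):
    """Divide the double word (hi, lo) by d; assumes hi < d so the quotient fits in one word."""
    return divmod((hi << bits) | lo, d)


print(add2(2**64 - 1, 1))       # (1, 0)
print(add_int_c(2**63 - 1, 1))  # (-9223372036854775808, 1)
```

The overflow flag is the part that the `@llvm.*.with.overflow.*` intrinsics return as a second result, which is why they look like a natural fit for the first three rows.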
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | --------------- |
| Version | 7.9 |
| Type | FeatureRequest |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Compiler (LLVM) |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>

Milestone: 8.0.1
Assignee: Michal Terepeta

Cross compilation support for LLVM backend
https://gitlab.haskell.org/ghc/ghc/-/issues/7610
Updated: 2019-07-07T18:49:00Z
Author: dterei

Top level bug to track supporting cross compilation in LLVM backend.
Mostly this shouldn't be too bad but I haven't tried it and know of at least a few significant issues.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | --------------- |
| Version | 7.6.1 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Compiler (LLVM) |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | #7608 |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>

Milestone: 8.0.1

LLVM only handles a hard-coded list of triples.
https://gitlab.haskell.org/ghc/ghc/-/issues/7608
Updated: 2019-07-07T18:49:00Z
Author: singpolyma

The LLVM backend simply has a hard-coded list of triples for supported platforms in `compiler/llvmGen/LlvmCodeGen/Ppr.hs` :: `moduleLayout`.
Apparently this information can potentially be sourced by configure / autotools instead. This may be a better way forward rather than adding code for each platform.

Milestone: 8.0.1

LLVM incorrectly hoisting loads
https://gitlab.haskell.org/ghc/ghc/-/issues/7297
Updated: 2019-07-07T18:50:28Z
Author: dterei

Test 367_letnoescape fails under LLVM because a load of the `HpLim` register is hoisted out of the loop, so yielding is never done.
What I am not sure about right now is the best way to fix it. Loads in LLVM can be annotated in a few different ways to fix this, and I am not sure which one is the most 'correct'.
All the following work:
- mark the load as volatile. (seems to give nicest code as well)
- mark the load as atomic with either monotonic or seq_cst ordering.
- mark the load as both volatile and atomic.
This bug, while only affecting a single test case, seems very serious and potentially indicative of a larger problem. How well are we communicating the threaded load/store semantics to LLVM?
And what semantics do we need to communicate? I think we are fine other than the STG registers...
So I am filing a bug for now, as I don't yet know the best way to proceed without dedicating some time to reading the LLVM docs and probably talking to the LLVM devs, since the docs on the memory model are fairly confusing.
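The effect of the hoisting can be reproduced with a small simulation. This is only a sketch in Python: `run_loop`, the `regs` dictionary, and the `yield_after` parameter are hypothetical stand-ins, with a write to `regs["HpLim"]` playing the role of another thread asking this one to yield.

```python
def run_loop(regs, reload_each_iteration, yield_after, max_iters=1000):
    """Spin until HpLim reads as zero; return the iteration count, or None if we gave up."""
    hp_lim = regs["HpLim"]  # the single load that a hoisting optimiser keeps
    for i in range(max_iters):
        if i == yield_after:
            regs["HpLim"] = 0  # stands in for another thread requesting a yield
        if reload_each_iteration:
            hp_lim = regs["HpLim"]  # the load stays inside the loop
        if hp_lim == 0:
            return i
    return None  # never observed the update


print(run_loop({"HpLim": 1}, reload_each_iteration=True, yield_after=5))   # 5
print(run_loop({"HpLim": 1}, reload_each_iteration=False, yield_after=5))  # None
```

The second call only terminates because of the `max_iters` cap, which mirrors the bad listing: the value tested in the loop is the one read before the loop started.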
e.g., Code in question:
Bad version (LBB0_1 loops forever as load hoisted out):
```
r1Uf_info: # @r1Uf_info
# BB#0: # %c1Vy
movq 144(%r13), %rax
decq %r14
.align 16, 0x90
.LBB0_1: # %tailrecurse
# =>This Inner Loop Header: Depth=1
incq %r14
testq %rax, %rax
jne .LBB0_1
# BB#2: # %c1VD
movq -8(%r13), %rax
movl $r1Uf_closure, %ebx
jmpq *%rax # TAILCALL
```
Code when marked with atomic (either monotonic or seq_cst) or both atomic and volatile:
```
r1Uf_info: # @r1Uf_info
# BB#0: # %c1Vy
decq %r14
.align 16, 0x90
.LBB0_1: # %tailrecurse
# =>This Inner Loop Header: Depth=1
incq %r14
movq 144(%r13), %rax
testq %rax, %rax
jne .LBB0_1
# BB#2: # %c1VD
movq -8(%r13), %rax
movl $r1Uf_closure, %ebx
jmpq *%rax # TAILCALL
```
Code when marked volatile:
```
r1Uf_info: # @r1Uf_info
# BB#0: # %c1Vy
decq %r14
.align 16, 0x90
.LBB0_1: # %tailrecurse
# =>This Inner Loop Header: Depth=1
incq %r14
cmpq $0, 144(%r13)
jne .LBB0_1
# BB#2: # %c1VD
movq -8(%r13), %rax
movl $r1Uf_closure, %ebx
jmpq *%rax # TAILCALL
```

Milestone: 8.0.1
Assignee: dterei

Dynamic way fails when GHC built with LLVM backend
https://gitlab.haskell.org/ghc/ghc/-/issues/5786
Updated: 2022-01-20T15:29:19Z
Author: dterei

If I build GHC with the LLVM backend then the 'Dyn' testsuite way fails completely (tested on x64).
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | --------------- |
| Version | 7.4.1-rc1 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Compiler (LLVM) |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | #5757 |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>

Milestone: 8.0.1
Assignee: dterei

LLVM: Improve alias analysis / performance
https://gitlab.haskell.org/ghc/ghc/-/issues/5567
Updated: 2022-07-13T21:11:11Z
Author: dterei

- LLVM doesn't generate code as good as we feel it should in many situations.
  - Why?
    - We've often felt it's an alias analysis issue.
    - I'm a little more doubtful of that than others (I feel it's part of the bigger problem, not the whole thing).
    - I think there may be some register allocation / instruction selection / live range splitting issue going on.
- We could also do with looking at what optimisation passes we should run and in what order...
Here is some work Max did on the alias issue; his results for nofib weren't good:
http://blog.omega-prime.co.uk/?p=135
So this ticket is just a high-level ticket about figuring out and improving the performance of the LLVM backend.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | --------------- |
| Version | |
| Type | Task |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Compiler (LLVM) |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>

Milestone: 8.0.1
Assignee: dterei

LLVM compiles Updates.cmm badly
https://gitlab.haskell.org/ghc/ghc/-/issues/4308
Updated: 2019-07-07T18:59:36Z
Author: dterei

Simon M. reported that compiling rts/Updates.cmm on x86-64 with the LLVM backend produces some pretty bad code. The NCG produces this:
```
stg_upd_frame_info:
.Lco:
movq 8(%rbp),%rax
addq $16,%rbp
movq %rbx,8(%rax)
movq $stg_BLACKHOLE_info,0(%rax)
movq %rax,%rcx
andq $-1048576,%rcx
movq %rax,%rdx
andq $1044480,%rdx
shrq $6,%rdx
orq %rcx,%rdx
cmpw $0,52(%rdx)
jne .Lcf
jmp *0(%rbp)
.Lcf:
[...]
```
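The address arithmetic in the middle of this listing can be followed in isolation. A minimal sketch in Python, using only the immediates that appear in the assembly; the constant and function names are made up for illustration, and the ticket does not say what the resulting address is used for.

```python
MBLOCK_MASK = (1 << 20) - 1  # 0xFFFFF; andq $-1048576 keeps everything above these bits
BLOCK_BITS = 0xFF000         # 1044480, the immediate of the second andq
SHIFT = 6                    # shrq $6


def descriptor_address(p):
    """Mirror the and / and / shift / or sequence applied to the pointer in %rax."""
    base = p & ~MBLOCK_MASK            # andq $-1048576, %rcx
    index = (p & BLOCK_BITS) >> SHIFT  # andq $1044480, %rdx ; shrq $6, %rdx
    return base | index                # orq %rcx, %rdx


p = 0x7F0012345678  # arbitrary example pointer
addr = descriptor_address(p)
# The two parts share no bits, so adding them gives the same address as or-ing them.
assert addr == (p & ~MBLOCK_MASK) + ((p & BLOCK_BITS) >> SHIFT)
print(hex(addr))  # 0x7f0012301140
```

This is why the `addq %rcx, %rax` in the LLVM listing that follows is equivalent to the `orq` here: the shifted index is always smaller than 2^20, so it never overlaps the masked base.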
The LLVM backend produces this though:
```
stg_upd_frame_info: # @stg_upd_frame_info
# BB#0: # %co
subq $104, %rsp
movq 8(%rbp), %rax
movq %rax, 24(%rsp) # 8-byte Spill
movq %rbx, 8(%rax)
mfence
movq $stg_BLACKHOLE_info, (%rax)
movq %rax, %rcx
andq $-1048576, %rcx # imm = 0xFFFFFFFFFFF00000
andq $1044480, %rax # imm = 0xFF000
shrq $6, %rax
addq %rcx, %rax
addq $16, %rbp
cmpw $0, 52(%rax)
movsd %xmm6, 88(%rsp) # 8-byte Spill
movsd %xmm5, 80(%rsp) # 8-byte Spill
movss %xmm4, 76(%rsp) # 4-byte Spill
movss %xmm3, 72(%rsp) # 4-byte Spill
movss %xmm2, 68(%rsp) # 4-byte Spill
movss %xmm1, 64(%rsp) # 4-byte Spill
movq %r9, 56(%rsp) # 8-byte Spill
movq %r8, 48(%rsp) # 8-byte Spill
movq %rdi, 40(%rsp) # 8-byte Spill
movq %rsi, 32(%rsp) # 8-byte Spill
je .LBB1_4
```
This has two main problems:
1. The mfence instruction (write barrier) isn't required. (Write-write barriers aren't required on x86.)
2. The LLVM backend is spilling a lot of stuff unnecessarily.
Both of these, I think, are fairly easy fixes. LLVM is handling write barriers quite naively at the moment, so 1. is easy. The spilling problem, I think, is related to a previous fix I made where I need to explicitly kill some of the STG registers if they aren't live across the call; otherwise LLVM rightly thinks they are live, since I always pass the STG registers around (so they are live on entry and exit of every function unless I kill them).

Milestone: 8.0.1

LLVM: Add support for TNTC to LLVM compiler suite
https://gitlab.haskell.org/ghc/ghc/-/issues/4213
Updated: 2019-07-07T19:00:05Z
Author: dterei

At the moment we handle TNTC (tables next to code) in the LLVM backend in two different ways:
Linux/Windows: We use the GNU As subsections feature to order sections. Works very nicely. Slight hack in that we create special section names that contain comments. (Asm injection)
Mac: Mac assembler doesn't support the GNU As subsections feature, so we post-process the assembly code produced by llc.
Both these methods (especially Mac) are hacks. It would be better to extend LLVM to support the TNTC feature.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | --------------- |
| Version | 6.13 |
| Type | FeatureRequest |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Compiler (LLVM) |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>

Milestone: 8.0.1
Assignee: Ben Gamari

LLVM: Stack alignment on OSX
https://gitlab.haskell.org/ghc/ghc/-/issues/4211
Updated: 2019-07-07T19:00:05Z
Author: dterei

On OSX the ABI requires that the stack is 16-byte aligned when making function calls. (Although this only really needs to be obeyed when making calls that will go through the dynamic linker, so FFI calls.) Since the stack is 16-byte aligned at the site of the call, on entry to a function most compilers (both LLVM and GCC) expect the stack to be aligned to 16n - 4, since 4 bytes should have been pushed for the return address as part of the call instruction. GHC, though, since it uses jumps everywhere, keeps the stack 16-byte aligned on function entry. This means that LLVM generates incorrect stack alignment code, always off by 4.
For the moment I have handled this by using the LLVM Mangler (which is already needed, on OS X only, for TNTC) to fix up the stack alignment code.
E.g., asm generated by LLVM:
```
_func:
subl $12, %esp
...
call _sin
...
addl $12, %esp
```
The mangler will change this to:
```
_func:
subl $16, %esp
...
call _sin
...
addl $16, %esp
```
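The arithmetic behind the off-by-4 can be checked directly. A minimal sketch in Python, assuming the fix-up simply adds the 4-byte return address to the frame size, as the 12 to 16 change above suggests; the function names and the example stack address are hypothetical.

```python
ALIGN = 16
RETURN_ADDR = 4  # bytes pushed by a 32-bit x86 call instruction


def esp_at_call(entry_esp, frame_size):
    """Stack pointer at an inner call site, after `subl $frame_size, %esp`."""
    return entry_esp - frame_size


def mangle_frame_size(frame_size):
    """Adjust a frame size computed for a 16n - 4 entry so it suits a 16n entry."""
    return frame_size + RETURN_ADDR


c_entry = 0x1000 - RETURN_ADDR  # function entered via call: 16n - 4
ghc_entry = 0x1000              # function entered via jump: 16n

print(esp_at_call(c_entry, 12) % ALIGN)                       # 0, aligned
print(esp_at_call(ghc_entry, 12) % ALIGN)                     # 4, misaligned
print(esp_at_call(ghc_entry, mangle_frame_size(12)) % ALIGN)  # 0, aligned again
```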
The better solution would be to change GHC to keep the stack at 16n - 4 alignment on function entry. This will require changing the RTS (`StgCRun.c`) to set up the stack properly before calling into STG land, and also fixing up the NCG to align code properly. There may also be a problem with the C backend, as currently all function prologue and epilogue code is stripped out, which means all the stack manipulation code generated by GCC is removed. This works fine now since the stack is already 16-byte aligned on entry, but if it becomes 16n - 4 byte aligned then some stack manipulation will be required.
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | --------------- |
| Version | 6.13 |
| Type | Task |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Compiler (LLVM) |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>

Milestone: 8.0.1
Assignee: dterei