GHC issues
https://gitlab.haskell.org/ghc/ghc/-/issues
2022-12-24T18:57:17Z

Overhaul the linker
https://gitlab.haskell.org/ghc/ghc/-/issues/13678
2022-12-24T18:57:17Z
Moritz Angermann

The linker has gained support for aarch64/macho and aarch64/elf, as well as improved arm/elf code. This code introduces separate mapping of sections into different pages, to allow for `W^X` protection. This is essentially a requirement on iOS, and I believe a sane default to have. To this end, the symbol extras logic needed to be split up into GOT and PLTs, where the PLTs are allocated alongside the sections, as the architecture may not allow arbitrary jumps (arm/aarch64). The introduction of aarch64/elf has also been done through a pseudo interface so that the actual relocation logic can be contained within its own files.

Ideally we'd improve the linker by migrating the existing (heavily `#ifdef`'d) code to the new separate section mapping (`W^X`) and pseudo interface approach. The separate section mapping logic has an inherent memory inefficiency (which might be improved a bit by aggregating similar sections into the same page), and might also cause a performance penalty due to separate mappings instead of a single file mmap.
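To make the memory-inefficiency point concrete, here is a minimal sketch that counts pages under the two strategies. It is a toy model and not RTS code: the `Prot`, `Section`, `separatePages` and `aggregatedPages` names, the 4096-byte page size, and the section sizes are all assumptions made up for illustration, and alignment padding between sections is ignored.

```haskell
-- Hypothetical model of section mapping; names and sizes are made up for
-- illustration and do not come from the GHC RTS.

-- Page protections permitted under W^X: no variant is both writable and
-- executable.
data Prot = ReadExec | ReadOnly | ReadWrite
  deriving (Eq, Show, Enum, Bounded)

data Section = Section
  { secName :: String
  , secProt :: Prot
  , secSize :: Int   -- size in bytes
  }

-- Assumed page size; real targets differ.
pageSize :: Int
pageSize = 4096

-- Number of whole pages needed to hold n bytes.
pagesFor :: Int -> Int
pagesFor n = (n + pageSize - 1) `div` pageSize

-- Strategy 1: every section gets its own page-aligned mapping.
separatePages :: [Section] -> Int
separatePages = sum . map (pagesFor . secSize)

-- Strategy 2: sections with the same protection share one mapping.
aggregatedPages :: [Section] -> Int
aggregatedPages secs =
  sum [ pagesFor (sum [ secSize s | s <- secs, secProt s == p ])
      | p <- [minBound .. maxBound] ]

-- Made-up object file layout.
exampleSections :: [Section]
exampleSections =
  [ Section ".text"         ReadExec  5000
  , Section ".text.startup" ReadExec  200
  , Section ".rodata"       ReadOnly  300
  , Section ".data"         ReadWrite 100
  , Section ".bss"          ReadWrite 64
  ]
```

With these made-up sizes, `separatePages exampleSections` evaluates to 6 and `aggregatedPages exampleSections` to 4, which is the kind of saving the aggregation idea is after.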
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | ------------------ |
| Version | 8.3 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | lowest |
| Resolution | Unresolved |
| Component | Compiler (Linking) |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>

Investigate removing libGCC symbols from RtsSymbols.c
https://gitlab.haskell.org/ghc/ghc/-/issues/12486
2022-08-11T18:39:43Z
Tamar Christina

The RTS is currently re-exporting symbols from the C compiler runtime in `RtsSymbols.c`.
How this works is that the runtime linker declares that it can provide these symbols because it has been linked against the C compiler library itself. In essence it's providing pointers into its own symbol table.
This is fine but has two downsides:
1) We have to keep adding symbols to export any time the underlying C compiler changes things or someone needs a new symbol from the library.
2) User code that links explicitly against these libraries will probably generate a duplicate symbols error if it needs a symbol that we have not yet exported but that lives in the same object file as, or is a dependency of, a symbol we have exported.
One solution would be to add `libgcc_s` to the dependencies of `ghc-prim` which is the package that seems to require them.
This has two issues. First, `GCC_S` doesn't exist for LLVM, so we need to somehow know which compiler we're compiling for.
Second, on Windows `GCC_S` is an import library with a non-standard name (`.a` instead of `.dll.a`), and we currently cannot recognize it as such. We'd try to load it as a normal archive and end up trying to execute ASCII as code.
This task is to find a way to remove the need to export these symbols yet still work with both GCC and LLVM.
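The second downside can be illustrated with a small sketch of a symbol table that rejects redefinitions. This is not how the RTS linker is implemented; `rtsExports`, `ObjectFile`, `libMember`, `loadObject` and the `helper_a`/`helper_b` symbol names are all hypothetical, chosen only to show why pulling in an archive member for one symbol can collide with a symbol that is already re-exported.

```haskell
import Data.List (intersect)

type Symbol = String

-- Hypothetical stand-in for the symbols re-exported via RtsSymbols.c.
rtsExports :: [Symbol]
rtsExports = ["helper_a"]

data ObjectFile = ObjectFile
  { objName    :: FilePath
  , objDefines :: [Symbol]
  }

-- Hypothetical archive member: it defines a symbol the RTS already exports
-- (helper_a) next to one it does not (helper_b).
libMember :: ObjectFile
libMember = ObjectFile "member.o" ["helper_a", "helper_b"]

-- Add an object's definitions to the symbol table, refusing redefinitions.
loadObject :: [Symbol] -> ObjectFile -> Either String [Symbol]
loadObject table obj =
  case objDefines obj `intersect` table of
    []   -> Right (table ++ objDefines obj)
    dups -> Left ("duplicate symbols in " ++ objName obj ++ ": " ++ unwords dups)
```

In this model, user code that needs only `helper_b` still has to load `member.o`, and `loadObject rtsExports libMember` fails on `helper_a`.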
<details><summary>Trac metadata</summary>
| Trac field | Value |
| ---------------------- | ----------------------- |
| Version | 8.0.1 |
| Type | Task |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Runtime System (Linker) |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture | |
</details>

Support loading/unloading profiled objects from a profiled executable
https://gitlab.haskell.org/ghc/ghc/-/issues/7746
2019-07-07T18:48:19Z
Edward Z. Yang

This is closely related to #3360, but it is a bit less ambitious and should be possible to implement without too many extra changes to the byte code compiler and interpreter (e.g. we just have to teach the linker how to handle things). Here is a simple test program off of 'plugins' to get working:
```
{-# LANGUAGE ScopedTypeVariables #-}
import System.Plugins.Make
import System.Plugins.Load
import Data.List

boot :: FilePath -> IO ()
boot path = do
  -- Compile the source file with profiling enabled.
  r <- make path ["-prof"]
  case r of
    MakeSuccess _ p -> do
      -- Load the resulting object and look up the symbol "result".
      r' <- load p [] [] "result"
      case r' of
        LoadSuccess _ (v :: Int) -> print v
        LoadFailure msg -> print msg
    MakeFailure es -> putStrLn ("Failed: " ++ intercalate " " es)

main = do
  boot "Foo.hs"
```
where Foo.hs is
```
module Foo where
result = 2 :: Int
```
Here are the things that, as far as I can tell, need to be handled:
- We should ensure consistency between the host and the object file being loaded. For example, if you load an un-profiled object file into a profiled binary, GHC will eat all your puppies. A simple way to do this is to look for a symbol (e.g. CC_LIST) which is only ever exported when something is profiled, and barf if it is encountered.
- The current code fails with ``test: Foo.o: unknown symbol `CC_LIST'``, much the same way GHCi used to fail. This particular problem is (I think) that we don’t store CC_LIST and other extern symbols in our global hash table, so the linker thinks that they don’t exist, when they do. CC_LIST and friends should be special-cased or added to the table. I have a patch for that; it's pretty easy.
- We don’t run ctors which set up CC_LIST with all of the cost-centres from the loaded module; we need to teach the linker to do that. The relevant bug is #5435.
- We need to come up with some sensible way of unloading cost-centres from CC_LIST and friends; we could make CC_LIST doubly-linked and then just excise the cost-centre in a destructor, but freeing the actual allocated CostCentre is more difficult. For now, we might just live with the memory leak, but see wiki:"Commentary/ResourceLimits\#Memoryleaks" for a possible better implementation strategy. Whatever cleanup is done here should be registered as a destructor for the library. Maybe #8039 solves this problem.
- Tests!
But that’s it; everything else should work normally. Something similar should apply to ticky builds. Sans destructors, there is a good chance this shindig may already work for dynamically linked apps.
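Two of the items above, the host/object consistency check and the register/unregister lifecycle for cost-centres, can be sketched as a pure model. This is only an illustration under stated assumptions: the real work would live in the RTS linker in C, and `Way`, `objectWay`, `checkWay`, `CostCentre`, `registerCCs` and `unregisterCCs` are hypothetical names. Treating a reference to `CC_LIST` as the sign of a profiled object follows the heuristic suggested in the list and may not hold for every object.

```haskell
-- Hypothetical model; none of these names exist in the GHC RTS.

data Way = Vanilla | Profiled
  deriving (Eq, Show)

-- Symbol assumed to appear only in profiled objects.
profilingMarker :: String
profilingMarker = "CC_LIST"

-- Guess how an object was built from the symbols it mentions.
objectWay :: [String] -> Way
objectWay syms
  | profilingMarker `elem` syms = Profiled
  | otherwise                   = Vanilla

-- Refuse to load an object whose way differs from the host's.
checkWay :: Way -> [String] -> Either String ()
checkWay host syms
  | objectWay syms == host = Right ()
  | otherwise =
      Left ("way mismatch: host is " ++ show host
            ++ ", object looks " ++ show (objectWay syms))

-- A cost-centre tagged with the object that registered it.
data CostCentre = CostCentre
  { ccLabel :: String
  , ccOwner :: FilePath
  } deriving (Eq, Show)

-- What a constructor would do on load: push the object's cost-centres.
registerCCs :: FilePath -> [String] -> [CostCentre] -> [CostCentre]
registerCCs obj labels ccList = [ CostCentre l obj | l <- labels ] ++ ccList

-- What a destructor would do on unload: drop everything the object owns.
unregisterCCs :: FilePath -> [CostCentre] -> [CostCentre]
unregisterCCs obj = filter ((/= obj) . ccOwner)
```

In this model `checkWay Profiled ["Foo_result_closure"]` is a `Left`, and registering then unregistering the cost-centres of `"Foo.o"` returns the original list. Tagging each entry with its owner is one way to make excision simple; the doubly-linked list mentioned above would serve the same purpose in C.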