Tag GHC-mangled symbol names with a prefix (e.g. `_H`)
Object files produced by GHC contain symbols mangled with GHC's Z-encoding, but there's no easy way to tell whether a given symbol has been Z-encoded or not.
Some languages/compilers make this a bit easier by tagging their mangled symbols with a prefix.
This, of course, isn't perfect, because you can always write a C file that defines a symbol which happens to have one of these prefixes.
For the purposes of this issue, I'll assume C++ means GCC or Clang (as other mangling schemes have existed).
language | prefix |
---|---|
C++ | Just "_Z" |
Rust | Just "_R" |
Dlang | Just "_D" |
Ada | Nothing (although maybe "_ada_" for "library level subprograms") |
So, how do tools which support demangling (readelf/nm/gdb?) handle this?
Well, GNU binutils lets you specify `demangle={auto|<language>}`, so we can currently add a Haskell demangler to binutils which works when `language=haskell`, but not for `auto` (as that would be too fragile, decoding symbols that weren't produced by GHC). I've started the process of committing this kind of demangler upstream.
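
To illustrate the fragility, here is a minimal sketch of a Z-decoder in Python. It only handles the two-character escapes (tuple forms such as `Z2T` and hex escapes are left out), and the names `Z_ESCAPES` and `z_decode` are made up for this example; it is not the demangler being proposed for binutils.

```python
# Two-character Z-encoding escapes used by GHC for operator characters.
# Assumption: this sketch deliberately ignores tuple forms (e.g. Z2T)
# and hex escapes (e.g. z3bbU), so it is not a complete decoder.
Z_ESCAPES = {
    "ZL": "(", "ZR": ")", "ZM": "[", "ZN": "]", "ZC": ":", "ZZ": "Z",
    "zz": "z", "za": "&", "zb": "|", "zc": "^", "zd": "$", "ze": "=",
    "zg": ">", "zh": "#", "zi": ".", "zl": "<", "zm": "-", "zn": "!",
    "zp": "+", "zq": "'", "zr": "\\", "zs": "/", "zt": "*", "zu": "_",
    "zv": "%",
}


def z_decode(symbol: str) -> str:
    """Replace every known two-character escape, scanning left to right."""
    out = []
    i = 0
    while i < len(symbol):
        pair = symbol[i:i + 2]
        if pair in Z_ESCAPES:
            out.append(Z_ESCAPES[pair])
            i += 2
        else:
            out.append(symbol[i])
            i += 1
    return "".join(out)


# A symbol produced by GHC decodes sensibly...
print(z_decode("base_GHCziBase_zpzp_info"))  # base_GHC.Base_++_info
# ...but an ordinary C symbol containing "ze" gets mangled by the decoder.
print(z_decode("resize_buffer"))             # resi=_buffer
```

Nothing in `resize_buffer` says it did not come from GHC, so a decoder running under `auto` has no way to know it should leave that symbol alone.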
LLVM's binutils, on the other hand, only support automatic detection of the mangling scheme (which checks symbols for the above prefixes). Arguably LLVM's binutils should let you specify a language explicitly, but you have to admit that sometimes you do want mangling-scheme autodetection, as you might have symbols from different languages in your object code.
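
Prefix-based autodetection amounts to something like the sketch below. This is an illustration in Python, not LLVM's actual code; `PREFIXES`, `detect_scheme` and the scheme labels are hypothetical names, and the prefixes are the ones from the table above.

```python
from typing import Optional

# Prefixes from the table above. "c++" here means the GCC/Clang scheme.
# The labels on the right are made up for this sketch.
PREFIXES = [
    ("_Z", "c++"),
    ("_R", "rust"),
    ("_D", "dlang"),
]


def detect_scheme(symbol: str) -> Optional[str]:
    """Return the label of the first matching prefix, or None if none match."""
    for prefix, language in PREFIXES:
        if symbol.startswith(prefix):
            return language
    return None


print(detect_scheme("_ZN3foo3barEv"))             # c++
print(detect_scheme("_D3foo3barFZv"))             # dlang
print(detect_scheme("base_GHCziBase_zpzp_info"))  # None
```

A Z-encoded symbol from GHC carries no such tag, so this kind of detection has nothing to match on and the symbol falls through undecoded.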
I think it would be an improvement if GHC produced mangled symbol names with a prefix (let's face it, probably `_H`).
We could then add support for `demangle=auto` to GNU's binutils, and add a demangler to LLVM's binutils (which I guess are the default binutils on macOS?).