GHC 8.10.x generates segfaulting code when cross compiling for ARM
Dear GHC maintainers,
I'd appreciate any help I could get with the following issue -- thank you for your time!
Summary
When cross compiling for ARM with GHC 8.10.1, the resulting binaries segfault almost immediately. I've tried both glibc and musl-based toolchains (including GHCs), as well as compiling with debug information enabled:
cabal v2-build --allow-newer --with-ghc=armv7a-hardfloat-linux-musleabi-ghc --with-hc-pkg=armv7a-hardfloat-linux-musleabi-ghc-pkg --ghc-options=-O0 --enable-debug-info --disable-executable-stripping --ghc-options=-debug --ghc-options=-rtsopts --ghc-options=-g --ghc-options=-dcore-lint --ghc-options=-threaded --enable-executable-static
I've tried invoking my binaries with all of the +RTS -D* options, as well as -I0 and -V0 -- no luck.
I've also tried enabling/disabling -threaded and using the new GC with -xn -- no luck.
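The runs above could be scripted roughly as follows. This is a minimal sketch, not the exact procedure used: the function names `classify_status` and `try_rts_flags` are made up for illustration, the flag list only covers the options named above, and it assumes the binary was linked with -rtsopts so that it accepts them.

```shell
#!/bin/sh
# Run a binary once per RTS flag set and report how each run ended.

classify_status() {
    # POSIX shells report death by signal N as exit status 128 + N
    # (SIGILL is 4, SIGSEGV is 11).
    case "$1" in
        0)   echo "ok" ;;
        132) echo "SIGILL" ;;
        139) echo "SIGSEGV" ;;
        *)   echo "exit $1" ;;
    esac
}

try_rts_flags() {
    bin="$1"
    # The empty first entry is the baseline run without extra RTS options.
    for flags in "" "-I0" "-V0" "-xn" "-I0 -V0"; do
        # $flags is deliberately unquoted so "-I0 -V0" splits into two words.
        "$bin" +RTS $flags -RTS </dev/null >/dev/null 2>&1
        status=$?
        printf '%-10s %s\n' "${flags:-(none)}" "$(classify_status "$status")"
    done
}
```

Calling `try_rts_flags ./my-binary` on the target (the path is a placeholder) prints one line per flag set, which makes it easy to see whether any combination changes the failure from a segfault to something else.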
Stepping through everything with gdb, it seems that the failure occurs in stg_gc_noregs -- I was able to set a breakpoint there using heapCheckFail, and it seems that an error code is being returned out of this function.
Going into the GHC source code, I tracked it down to HeapStackCheck.cmm, and it appears that ret is being set to 1 before going back to the scheduler, which according to includes/rts/Constants.h indicates that a heap overflow has occurred. Shortly after this, it segfaults, although sometimes it will generate an Illegal Instruction signal instead.
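For reference, the value seen in ret can be decoded with a small helper like the one below. The helper name `rts_ret_name` is hypothetical; the numbers are the scheduler return codes as I read them in includes/rts/Constants.h, so they should be checked against the header of the GHC version in use.

```shell
#!/bin/sh
# Map a scheduler return code (the value stored in ret before returning
# to the scheduler) to its name in includes/rts/Constants.h.
rts_ret_name() {
    case "$1" in
        1) echo "HeapOverflow" ;;
        2) echo "StackOverflow" ;;
        3) echo "ThreadYielding" ;;
        4) echo "ThreadBlocked" ;;
        5) echo "ThreadFinished" ;;
        *) echo "unknown ($1)" ;;
    esac
}
```

For the value observed here, `rts_ret_name 1` prints HeapOverflow.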
My best guess: there's some kind of heap or stack corruption going on here, causing a return somewhere to jump into non-code. The gdb bt output at the time of the failure hasn't been very useful either.
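The gdb session described above can be captured in a command file so that it is repeatable. This is a sketch under a few assumptions: the binary is linked against the debug RTS (so that heapCheckFail exists as a symbol), gdb is run on the target or attached to it, and the helper name `write_gdb_script` and the file names are placeholders.

```shell
#!/bin/sh
# Write a gdb command file that stops once at heapCheckFail, then runs on
# to the crash and dumps the state there.
write_gdb_script() {
    cat > "$1" <<'EOF'
tbreak heapCheckFail
run
bt
continue
bt
info registers
x/4i $pc
EOF
}
```

A possible invocation is `write_gdb_script heapcheck.gdb && gdb -batch -x heapcheck.gdb --args ./my-binary`. The temporary breakpoint (tbreak) fires only on the first failed heap check, so the following continue runs through to the signal instead of stopping at every later heap check.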
I have modified my source code (the XSaiga package on GitHub -- the Hackage version is out of date) to try to isolate where the problematic code is. It seems that I can successfully compile "Hello world", including with the Text package (strict), but going beyond this is resulting in these problems.
I am able to reproduce this bug with GHC 8.10.1 with both glibc and musl based toolchains. My host GHC seems to be working fine. Notably, GHC 8.8.3 does not seem to suffer from this bug and is working as intended.
I have also tried swapping in newer LLVM versions, but these seem to have had no effect. I also tried both dynamic and static linking, but this also had no effect.
Steps to reproduce
Unfortunately, I'm unable to construct a minimal working example to demonstrate this problem.
However, the solarman-agparser branch of my GitHub repository demonstrates the problem readily. Just check out the code, cross compile it for ARM, and run it -- you will see the segfault very quickly.
It seems that it segfaults before it even gets a chance to run any of the code in the Haskell main function.
I wish I could produce a better MWE for you -- if I'm able to do that I will update this bug accordingly.
Expected behavior
The binary should work as it did with GHC 8.8.3 (which works very well I might add, kudos for that).
Environment
- GHC version used: 8.10.1
Optional:
- Operating System: Gentoo (host), OpenWRT (target)
- System Architecture: GCC 9.3.0, LLVM 9, cross compiling for ARM