GHC 8.10.x generates segfaulting code when cross compiling for ARM
Dear GHC maintainers,
I'd appreciate any help I could get with the following issue -- thank you for your time!
When cross compiling for ARM with GHC 8.10.1, the resulting binaries segfault almost immediately. I've tried both glibc- and musl-based toolchains (including GHCs), as well as compiling with debug information enabled:
cabal v2-build --allow-newer --with-ghc=armv7a-hardfloat-linux-musleabi-ghc --with-hc-pkg=armv7a-hardfloat-linux-musleabi-ghc-pkg --ghc-options=-O0 --enable-debug-info --disable-executable-stripping --ghc-options=-debug --ghc-options=-rtsopts --ghc-options=-g --ghc-options=-dcore-lint --ghc-options=-threaded --enable-executable-static
I've tried invoking my binaries with all of the +RTS -D* options, as well as -V0 -- no luck.
I've also tried enabling/disabling -threaded and using the new GC with -xn -- no luck.
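The flag combinations above can be tried in one pass with something like the following. This is only a sketch: the function names are hypothetical, the flag list is a small illustrative subset (-Ds and -DS stand in for the full -D* set and need a binary linked with -debug), and the signal numbers assume Linux.

```shell
#!/bin/sh
# Hypothetical triage helpers; nothing here is part of GHC or cabal.

# Map a shell exit status to a short description.
# Assumes Linux signal numbering: 4 = SIGILL, 11 = SIGSEGV.
describe_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        case $((status - 128)) in
            4)  echo "SIGILL" ;;
            11) echo "SIGSEGV" ;;
            *)  echo "signal $((status - 128))" ;;
        esac
    else
        echo "exit $status"
    fi
}

# Run the given binary once per RTS option and report how each run ended.
# The flag list is an illustrative subset, not an exhaustive one.
try_rts_flags() {
    bin=$1
    for flag in -V0 -xn -Ds -DS; do
        "$bin" +RTS "$flag" -RTS >/dev/null 2>&1
        rc=$?
        echo "$flag: $(describe_status "$rc")"
    done
}
```

Running `try_rts_flags ./my-binary` on the target (with `./my-binary` standing in for the cross-compiled executable) prints one line per flag, which makes it easy to see whether any option changes how the process dies.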
Stepping through everything with gdb, it seems that the failure occurs in stg_gc_noregs -- I was able to set a breakpoint on this using heapCheckFail, and it seems that an error code is being returned out of this function.
Going into the GHC source code, I tracked it down to HeapStackCheck.cmm, and it appears that, before going back to the scheduler, ret is being set to a value which according to includes/rts/Constants.h indicates that a heap overflow has occurred. Shortly after this, it segfaults, although sometimes it generates an Illegal Instruction signal instead.
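For anyone following along in gdb, a small lookup like the one below may help when reading the value handed back to the scheduler. It is a sketch only: the function name is hypothetical, and the numeric values are my recollection of the "return codes from the STG code to the scheduler" block in includes/rts/Constants.h, so they should be checked against the GHC tree in use.

```shell
#!/bin/sh
# Hypothetical helper: name a scheduler return code observed in gdb.
# Values are assumed from includes/rts/Constants.h; verify against your tree.
stg_return_name() {
    case $1 in
        1) echo "HeapOverflow" ;;
        2) echo "StackOverflow" ;;
        3) echo "ThreadYielding" ;;
        4) echo "ThreadBlocked" ;;
        5) echo "ThreadFinished" ;;
        *) echo "unknown ($1)" ;;
    esac
}
```

For example, `stg_return_name 3` prints `ThreadYielding`.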
My best guess: there's some kind of heap or stack corruption going on here, causing a return somewhere to jump into non-code.
The gdb backtrace (bt) at the time of the failure hasn't been very useful either.
I have modified my source code (the XSaiga package on GitHub -- the Hackage version is out of date) to try to isolate the problematic code. It seems that I can successfully compile "Hello world", including with the Text package (strict), but anything beyond this results in these problems.
I am able to reproduce this bug with GHC 8.10.1 with both glibc and musl based toolchains. My host GHC seems to be working fine. Notably, GHC 8.8.3 does not seem to suffer from this bug and is working as intended.
I have also tried swapping in newer LLVM versions, but these seem to have had no effect. I also tried both dynamic and static linking, but this also had no effect.
Steps to reproduce
Unfortunately, I'm unable to construct a minimal working example to demonstrate this problem.
However, the solarman-agparser branch of my GitHub repository demonstrates the problem readily. Just check out the code, cross compile it for ARM, and run it -- you will see the segfault very quickly.
It seems that it segfaults before it even gets a chance to run any of the code in the Haskell
I wish I could produce a better MWE for you -- if I'm able to do that I will update this bug accordingly.
The binary should work as it did with GHC 8.8.3 (which works very well I might add, kudos for that).
- GHC version used: 8.10.1
- Operating System: Gentoo (host), OpenWRT (target)
- System Architecture: GCC 9.3.0, LLVM 9, cross compiling for ARM