So I'm still puzzled by why I can access the memory in gdb, but I think I found the culprit: elf_got.c:makeGot calls mprotect to mark the GOT region as PROT_READ after initializing the GOT. Commenting this out allows the program to get a bit further.
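To illustrate the failure mode this points at, here is a minimal sketch (not the RTS code) of what happens when something writes to a region after it has been switched to PROT_READ. It assumes Linux, where such a write raises SIGSEGV; the function name `write_after_readonly_faults` is made up for this example, and the write is done in a forked child so the caller survives.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns true if a write to a page that was made
 * read-only kills the writer with SIGSEGV while reads keep working. */
static bool write_after_readonly_faults(void)
{
    size_t size = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *page = mmap(NULL, size, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return false;

    page[0] = 1;                              /* initialize while writable */
    if (mprotect(page, size, PROT_READ) != 0) {
        munmap(page, size);
        return false;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(page, size);
        return false;
    }
    if (pid == 0) {
        signal(SIGSEGV, SIG_DFL);             /* make sure the fault is fatal */
        ((volatile unsigned char *)page)[0] = 2;  /* late write: should fault */
        _exit(0);                             /* only reached if no fault */
    }

    int status = 0;
    bool faulted = waitpid(pid, &status, 0) == pid
                && WIFSIGNALED(status)
                && WTERMSIG(status) == SIGSEGV;
    bool still_readable = (page[0] == 1);     /* reads are unaffected */
    munmap(page, size);
    return faulted && still_readable;
}
```

If the linker does patch GOT entries after makeGot has run, it would hit this kind of fault, which would be consistent with the program getting further once the mprotect call is commented out.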
Now the program is instead failing with this assertion failure:
Thread 1 "ghc-iserv.bin" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1 0x0000ffff94ffa8b4 in __GI_abort () at abort.c:79
#2 0x0000ffff94ff2b44 in __assert_fail_base (fmt=0xffff950ee0c0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x1d9c538 "isInt64(32, addend)", file=file@entry=0x1d9c518 "rts/linker/elf_reloc_aarch64.c", line=line@entry=93, function=function@entry=0x1d9c810 <__PRETTY_FUNCTION__.9614> "encodeAddendAarch64") at assert.c:92
#3 0x0000ffff94ff2bc4 in __GI___assert_fail (assertion=0x1d9c538 "isInt64(32, addend)", file=0x1d9c518 "rts/linker/elf_reloc_aarch64.c", line=93, function=0x1d9c810 <__PRETTY_FUNCTION__.9614> "encodeAddendAarch64") at assert.c:101
#4 0x0000000001d3dc88 in encodeAddendAarch64 (section=0x2bba4608, rel=0xffff92bc6be8, addend=-281473104080896) at rts/linker/elf_reloc_aarch64.c:93
#5 0x0000000001d3e728 in relocateObjectCodeAarch64 (oc=0x2bba4410) at rts/linker/elf_reloc_aarch64.c:328
#6 0x0000000001d1bf18 in relocateObjectCode (oc=0x2bba4410) at rts/linker/elf_reloc.c:9
#7 0x0000000001d1a9c0 in ocResolve_ELF (oc=0x2bba4410) at rts/linker/Elf.c:1801
#8 0x0000000001cd8908 in ocTryLoad (oc=0x2bba4410) at rts/Linker.c:1608
#9 0x0000000001cd89e0 in resolveObjs_ () at rts/Linker.c:1654
#10 0x0000000001cd8a74 in resolveObjs () at rts/Linker.c:1673
#11 0x0000000000f57fd8 in ghcizm8zi9zi0zi20190604_GHCiziObjLink_resolveObjs1_info$def ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) up 4
(gdb) print/x addend
$2 = 0xffff00006f9e1000
This is in the COMPAT_R_AARCH64_ADR_PREL_PG_HI21 case of elf_reloc_aarch64.c:encodeAddendAarch64.
I suspect what is happening here is that some bit of code (the RTS?) is getting mapped in low memory, whereas the RTS is loading static libraries in high memory (e.g. around 0x0000ffff00000000). Consequently the final relocation result doesn't fit in a signed 32-bit value, which is what the isInt64(32, addend) assertion checks.
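To make the arithmetic concrete, here is a small sketch of the kind of range check involved. It is not the RTS's isInt64; `fits_signed` and `page_delta` are illustrative names, and the 4 KiB page size is the one ADRP-style page-relative relocations use.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: does `value` fit in a signed two's-complement
 * field that is `bits` wide? Valid for 1 <= bits <= 63. */
static bool fits_signed(unsigned bits, int64_t value)
{
    int64_t limit = (int64_t)1 << (bits - 1);
    return value >= -limit && value < limit;
}

/* Page-relative displacement, Page(target) - Page(place), with 4 KiB
 * pages. The unsigned subtraction wraps; converting the result to
 * int64_t relies on the usual two's-complement behaviour of GCC/Clang. */
static int64_t page_delta(uint64_t target, uint64_t place)
{
    const uint64_t page_mask = ~(uint64_t)0xfff;
    return (int64_t)((target & page_mask) - (place & page_mask));
}
```

With these, `fits_signed(32, -281473104080896)` is false for the addend shown in the backtrace, and so is any `page_delta` between an address just above zero and one near 0x0000ffff00000000, whereas two addresses within the same low region pass easily.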
On x86-64, mmap accepts a MAP_32BIT flag which requests that the mapping be placed in the low 2GB of the address space; it appears that the motivation for this flag is to solve issues very similar to what I describe above. Unfortunately, this is apparently not provided on AArch64.
For what it's worth, adding -dynamic to the GHC command line (causing GHC to use the dynamically-linked iserv interpreter) lets the program run successfully. It seems that DYNAMIC_BY_DEFAULT is currently forced off in config.mk.in for reasons that aren't at all clear and are probably no longer relevant. We should fix this.
The second issue was that the PLT was mapped too far from the call-sites due to a naive use of mmap instead of mmapForLinker. Tracked as #16784 (closed).
I still suspect that the GOT should also be using mmapForLinker, but fixing the PLT seems to have fixed most of the issues.
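For illustration only, here is a rough sketch of the general idea of mapping memory near a reference address when MAP_32BIT is not available. This is not how mmapForLinker is implemented; `map_near` and `byte_distance` are made-up names, and the approach assumes a Linux-style mmap that treats the address argument as a hint.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Absolute distance in bytes between two addresses. */
static uint64_t byte_distance(const void *a, const void *b)
{
    uintptr_t x = (uintptr_t)a, y = (uintptr_t)b;
    return x > y ? (uint64_t)(x - y) : (uint64_t)(y - x);
}

/* Hypothetical helper: map `size` bytes of anonymous memory, passing the
 * page containing `ref` as a placement hint. The kernel may ignore the
 * hint, so the result is checked; if it landed further than
 * `max_distance` bytes from `ref`, it is unmapped and NULL is returned. */
static void *map_near(const void *ref, size_t size, uint64_t max_distance)
{
    uintptr_t page = (uintptr_t)sysconf(_SC_PAGESIZE);
    void *hint = (void *)((uintptr_t)ref & ~(page - 1));

    void *p = mmap(hint, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    if (byte_distance(p, ref) > max_distance) {
        munmap(p, size);                      /* too far to be reachable */
        return NULL;
    }
    return p;
}
```

A caller would pass an address inside the already-loaded code as `ref` and the reach of the relocation as `max_distance`. A real implementation would presumably retry with other hints instead of giving up after one attempt.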
My thought was to simply avoid the problem entirely and use DYNAMIC_BY_DEFAULT, at least until our linker is better. I have tested this, and the results are much better with DYNAMIC_BY_DEFAULT.