Internal RTS error when profiling a `-threaded` build
Summary
While trying to profile Clash, I experienced sporadic SIGABRT signals when compiling simple examples. Some investigation with GDB suggests that this may be a GHC bug: building with `-g -debug` shows internal errors in the RTS for both GHC 8.10.7 and 9.0.1:
8.10.7:
Starting program: /home/axm/Documents/clash-compiler/dist-newstyle/build/x86_64-linux/ghc-8.10.7/clash-ghc-1.5.0/x/clash/build/clash/clash --verilog -fclash-clear examples/CHIP8.hs -iclash-lib/prims/common -iclash-lib/prims/commonverilog -iclash-lib/prims/verilog +RTS -N -hc
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff63a0700 (LWP 1362106)]
[New Thread 0x7ffff5b9f700 (LWP 1362107)]
[New Thread 0x7ffff539e700 (LWP 1362108)]
[New Thread 0x7ffff4b9d700 (LWP 1362109)]
[New Thread 0x7fffeffff700 (LWP 1362110)]
[New Thread 0x7fffef7fe700 (LWP 1362111)]
[New Thread 0x7fffeeffd700 (LWP 1362112)]
[New Thread 0x7fffee7fc700 (LWP 1362113)]
[New Thread 0x7fffedffb700 (LWP 1362114)]
[New Thread 0x7fffed7fa700 (LWP 1362115)]
[New Thread 0x7fffecff9700 (LWP 1362116)]
[New Thread 0x7fffd3fff700 (LWP 1362117)]
[New Thread 0x7fffd37fe700 (LWP 1362118)]
[New Thread 0x7fffd27fc700 (LWP 1362120)]
[New Thread 0x7fffd2ffd700 (LWP 1362119)]
[New Thread 0x7fffd1ffb700 (LWP 1362121)]
[New Thread 0x7fffd17fa700 (LWP 1362122)]
[New Thread 0x7fffd0ff9700 (LWP 1362123)]
[Detaching after vfork from child process 1362124]
[Detaching after vfork from child process 1362130]
[Detaching after vfork from child process 1362136]
[Detaching after vfork from child process 1362137]
[Detaching after vfork from child process 1362138]
[Detaching after vfork from child process 1362139]
[Detaching after vfork from child process 1362140]
[Detaching after vfork from child process 1362141]
[Detaching after vfork from child process 1362142]
[Detaching after vfork from child process 1362143]
[Detaching after vfork from child process 1362144]
[Detaching after vfork from child process 1362145]
clash: internal error: heapCensus, unknown object: 1107375296
(GHC version 8.10.7 for x86_64_unknown_linux)
Please report this as a GHC bug: https://www.haskell.org/ghc/reportabug
Thread 1 "clash" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007ffff7ce4859 in __GI_abort () at abort.c:79
#2 0x000000000a589dc7 in rtsFatalInternalErrorFn (s=0xb351338 "heapCensus, unknown object: %d", ap=0x7fffffffc6c8) at rts/RtsMessages.c:186
#3 0x000000000a589a7e in barf (s=0xb351338 "heapCensus, unknown object: %d") at rts/RtsMessages.c:48
#4 0x000000000a5a9c8d in heapCensusChain (census=0xc275da0, bd=0x4206600fc0) at rts/ProfHeap.c:1173
#5 0x000000000a5a9dd1 in heapCensus (t=23772450710) at rts/ProfHeap.c:1214
#6 0x000000000a5c1e81 in GarbageCollect (collect_gen=1, do_heap_census=true, deadlock_detect=false, gc_type=2, cap=0xc1d7b10, idle_cap=0x19f3cc20) at rts/sm/GC.c:917
#7 0x000000000a5a5766 in scheduleDoGC (pcap=0x7fffffffcca0, task=0xc27b0e0, force_major=false, deadlock_detect=false) at rts/Schedule.c:1849
#8 0x000000000a5a3a87 in schedule (initialCapability=0xc219350, task=0xc27b0e0) at rts/Schedule.c:564
#9 0x000000000a5a688b in scheduleWaitThread (tso=0x4200505370, ret=0x0, pcap=0x7fffffffcdc0) at rts/Schedule.c:2609
#10 0x000000000a5ac909 in rts_evalLazyIO (cap=0x7fffffffcdc0, p=0xb386a60, ret=0x0) at rts/RtsAPI.c:530
#11 0x000000000a5b105c in hs_main (argc=7, argv=0x7fffffffcfc8, main_closure=0xb386a60, rts_config=...) at rts/RtsMain.c:72
#12 0x000000000041215f in main ()
9.0.1:
Starting program: /home/axm/Documents/clash-compiler/dist-newstyle/build/x86_64-linux/ghc-9.0.1/clash-ghc-1.5.0/x/clash/build/clash/clash --verilog -fclash-clear examples/CHIP8.hs -iclash-lib/prims/common -iclash-lib/prims/commonverilog -iclash-lib/prims/verilog +RTS -N -hc
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff63a0700 (LWP 1366154)]
[New Thread 0x7ffff5b9f700 (LWP 1366155)]
[New Thread 0x7ffff539e700 (LWP 1366156)]
[New Thread 0x7ffff4b9d700 (LWP 1366157)]
[New Thread 0x7fffeffff700 (LWP 1366158)]
[New Thread 0x7fffef7fe700 (LWP 1366159)]
[New Thread 0x7fffeeffd700 (LWP 1366160)]
[New Thread 0x7fffee7fc700 (LWP 1366161)]
[New Thread 0x7fffedffb700 (LWP 1366162)]
[New Thread 0x7fffed7fa700 (LWP 1366163)]
[New Thread 0x7fffd3fff700 (LWP 1366165)]
[New Thread 0x7fffd27fc700 (LWP 1366168)]
[New Thread 0x7fffd2ffd700 (LWP 1366167)]
[New Thread 0x7fffd17fa700 (LWP 1366170)]
[New Thread 0x7fffd1ffb700 (LWP 1366169)]
[New Thread 0x7fffd37fe700 (LWP 1366166)]
[New Thread 0x7fffecff9700 (LWP 1366164)]
[New Thread 0x7fffd0ff9700 (LWP 1366171)]
[Detaching after vfork from child process 1366172]
[Detaching after vfork from child process 1366178]
[Detaching after vfork from child process 1366184]
[Detaching after vfork from child process 1366185]
[Detaching after vfork from child process 1366186]
[Detaching after vfork from child process 1366187]
[Detaching after vfork from child process 1366188]
[Detaching after vfork from child process 1366189]
[Detaching after vfork from child process 1366190]
[Detaching after vfork from child process 1366191]
[Detaching after vfork from child process 1366192]
[Detaching after vfork from child process 1366193]
clash: internal error: ASSERTION FAILED: file rts/CheckUnload.c, line 473
(GHC version 9.0.1 for x86_64_unknown_linux)
Please report this as a GHC bug: https://www.haskell.org/ghc/reportabug
Thread 1 "clash" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007ffff7ce4859 in __GI_abort () at abort.c:79
#2 0x0000000009db3ee1 in rtsFatalInternalErrorFn (s=0xac18bf0 "ASSERTION FAILED: file %s, line %u\n", ap=0x7fffffffc738) at rts/RtsMessages.c:186
#3 0x0000000009db3b98 in barf (s=0xac18bf0 "ASSERTION FAILED: file %s, line %u\n") at rts/RtsMessages.c:48
#4 0x0000000009db3bca in _assertFail (filename=0xac22776 "rts/CheckUnload.c", linenum=473) at rts/RtsMessages.c:63
#5 0x0000000009ddd053 in checkUnload () at rts/CheckUnload.c:473
#6 0x0000000009dec981 in GarbageCollect (collect_gen=1, do_heap_census=true, deadlock_detect=false, gc_type=2, cap=0xba71af0, idle_cap=0x16cd5c60) at rts/sm/GC.c:871
#7 0x0000000009dd0123 in scheduleDoGC (pcap=0x7fffffffccc0, task=0xbb15160, force_major=false, deadlock_detect=false) at rts/Schedule.c:1888
#8 0x0000000009dce40e in schedule (initialCapability=0xbae4560, task=0xbb15160) at rts/Schedule.c:581
#9 0x0000000009dd124b in scheduleWaitThread (tso=0x420072e880, ret=0x0, pcap=0x7fffffffcde0) at rts/Schedule.c:2653
#10 0x0000000009dd745e in rts_evalLazyIO (cap=0x7fffffffcde0, p=0xac569b0, ret=0x0) at rts/RtsAPI.c:530
#11 0x0000000009ddbbc8 in hs_main (argc=7, argv=0x7fffffffcfe8, main_closure=0xac569b0, rts_config=...) at rts/RtsMain.c:72
#12 0x000000000041215f in main ()
Given the traces and a brief look at the code, I suspected this might be purely a profiling issue that does not depend on the threaded runtime. However, I have not been able to reproduce it locally without the threaded runtime.
Steps to reproduce
- Check out the master branch of Clash
- Apply the following patch:

    diff --git a/cabal.project b/cabal.project
    index 909b10771..f285e81f1 100644
    --- a/cabal.project
    +++ b/cabal.project
    @@ -12,6 +12,7 @@ packages:
       ./clash-term
     
     write-ghc-environment-files: always
    +profiling: True
     
     -- index state, to go along with the cabal.project.freeze file. update the index
     -- state by running `cabal update` twice and looking at the index state it
    diff --git a/clash-ghc/clash-ghc.cabal b/clash-ghc/clash-ghc.cabal
    index 3e44456f5..4a4c3cb0b 100644
    --- a/clash-ghc/clash-ghc.cabal
    +++ b/clash-ghc/clash-ghc.cabal
    @@ -68,7 +68,7 @@ flag use-ghc-paths
     executable clash
       Main-Is: src-ghc/Batch.hs
       Build-Depends: base, clash-ghc
    -  GHC-Options: -Wall -Wcompat
    +  GHC-Options: -Wall -Wcompat -threaded -rtsopts -g -debug
       if flag(dynamic)
         GHC-Options: -dynamic
       extra-libraries: pthread

- `cabal build clash-ghc`
- `cabal exec -- gdb --args clash --verilog examples/CHIP8.hs -iclash-lib/prims/common -iclash-lib/prims/commonverilog -iclash-lib/prims/verilog +RTS -N -hc`
- `run` (in the GDB shell just opened)
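Because the abort is sporadic, a single run may not trigger it. The following is a minimal sketch of a helper that runs a command several times outside GDB and counts the runs that die from SIGABRT. The function name `count_aborts` and the variable `CLASH_BIN` are hypothetical. The sketch assumes a shell such as bash or dash, which reports a process killed by signal 6 as exit status 134 (128 + 6), and assumes the built binary can be invoked directly from the build directory shown in the logs above.

```shell
#!/bin/sh
# count_aborts N CMD [ARGS...]
# Runs CMD N times, discarding its output, and prints how many runs
# ended with exit status 134 (128 + SIGABRT on Linux under bash/dash).
count_aborts() {
    runs=$1
    shift
    aborts=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        status=0
        # "|| status=$?" keeps the loop going even under "set -e".
        "$@" >/dev/null 2>&1 || status=$?
        if [ "$status" -eq 134 ]; then
            aborts=$((aborts + 1))
        fi
        i=$((i + 1))
    done
    echo "$aborts"
}

# Example invocation (commented out; the path is taken from the 8.10.7 log
# above and will differ per machine and GHC version):
# CLASH_BIN=dist-newstyle/build/x86_64-linux/ghc-8.10.7/clash-ghc-1.5.0/x/clash/build/clash/clash
# count_aborts 20 "$CLASH_BIN" --verilog examples/CHIP8.hs \
#     -iclash-lib/prims/common -iclash-lib/prims/commonverilog \
#     -iclash-lib/prims/verilog +RTS -N -hc
```

The binary is called directly rather than through `cabal exec` because it is not certain how `cabal exec` propagates a signal exit to the calling shell. A non-zero count only shows that the run aborted; the GDB session above is still needed to see which RTS check failed.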
Expected behavior
The steps above should (hopefully) reproduce the logs shown above for GHC 8.10.7 or 9.0.1.
Environment
- GHC version used: 8.10.7, 9.0.1 (installed with `ghcup`)
Optional:
- Operating System: Ubuntu 20.04
- System Architecture: x86_64