Many test cases crash at runtime with nonmoving_sanity test way
How to reproduce: first, apply the following patch to a clean checkout (0962b50d as of last update):
diff --git a/testsuite/config/ghc b/testsuite/config/ghc
index 90b11bc57fc..5216366ddd7 100644
--- a/testsuite/config/ghc
+++ b/testsuite/config/ghc
@@ -2,6 +2,8 @@
import re
+from testglobals import default_testopts
+
# Testsuite configuration setup for GHC
#
# This file is Python source
@@ -29,10 +31,14 @@ config.other_ways = ['hpc',
'ghci-ext', 'ghci-ext-prof',
'ext-interp',
'nonmoving',
+ 'nonmoving_sanity',
'nonmoving_thr',
'nonmoving_thr_sanity',
'nonmoving_thr_ghc',
'compacting_gc',
+ 'compacting_gc_sanity',
+ 'sweeping_gc',
+ 'sweeping_gc_sanity'
]
@@ -69,10 +75,8 @@ if windows:
else:
config.other_ways += winio_ways
-# LLVM
-if not config.unregisterised and not config.arch in {"wasm32", "javascript"} and config.have_llvm:
- config.compile_ways.append('optllvm')
- config.run_ways.append('optllvm')
+if True:
+ default_testopts.extra_ways = ['compacting_gc', 'compacting_gc_sanity', 'nonmoving', 'nonmoving_sanity', 'nonmoving_thr', 'nonmoving_thr_sanity', 'sweeping_gc', 'sweeping_gc_sanity', 'sanity', 'threaded2_sanity']
# HPC
if not config.arch == "javascript":
@@ -122,10 +126,14 @@ config.way_flags = {
'ghci-ext-prof' : ['--interactive', '-v0', '-ignore-dot-ghci', '-fno-ghci-history', '-fexternal-interpreter', '-prof', '+RTS', '-I0.1', '-RTS'],
'ext-interp' : ['-fexternal-interpreter'],
'nonmoving' : [],
+ 'nonmoving_sanity' : ['-debug'],
'nonmoving_thr': ['-threaded'],
'nonmoving_thr_sanity': ['-threaded', '-debug'],
'nonmoving_thr_ghc': ['+RTS', '-xn', '-N2', '-RTS', '-threaded'],
'compacting_gc': [],
+ 'compacting_gc_sanity': ['-debug'],
+ 'sweeping_gc': [],
+ 'sweeping_gc_sanity': ['-debug'],
'winio': [],
'winio_threaded': ['-threaded'],
}
@@ -169,10 +177,14 @@ config.way_rts_flags = {
'ghci-ext-prof' : [],
'ext-interp' : [],
'nonmoving' : ['-xn'],
+ 'nonmoving_sanity' : ['-xn', '-DS'],
'nonmoving_thr' : ['-xn', '-N2'],
'nonmoving_thr_sanity' : ['-xn', '-N2', '-DS'],
'nonmoving_thr_ghc': ['-xn', '-N2'],
'compacting_gc': ['-c'],
+ 'compacting_gc_sanity': ['-c', '-DS'],
+ 'sweeping_gc': ['-w'],
+ 'sweeping_gc_sanity': ['-w', '-DS'],
'winio': ['--io-manager=native'],
'winio_threaded': ['--io-manager=native'],
}
The patch above adds several test ways and forces them to run for every test, regardless of test speed. Then, on x86_64-linux, in a clean Debian 12 image with ghc-9.8.2 as the bootstrap compiler, build GHC and run the full testsuite as usual. The nonmoving_sanity test way, which tests the single-threaded nonmoving GC combined with sanity checking, produces numerous segmentation faults and RTS internal errors.
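To make the configuration of the failing way concrete, here is a minimal sketch of how the two tables in the patch combine for a given way. The helper `describe_way` is hypothetical and is not part of the testsuite driver; it assumes that the entries in `way_flags` are passed to GHC at compile time and that the entries in `way_rts_flags` reach the test program wrapped in `+RTS ... -RTS`. Only the nonmoving entries from the patch are copied here.

```python
# Hypothetical helper for illustration; not part of the GHC testsuite driver.
# The two tables copy only the nonmoving entries from the patch above.
way_flags = {
    'nonmoving': [],
    'nonmoving_sanity': ['-debug'],
    'nonmoving_thr': ['-threaded'],
    'nonmoving_thr_sanity': ['-threaded', '-debug'],
}
way_rts_flags = {
    'nonmoving': ['-xn'],
    'nonmoving_sanity': ['-xn', '-DS'],
    'nonmoving_thr': ['-xn', '-N2'],
    'nonmoving_thr_sanity': ['-xn', '-N2', '-DS'],
}


def describe_way(way):
    """Return (compile-time flags, run-time arguments) for a test way."""
    rts = way_rts_flags[way]
    # Assumption: RTS flags are handed to the test program between +RTS and -RTS.
    run_args = ['+RTS'] + rts + ['-RTS'] if rts else []
    return way_flags[way], run_args


compile_flags, run_args = describe_way('nonmoving_sanity')
print(compile_flags)  # ['-debug']
print(run_args)       # ['+RTS', '-xn', '-DS', '-RTS']
```

Under these assumptions, nonmoving_sanity differs from the existing nonmoving_thr_sanity way only in dropping `-threaded` and `-N2`, which is why the failures point at the single-threaded configuration.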
Full test log: https://gitlab.haskell.org/-/snippets/5750
Test cases failing with Segmentation fault (core dumped):
*** unexpected failure for cmp64(nonmoving_sanity)
*** unexpected failure for conc027(nonmoving_sanity)
*** unexpected failure for conc004(nonmoving_sanity)
*** unexpected failure for arith003(nonmoving_sanity)
*** unexpected failure for CarryOverflow(nonmoving_sanity)
*** unexpected failure for T3245(nonmoving_sanity)
*** unexpected failure for CmpInt16(nonmoving_sanity)
*** unexpected failure for CmpInt32(nonmoving_sanity)
*** unexpected failure for life_space_leak(nonmoving_sanity)
*** unexpected failure for thurston-modular-arith(nonmoving_sanity)
*** unexpected failure for T11108(nonmoving_sanity)
*** unexpected failure for T9532(nonmoving_sanity)
*** unexpected failure for inits1tails1(nonmoving_sanity)
*** unexpected failure for T20107(nonmoving_sanity)
*** unexpected failure for bytestringread001(nonmoving_sanity)
*** unexpected failure for AtomicSwapIORef(nonmoving_sanity)
*** unexpected failure for Timeout001(nonmoving_sanity)
*** unexpected failure for memo001(nonmoving_sanity)
*** unexpected failure for stack_underflow(nonmoving_sanity)
Test cases failing with internal error: checkClosure: found EVACUATED closure:
*** unexpected failure for arr020(nonmoving_sanity)
*** unexpected failure for arr017(nonmoving_sanity)
*** unexpected failure for annrun01(nonmoving_sanity)
*** unexpected failure for conc051(nonmoving_sanity)
*** unexpected failure for throwto003(nonmoving_sanity)
*** unexpected failure for throwto002(nonmoving_sanity)
*** unexpected failure for tryReadMVar2(nonmoving_sanity)
*** unexpected failure for readMVar1(nonmoving_sanity)
*** unexpected failure for hs_try_putmvar002(nonmoving_sanity)
*** unexpected failure for LintEtaExpand(nonmoving_sanity)
*** unexpected failure for AtomicPrimops(nonmoving_sanity)
*** unexpected failure for T11535(nonmoving_sanity)
*** unexpected failure for T5313(nonmoving_sanity)
*** unexpected failure for ffi016(nonmoving_sanity)
*** unexpected failure for fptr02(nonmoving_sanity)
*** unexpected failure for T4221(nonmoving_sanity)
*** unexpected failure for ffi023(nonmoving_sanity)
*** unexpected failure for T10942(nonmoving_sanity)
*** unexpected failure for T9595(nonmoving_sanity)
*** unexpected failure for T18522-dbg-ppr(nonmoving_sanity)
*** unexpected failure for T18181(nonmoving_sanity)
*** unexpected failure for PartialDownsweep(nonmoving_sanity)
*** unexpected failure for OldModLocation(nonmoving_sanity)
*** unexpected failure for T10508_api(nonmoving_sanity)
*** unexpected failure for dynCompileExpr(nonmoving_sanity)
*** unexpected failure for T11938(nonmoving_sanity)
*** unexpected failure for TargetContents(nonmoving_sanity)
*** unexpected failure for T3372(nonmoving_sanity)
*** unexpected failure for PatTypes(nonmoving_sanity)
*** unexpected failure for HieVdq(nonmoving_sanity)
*** unexpected failure for RecordDotTypes(nonmoving_sanity)
*** unexpected failure for HieQueries(nonmoving_sanity)
*** unexpected failure for T23492(nonmoving_sanity)
*** unexpected failure for T20341(nonmoving_sanity)
*** unexpected failure for SpliceTypes(nonmoving_sanity)
*** unexpected failure for T23540(nonmoving_sanity)
*** unexpected failure for Monoid_ByteArray(nonmoving_sanity)
*** unexpected failure for arith008(nonmoving_sanity)
*** unexpected failure for arith011(nonmoving_sanity)
*** unexpected failure for T8726(nonmoving_sanity)
*** unexpected failure for foundation(nonmoving_sanity)
*** unexpected failure for space_leak_001(nonmoving_sanity)
*** unexpected failure for static-plugins(nonmoving_sanity)
*** unexpected failure for T4334(nonmoving_sanity)
*** unexpected failure for ArithWord16(nonmoving_sanity)
*** unexpected failure for CmpWord16(nonmoving_sanity)
*** unexpected failure for ArithInt16(nonmoving_sanity)
*** unexpected failure for ArithWord32(nonmoving_sanity)
*** unexpected failure for ShrinkSmallMutableArrayB(nonmoving_sanity)
*** unexpected failure for ArithInt32(nonmoving_sanity)
*** unexpected failure for CmpWord32(nonmoving_sanity)
*** unexpected failure for CmpInt8(nonmoving_sanity)
*** unexpected failure for CmpWord8(nonmoving_sanity)
*** unexpected failure for ArithInt8(nonmoving_sanity)
*** unexpected failure for ArithWord8(nonmoving_sanity)
*** unexpected failure for T23071(nonmoving_sanity)
*** unexpected failure for T22488_docHead(nonmoving_sanity)
*** unexpected failure for andre_monad(nonmoving_sanity)
*** unexpected failure for queens(nonmoving_sanity)
*** unexpected failure for 10queens(nonmoving_sanity)
*** unexpected failure for jules_xref(nonmoving_sanity)
*** unexpected failure for jules_xref2(nonmoving_sanity)
*** unexpected failure for jl_defaults(nonmoving_sanity)
*** unexpected failure for andy_cherry(nonmoving_sanity)
*** unexpected failure for barton-mangler-bug(nonmoving_sanity)
*** unexpected failure for seward-space-leak(nonmoving_sanity)
*** unexpected failure for galois_raytrace(nonmoving_sanity)
*** unexpected failure for bug1010(nonmoving_sanity)
*** unexpected failure for T7160(nonmoving_sanity)
*** unexpected failure for stack003(nonmoving_sanity)
*** unexpected failure for stablename001(nonmoving_sanity)
*** unexpected failure for T7636(nonmoving_sanity)
*** unexpected failure for T10904(nonmoving_sanity)
*** unexpected failure for alloccounter1(nonmoving_sanity)
*** unexpected failure for T19381(nonmoving_sanity)
*** unexpected failure for cloneMyStack_retBigStackFrame(nonmoving_sanity)
*** unexpected failure for decodeMyStack_underflowFrames(nonmoving_sanity)
*** unexpected failure for T23400(nonmoving_sanity)
*** unexpected failure for T17574(nonmoving_sanity)
*** unexpected failure for T19481(nonmoving_sanity)
*** unexpected failure for T9646(nonmoving_sanity)
*** unexpected failure for IOManager(nonmoving_sanity)
*** unexpected failure for unicode002(nonmoving_sanity)
*** unexpected failure for inits(nonmoving_sanity)
*** unexpected failure for length001(nonmoving_sanity)
*** unexpected failure for unicode003(nonmoving_sanity)
*** unexpected failure for tup001(nonmoving_sanity)
*** unexpected failure for memo002(nonmoving_sanity)
*** unexpected failure for stableptr001(nonmoving_sanity)
*** unexpected failure for dynamic003(nonmoving_sanity)
*** unexpected failure for stack_misc_closures(nonmoving_sanity)
*** unexpected failure for stableptr003(nonmoving_sanity)
*** unexpected failure for weak001(nonmoving_sanity)
*** unexpected failure for stableptr004(nonmoving_sanity)
*** unexpected failure for dynamic004(nonmoving_sanity)
*** unexpected failure for ioref001(nonmoving_sanity)
*** unexpected failure for WasmControlFlow(nonmoving_sanity)
*** unexpected failure for T7653(nonmoving_sanity)
*** unexpected failure for T13167(nonmoving_sanity)
*** unexpected failure for AtomicModifyIORef(nonmoving_sanity)
*** unexpected failure for finalization001(nonmoving_sanity)
*** unexpected failure for qsem001(nonmoving_sanity)
*** unexpected failure for qsemn001(nonmoving_sanity)
*** unexpected failure for Chan003(nonmoving_sanity)
*** unexpected failure for hGetBuf001(nonmoving_sanity)
*** unexpected failure for encoding001(nonmoving_sanity)
*** unexpected failure for T2411(nonmoving_sanity)
*** unexpected failure for stm050(nonmoving_sanity)
*** unexpected failure for T15136(nonmoving_sanity)
*** unexpected failure for encoding004(nonmoving_sanity)
*** unexpected failure for T16707(nonmoving_sanity)
Test cases failing with internal error: ASSERTION FAILED: file rts/sm/Sanity.c, line 571:
*** unexpected failure for T23120(nonmoving_sanity)
Test cases failing with internal error: checkClosure: stack frame:
*** unexpected failure for openFile008(nonmoving_sanity)
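Lists like the ones above can be pulled out of a saved copy of the test log with a short script. The sketch below is only an illustration: the function name `failures_by_way` is made up, and it assumes every failure appears on its own line in the `*** unexpected failure for NAME(WAY)` form shown above. It groups test names by way and does not classify them by error message.

```python
import re
from collections import defaultdict

# Matches lines such as: *** unexpected failure for cmp64(nonmoving_sanity)
FAILURE_RE = re.compile(r'^\*\*\* unexpected failure for (.+)\(([^()]+)\)$')


def failures_by_way(log_text):
    """Map each test way to the list of tests that failed unexpectedly in it."""
    by_way = defaultdict(list)
    for line in log_text.splitlines():
        match = FAILURE_RE.match(line.strip())
        if match:
            test_name, way = match.groups()
            by_way[way].append(test_name)
    return dict(by_way)


# A few lines taken from the lists above, used as sample input.
sample = """\
*** unexpected failure for cmp64(nonmoving_sanity)
*** unexpected failure for T18522-dbg-ppr(nonmoving_sanity)
*** unexpected failure for openFile008(nonmoving_sanity)
"""

for way, tests in failures_by_way(sample).items():
    print(way, len(tests), tests)
```

Running it over the full log would show how many failures each of the added ways contributes, which helps confirm that the crashes are specific to nonmoving_sanity.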
For the most common failure pattern, the full test log includes a stack trace, since I configured with --enable-dwarf-unwind and built with the perf+debug_info flavour:
arith008: internal error: checkClosure: found EVACUATED closure 1107316858
Stack trace:
0x2b960d set_initial_registers (rts/Libdw.c:297.5)
0x7ff60031fc38 dwfl_thread_getframes (/usr/lib/x86_64-linux-gnu/libdw-0.188.so)
0x7ff60031ffec dwfl_getthread_frames (/usr/lib/x86_64-linux-gnu/libdw-0.188.so)
0x2b94ef libdwGetBacktrace (rts/Libdw.c:263.15)
0x2bf253 rtsFatalInternalErrorFn (rts/RtsMessages.c:175.22)
0x2bee8d barf (rts/RtsMessages.c:49.3)
0x307473 checkClosure (rts/sm/Sanity.c:367.12)
0x307d31 checkHeapChain (rts/sm/Sanity.c:601.26)
0x308c6f checkGeneration (rts/sm/Sanity.c:1015.9)
0x308d1a checkFullHeap (rts/sm/Sanity.c:1034.52)
0x308d94 checkSanity (rts/sm/Sanity.c:1046.5)
0x2f1cc0 GarbageCollect (rts/sm/GC.c:461.3)
0x2d6c12 scheduleDoGC (rts/Schedule.c:1915.9)
0x2d780e exitScheduler (rts/Schedule.c:2801.9)
0x2bfb4f hs_exit_ (rts/RtsStartup.c:490.12)
0x2bfd39 shutdownHaskellAndExit (rts/RtsStartup.c:678.5)
0x2bd89f ASSIGN_FLT (rts/include/Stg.h:426.62)
0x23d901 (null) (/tmp/ghctest-kphe85yc/test spaces/testsuite/tests/numeric/should_run/arith008.run/arith008)
0x7ff60039d24a __libc_start_call_main (../sysdeps/nptl/libc_start_call_main.h:74.3)
0x7ff60039d305 __libc_start_main@@GLIBC_2.34 (../csu/libc-start.c:128.20)
0x23c021 _start (/tmp/ghctest-kphe85yc/test spaces/testsuite/tests/numeric/should_run/arith008.run/arith008)
(GHC version 9.11.20240511 for x86_64_unknown_linux)
Please report this as a GHC bug: https://www.haskell.org/ghc/reportabug
Aborted (core dumped)
*** unexpected failure for arith008(nonmoving_sanity)
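When comparing many such traces, it can help to reduce each one to the frames that come from the RTS sources. The following sketch assumes frames are printed as `ADDRESS FUNCTION (LOCATION)`, as in the trace above; the names `parse_frames` and `rts_frames` are hypothetical helpers, not existing tooling.

```python
import re

# One frame per line: hex address, symbol name, then the location in parentheses.
FRAME_RE = re.compile(r'^(0x[0-9a-f]+)\s+(\S+)\s+\((.*)\)$')


def parse_frames(trace_text):
    """Return (address, function, location) tuples for every frame line."""
    frames = []
    for line in trace_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append(match.groups())
    return frames


def rts_frames(frames):
    """Keep only frames whose source location is under rts/."""
    return [(fn, loc) for _addr, fn, loc in frames if loc.startswith('rts/')]


# A few frames copied from the trace above, used as sample input.
sample_trace = """\
0x7ff60031ffec dwfl_getthread_frames (/usr/lib/x86_64-linux-gnu/libdw-0.188.so)
0x2bee8d barf (rts/RtsMessages.c:49.3)
0x307473 checkClosure (rts/sm/Sanity.c:367.12)
0x307d31 checkHeapChain (rts/sm/Sanity.c:601.26)
"""

for fn, loc in rts_frames(parse_frames(sample_trace)):
    print(fn, loc)
```

For the trace above, this leaves the chain from `hs_exit_` through `GarbageCollect` and `checkSanity` down to `checkClosure`, that is, a sanity check run as part of a GC during RTS shutdown.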
Similar issues were first discovered when I was attempting to test wasm with more test ways, and they also exist on i386. It seems that something is wrong in the sanity checking logic when it is used with the single-threaded nonmoving GC.