
Segmentation fault in concurrent program

@WJWH has recently been working on an io-uring-based IO manager built on my bindings. While testing his branch, however, he encountered a rather suspicious segmentation fault during GC. Specifically, using GHC 8.10.1:

$ git clone https://github.com/bgamari/io-uring segfault
$ cd segfault
$ cabal new-run -w ghc-8.10.1 segfault &
$ echo 'GET http://localhost:3000/' | nix run nixpkgs.vegeta -c vegeta attack -rate 8000 -body main.hs >log

Within seconds the executable crashes with a segmentation fault. The backtrace looks something like the following:

0x000000000087084c in LOOKS_LIKE_INFO_PTR_NOT_NULL (p=12297829382473034410) at includes/rts/storage/ClosureMacros.h:244
244         return info->type != INVALID_OBJECT && info->type < N_CLOSURE_TYPES;
(gdb) bt
#0  0x000000000087084c in LOOKS_LIKE_INFO_PTR_NOT_NULL (p=12297829382473034410) at includes/rts/storage/ClosureMacros.h:244
#1  0x000000000087089b in LOOKS_LIKE_INFO_PTR (p=12297829382473034410) at includes/rts/storage/ClosureMacros.h:249
#2  0x00000000008708d3 in LOOKS_LIKE_CLOSURE_PTR (p=0x4200c17808) at includes/rts/storage/ClosureMacros.h:254
#3  0x0000000000871cdb in checkTSO (tso=0x4200da27e8) at rts/sm/Sanity.c:629
#4  0x0000000000871578 in checkClosure (p=0x4200da27e8) at rts/sm/Sanity.c:432
#5  0x0000000000871f70 in checkMutableList (mut_bd=0x4201203f80, gen=1) at rts/sm/Sanity.c:691
#6  0x000000000087202f in checkLocalMutableLists (cap_no=6) at rts/sm/Sanity.c:710
#7  0x000000000087205c in checkMutableLists () at rts/sm/Sanity.c:719
#8  0x000000000087254c in checkSanity (after_gc=true, major_gc=false) at rts/sm/Sanity.c:871
#9  0x0000000000863764 in GarbageCollect (collect_gen=0, do_heap_census=false, deadlock_detect=false, gc_type=2, cap=0xac80d0, idle_cap=0x7fff9c000f80) at rts/sm/GC.c:863
#10 0x00000000008489dc in scheduleDoGC (pcap=0x7fffa7ffee68, task=0x7fffcc000b70, force_major=false, deadlock_detect=false) at rts/Schedule.c:1853
#11 0x00000000008474ef in scheduleProcessInbox (pcap=0x7fffa7ffee68) at rts/Schedule.c:1009
#12 0x0000000000846d32 in scheduleFindWork (pcap=0x7fffa7ffee68) at rts/Schedule.c:627
#13 0x000000000084629f in schedule (initialCapability=0xac80d0, task=0x7fffcc000b70) at rts/Schedule.c:290
#14 0x0000000000849d0f in scheduleWorker (cap=0xac80d0, task=0x7fffcc000b70) at rts/Schedule.c:2617
#15 0x0000000000850c76 in workerStart (task=0x7fffcc000b70) at rts/Task.c:445
#16 0x00007ffff7f2eef7 in start_thread () from /nix/store/wx1vk75bpdr65g6xwxbj4rw0pk04v5j3-glibc-2.27/lib/libpthread.so.0
#17 0x00007ffff7dcc22f in clone () from /nix/store/wx1vk75bpdr65g6xwxbj4rw0pk04v5j3-glibc-2.27/lib/libc.so.6
(gdb) up
#1  0x000000000087089b in LOOKS_LIKE_INFO_PTR (p=12297829382473034410) at includes/rts/storage/ClosureMacros.h:249
249         return p && (IS_FORWARDING_PTR(p) || LOOKS_LIKE_INFO_PTR_NOT_NULL(p));
(gdb) 
#2  0x00000000008708d3 in LOOKS_LIKE_CLOSURE_PTR (p=0x4200c17808) at includes/rts/storage/ClosureMacros.h:254
254         return LOOKS_LIKE_INFO_PTR((StgWord)
(gdb) 
#3  0x0000000000871cdb in checkTSO (tso=0x4200da27e8) at rts/sm/Sanity.c:629
629         ASSERT(LOOKS_LIKE_CLOSURE_PTR(tso->global_link) &&
(gdb) print tso->global_link
$1 = (struct StgTSO_ *) 0x4200c17808
(gdb) x/8a 0x4200c17808
0x4200c17808:   0xaaaaaaaaaaaaaaaa      0xaaaaaaaaaaaaaaaa
0x4200c17818:   0xaaaaaaaaaaaaaaaa      0xaaaaaaaaaaaaaaaa
0x4200c17828:   0xaaaaaaaaaaaaaaaa      0xaaaaaaaaaaaaaaaa
0x4200c17838:   0xaaaaaaaaaaaaaaaa      0xaaaaaaaaaaaaaaaa

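It appears that we somehow end up with a TSO on the heap whose global_link field is invalid: it points at 0x4200c17808, where every word is 0xaaaaaaaaaaaaaaaa. This is the same value, printed in decimal, as the p=12297829382473034410 seen in frames #0 and #1.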
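
To double-check that the decimal argument in the backtrace and the hexadecimal words in the memory dump are one and the same, here is a minimal standalone sketch. It is not RTS code: FILL_WORD, is_fill_word and all_fill_words are hypothetical names introduced purely for illustration, and the only assumption is that the word of interest is the 0xaaaaaaaaaaaaaaaa value shown in the dump.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constant, not an RTS definition: the word seen in the dump. */
#define FILL_WORD UINT64_C(0xaaaaaaaaaaaaaaaa)

/* True if w is the 0xaa... word, e.g. the decimal p printed by gdb. */
bool is_fill_word(uint64_t w)
{
    return w == FILL_WORD;
}

/* True if all n words starting at p equal the 0xaa... word. */
bool all_fill_words(const uint64_t *p, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (p[i] != FILL_WORD) {
            return false;
        }
    }
    return true;
}
```

Calling is_fill_word(12297829382473034410ULL) returns true, confirming that the info pointer read through tso->global_link is just this repeated byte pattern rather than a plausible code address, whereas the global_link pointer itself (0x4200c17808) is an ordinary-looking heap address.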
