shutdown_handler re-entry causes lock-up
We develop a complex C++ program that invokes Haskell code via FFI.
We build with gcc-10 and ghc-8.10.7 using -threaded and run on
Linux 5.19.0. The program sometimes locks up on SIGINT. Reproducing it
takes some effort and may require sending several signals in quick
succession, but a human pressing Ctrl-C can also be fast enough. The
problem can be avoided by setting GHCRTS=--install-signal-handlers=no.
When the program locks up, I can attach gdb and see two threads stuck
in signal handlers:
(gdb) attach 47138
Attaching to process 47138
[New LWP 47153]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "glibc-2.33-123/lib/libthread_db.so.1".
0x00007f99966ad37d in __lll_lock_wait_private () from glibc-2.33-123/lib/libc.so.6
(gdb) bt
#0 0x00007f99966ad37d in __lll_lock_wait_private () from glibc-2.33-123/lib/libc.so.6
#1 0x00007f99966b0c47 in _int_free () from glibc-2.33-123/lib/libc.so.6
#2 0x00007f99966b3ad4 in free () from glibc-2.33-123/lib/libc.so.6
#3 0x0000000001557f1e in <app-object>::CacheType::~CacheType() ()
#4 0x00007f999666975f in __call_tls_dtors () from glibc-2.33-123/lib/libc.so.6
#5 0x00007f9996668fee in __run_exit_handlers () from glibc-2.33-123/lib/libc.so.6
#6 0x00007f999666903a in exit () from glibc-2.33-123/lib/libc.so.6
#7 0x00007f99978db9cb in stg_exit () from rts/libHSrts_thr-ghc8.10.7.so
#8 0x00007f999790314c in shutdown_handler () from rts/libHSrts_thr-ghc8.10.7.so
#9 <signal handler called>
#10 0x00007f99966b23f3 in _int_malloc () from glibc-2.33-123/lib/libc.so.6
#11 0x00007f99966b35e8 in malloc () from glibc-2.33-123/lib/libc.so.6
#12 0x0000000002ad3845 in operator new(unsigned long) ()
...
#23 0x000000000057bb58 in main ()
(gdb) info threads
Id Target Id Frame
* 1 Thread 0x7f999786dac0 (LWP 47138) "app.u" 0x00007f99966ad37d in __lll_lock_wait_private ()
from glibc-2.33-123/lib/libc.so.6
2 Thread 0x7f99883fe640 (LWP 47153) "app:w" 0x00007f99966ad37d in __lll_lock_wait_private ()
from glibc-2.33-123/lib/libc.so.6
(gdb) thread 2
[Switching to thread 2 (Thread 0x7f99883fe640 (LWP 47153))]
#0 0x00007f99966ad37d in __lll_lock_wait_private () from glibc-2.33-123/lib/libc.so.6
(gdb) bt
#0 0x00007f99966ad37d in __lll_lock_wait_private () from glibc-2.33-123/lib/libc.so.6
#1 0x00007f99966b0c47 in _int_free () from glibc-2.33-123/lib/libc.so.6
#2 0x00007f99966b3ad4 in free () from glibc-2.33-123/lib/libc.so.6
#3 0x00007f99978d0df2 in freeHashTable () from rts/libHSrts_thr-ghc8.10.7.so
#4 0x00007f99978cf7e7 in freeFileLocking () from rts/libHSrts_thr-ghc8.10.7.so
#5 0x00007f99978db85f in hs_exit_.part () from rts/libHSrts_thr-ghc8.10.7.so
#6 0x000000000191aad9 in app::GHC::~GHC() ()
#7 0x00007f9996668e87 in __run_exit_handlers () from glibc-2.33-123/lib/libc.so.6
#8 0x00007f999666903a in exit () from glibc-2.33-123/lib/libc.so.6
#9 0x00007f99978db9cb in stg_exit () from rts/libHSrts_thr-ghc8.10.7.so
#10 0x00007f999790314c in shutdown_handler () from rts/libHSrts_thr-ghc8.10.7.so
#11 <signal handler called>
#12 0x00007f999672735f in epoll_wait () from glibc-2.33-123/lib/libc.so.6
#13 0x00007f9997715b26 in base_GHCziEventziEPoll_new9_info ()
from base-4.14.3.0/libHSbase-4.14.3.0-ghc8.10.7.so
#14 0x0000000000000000 in ?? ()
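Both backtraces show exit() being run from inside a signal handler. exit() is not async-signal-safe, and in thread 1 the handler interrupted malloc() before reaching free(). For anyone using the GHCRTS=--install-signal-handlers=no workaround, here is a minimal sketch of a replacement SIGINT handler that only sets a flag and leaves the actual shutdown to normal code. The names request_shutdown, install_sigint_flag_handler and shutdown_requested are hypothetical and not part of GHC or our application. I have not checked how this interacts with Haskell code that relies on the RTS's own SIGINT handling.

```cpp
#include <signal.h>

// Written by the handler, polled by ordinary code. volatile sig_atomic_t is
// the type that is guaranteed to be safe to assign from a signal handler.
static volatile sig_atomic_t g_shutdown_requested = 0;

// Hypothetical handler: it only records the request. It does not call
// exit(), malloc(), free() or anything else that may take a lock.
extern "C" void request_shutdown(int /*signo*/) {
    g_shutdown_requested = 1;
}

// Installs the handler for SIGINT. Returns true on success.
bool install_sigint_flag_handler() {
    struct sigaction sa {};
    sa.sa_handler = request_shutdown;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;  // no SA_RESTART, so blocking calls can fail with EINTR
    return sigaction(SIGINT, &sa, nullptr) == 0;
}

// Polled from the main loop to decide when to leave it.
bool shutdown_requested() {
    return g_shutdown_requested != 0;
}
```

The main loop would check shutdown_requested() and then return from main as usual, so that exit handlers and destructors run once, on one thread, outside of any signal handler.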
Seeing two of these running concurrently inside exit seems like the
cause of the hang. I doubt anybody expected exit to be called
concurrently in two threads. I'm also surprised to see
<signal handler called> in multiple threads. This contradicts
"Signals in Handler":

When the handler for a particular signal is invoked, that signal is automatically blocked until the handler returns. That means that if two signals of the same kind arrive close together, the second one will be held until the first has been handled. (The handler can explicitly unblock the signal using sigprocmask, if you want to allow more signals of this type to arrive; see Process Signal Mask.)
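One possible reading that reconciles the quoted text with the backtraces: under POSIX the signal mask is kept per thread, so the automatic blocking applies only to the thread that is running the handler. A second SIGINT sent to the process may then be delivered to another thread that does not have it blocked. The sketch below only checks the first half of that, namely that SIGINT is blocked in the handling thread while the handler runs and unblocked again afterwards. It does not reproduce the hang. The names probe_mask, install_mask_probe and sigint_blocked_in_calling_thread are made up for illustration.

```cpp
#include <signal.h>

// -1 until the handler has run, then 1 if SIGINT was blocked inside it, else 0.
static volatile sig_atomic_t g_sigint_blocked_in_handler = -1;

// Reports whether SIGINT is in the calling thread's signal mask.
// On Linux sigprocmask() acts on the calling thread; pthread_sigmask() is the
// portable spelling in a multithreaded program.
bool sigint_blocked_in_calling_thread() {
    sigset_t current;
    sigemptyset(&current);
    // A null second argument queries the mask without changing it.
    sigprocmask(SIG_BLOCK, nullptr, &current);
    return sigismember(&current, SIGINT) == 1;
}

// Hypothetical handler: records the mask state seen from inside the handler.
extern "C" void probe_mask(int /*signo*/) {
    g_sigint_blocked_in_handler = sigint_blocked_in_calling_thread() ? 1 : 0;
}

// Installs the probe for SIGINT without SA_NODEFER, so the default
// "block the signal while its handler runs" behaviour applies.
bool install_mask_probe() {
    struct sigaction sa {};
    sa.sa_handler = probe_mask;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGINT, &sa, nullptr) == 0;
}
```

After install_mask_probe() and raise(SIGINT), g_sigint_blocked_in_handler should be 1, while sigint_blocked_in_calling_thread() called outside the handler returns false. Other threads keep their own masks throughout, which would be consistent with <signal handler called> showing up in both thread 1 and thread 2.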