Async exceptions ignored and reraised by `catch` still update enclosing thunks

The following program produces a surprising result:

import Control.Concurrent
import Control.Exception
import System.IO.Unsafe

main :: IO ()
main = do
  let io_thunk = unsafePerformIO $
        catch (threadDelay 1000000 *> pure True)
              (\(_ :: ErrorCall) -> pure False)
  eval_thread <- forkIO (evaluate io_thunk *> pure ())
  threadDelay 500000
  killThread eval_thread
  print io_thunk

Intuitively, the use of catch should not affect the meaning of this program, as it only catches ErrorCall exceptions, which will never be raised. However, its presence does affect the program’s result, and quite substantially: it causes the program to terminate with an exception.

$ ghc A.hs
[1 of 2] Compiling Main             ( A.hs, A.o )
[2 of 2] Linking A
$ ./A
A: thread killed

This is quite surprising, as the same program with the catch removed simply prints True and exits successfully.¹

The cause of the bug

This behavior is an unfortunate consequence of the way catch and asynchronous exceptions interact. When an async exception is delivered to a thread, the RTS takes care to suspend the work of any thunks currently being evaluated as it unwinds the stack. However, as soon as the RTS encounters a CATCH_FRAME, the exception is effectively converted to a synchronous exception. From the perspective of the RTS, all such frames are interchangeable, as the catch# primop knows nothing about the Exception class and therefore always catches all exceptions; catch is implemented by simply examining the caught exception and reraising it if it is not of the desired type.

This implementation strategy for catch fundamentally prevents io_thunk from the above program from being suspended upon delivery of an async exception. When the RTS encounters the CATCH_FRAME, it must resume execution from that point, so the unwound portion of the current evaluation context can only be discarded.

Workarounds

Technically, this bug can be avoided by simply uninterruptibly masking asynchronous exceptions whenever catch is used within unsafePerformIO. However, this is a very heavy hammer, and it is not particularly satisfying.

Given sufficiently deep knowledge of both the RTS and the implementation of catch, it is technically possible to implement a more complete workaround. The following variant of the above program behaves as desired:

import Control.Concurrent
import Control.Exception
import System.IO.Unsafe

main :: IO ()
main = do
  let io_thunk = unsafePerformIO $ mask $ \restore -> do
        let body = unsafePerformIO $ threadDelay 1000000 *> pure True
            go = restore $ catches (evaluate body)
                 [ Handler $ \(_ :: ErrorCall) -> pure False
                 , Handler $ \(exn :: SomeAsyncException) -> do
                     myself <- myThreadId
                     throwTo myself exn
                     go
                 ]
        go
  eval_thread <- forkIO (evaluate io_thunk *> pure ())
  threadDelay 500000
  killThread eval_thread
  print io_thunk

The details are remarkably subtle:

The body of the catch must be split into a separate thunk to give the RTS a place to store the suspended evaluation context.
When an async exception is raised, it must be explicitly caught and reraised using throwTo, which effectively suspends the outer thunk. When that thunk is resumed, it must continue by resuming evaluation of the inner thunk.
mask must be used not because async exceptions must be masked, but because they must be unmasked: catch implicitly masks exceptions within each handler, so if exceptions were not explicitly unmasked, they would incorrectly remain masked during the recursive call to go.

In theory, this is not a perfect solution: catching SomeAsyncException is not actually guaranteed to catch all async exceptions. However, it is likely good enough in practice. Still, it is quite complex, and I expect few Haskell programmers would fully understand how it works or why/when it is needed. At the very least, the Control.Exception documentation ought to be clarified to warn about this pitfall.

As of this writing, this difference can only be observed if the program is compiled. If evaluated in GHCi, the program always terminates with an exception whether the catch is present or not, which I’ve reported separately as #24187. ↩

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information