Application locks up after getting `epollControl: does not exist` error

Summary

A multi-threaded program can cause an exception (epollControl: does not exist (No such file or directory)) inside the IO manager that leaves it in a broken state.

It appears to be a race condition occurring when threadWaitRead and closeFdWith are called simultaneously from different threads (I am not saying that doing so is a particularly good idea in the first place - but it can happen accidentally, for example due to this bug in http-client).

Despite the original issue being caused by a third-party library, I still opted to open this issue as well because it is not obvious from the documentation that this is an expected failure mode. The closeFdWith documentation even explicitly states

Close a file descriptor in a concurrency-safe way

which clearly is not the case.
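
For context, closeFdWith (exported from GHC.Conc.IO) is meant to deregister a file descriptor from the IO manager before running the supplied close action. A minimal sketch of its intended use on a plain pipe - assuming the unix package (a GHC boot library) for createPipe and closeFd, and not involving the race at all:

```haskell
module Main where

import Control.Exception (IOException, try)
import GHC.Conc.IO (closeFdWith)
import System.Posix.IO (closeFd, createPipe)

main :: IO ()
main = do
  (readEnd, writeEnd) <- createPipe

  -- Deregister each fd from the IO manager, then close it. This is the
  -- "concurrency-safe" path that e.g. network's Sock.close also goes through.
  closeFdWith closeFd readEnd
  closeFdWith closeFd writeEnd

  -- The descriptor is really gone: closing it again fails with EBADF.
  res <- try (closeFd readEnd) :: IO (Either IOException ())
  putStrLn (either (const "already closed") (const "unexpectedly open") res)
```

The documented guarantee is about this deregister-then-close ordering; the issue here is what happens when a threadWaitRead registers the fd concurrently with that sequence.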

Steps to reproduce

I've managed to boil it down to the following reproducer depending on just the network and bytestring packages in addition to base.

What it does is:

  1. Start a local TCP server in a new thread
  2. From the main thread, repeatedly connect to and read from said server, while concurrently calling close on the socket.

I ran it with stack run, so it's compiled with optimizations enabled. Additionally, I passed the -N RTS flag to enable actual multi-threading. EDIT: It appears that with -N2 the reproducer locks up even faster (ran on a 12 core machine).

{-# LANGUAGE NumericUnderscores #-}
module Main where

import Control.Concurrent (forkIO, forkFinally, threadDelay)
import Control.Concurrent.MVar (MVar, newMVar, newEmptyMVar, putMVar, takeMVar)
import Control.Exception (SomeException, SomeAsyncException, Exception (..), catch, throwIO)
import Control.Monad (unless, forever)
import GHC.Conc.IO (closeFdWith)

import qualified Data.ByteString as BS
import qualified Network.Socket as Sock
import qualified Network.Socket.ByteString as SockBS

main :: IO ()
main = do
  serverReady <- newEmptyMVar
  forkIO (server serverReady `catch` \e -> putStrLn $ "Caught in server " <> show (e :: SomeException))
  takeMVar serverReady

  putStrLn "Local server ready"
  forever $ do
    let
      handler e
        | Just as <- fromException e = throwIO (as :: SomeAsyncException)
        | otherwise = putStrLn $ "Caught in client " <> show e

    reader `catch` handler

reader :: IO ()
reader = do
  putStrLn "Connecting reader"
  sock <- Sock.socket Sock.AF_INET Sock.Stream Sock.defaultProtocol
  Sock.connect sock (Sock.SockAddrInet 9999 $ Sock.tupleToHostAddress (127, 0, 0, 1))

  let
    drain = do
      msg <- SockBS.recv sock 1024
      unless (BS.null msg) drain

  readingBarrier <- newEmptyMVar
  forkIO $ do
    takeMVar readingBarrier
    Sock.close sock `catch` \e -> putStrLn $ "Caught in closer: " <> show (e :: SomeException)

  putMVar readingBarrier ()
  drain

server :: MVar () -> IO ()
server ready = do
  listenSock <- Sock.socket Sock.AF_INET Sock.Stream Sock.defaultProtocol
  Sock.bind listenSock (Sock.SockAddrInet 9999 $ Sock.tupleToHostAddress (127, 0, 0, 1))
  Sock.listen listenSock 5

  putMVar ready ()

  forever $ do
    (clientSock, _) <- Sock.accept listenSock
    forkFinally (handleClient clientSock) (\_ -> Sock.close clientSock)

handleClient :: Sock.Socket -> IO ()
handleClient clientSock = do
  let
    loop = do
      _ <- SockBS.send clientSock $ BS.replicate 1024 62
      -- This delay seems to be needed for the reproducer - presumably as otherwise there's always
      -- data ready at the receiving end and it never needs to suspend
      threadDelay 10_000
      loop

  loop

Expected behavior

In terms of the reproducer, the expected behavior is that it just indefinitely prints Connecting reader and then a variation of Caught in client threadWait: invalid argument (Bad file descriptor) or Caught in client Network.Socket.recvBuf: invalid argument (Bad file descriptor).

In terms of the general behavior of the RTS API, the race should be won either by threadWaitRead (which should then be woken up once closeFdWith gets its turn), or by closeFdWith (in which case threadWaitRead should fail immediately).

Actual behavior

What I actually observe is that, after the above print loop has been running for some time, it prints

Caught in client modifyFdOnce: invalid argument (Bad file descriptor)
Connecting reader
Caught in closer: epollControl: does not exist (No such file or directory)

And then it just completely locks up.

Initial research

Debugging this problem in our production application with strace, I found that the epoll_ctl from threadWaitRead and the close called from closeFdWith arrived almost simultaneously, with the close winning the race and causing the epoll_ctl to fail with EBADF.

In such a case, the epoll backend throws (https://gitlab.haskell.org/ghc/ghc/-/blob/ghc-9.0.2-release/libraries/base/GHC/Event/EPoll.hsc#L93-105): it appears to handle only ENOENT, and only in specific cases.

But from my (limited) understanding of that code, the IO manager never expects IO backends to throw, judging by the use of a Bool return value and the check here.

Consequently, when the exception is thrown from there, the cleanup in the else-branch at that location is skipped, which means the invalid file descriptor is left in the table it was added to here, even though it was never successfully registered with epoll.

In the next iteration, this left-over file descriptor gets reused. Once the closer thread tries to call close on the socket, closeFdWith sees a registration in its management tables and thus tries to unregister the file descriptor, which happens here.

But since that table entry stems from the file descriptor's previous incarnation, the current incarnation (the one the IO manager now sees in its table) was never actually registered with the epoll instance - and so we get the epollControl: does not exist (No such file or directory) error when closeFdWith tries to remove it from an epoll instance it was never added to.
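
The table desynchronization described above can be illustrated with a small toy model - this is emphatically not GHC's actual code, just a self-contained sketch of the register-fails-but-table-entry-survives pattern, using Data.Map (from the containers boot library) for the fd table:

```haskell
module Main where

import Control.Exception (SomeException, throwIO, try)
import Data.IORef
import qualified Data.Map.Strict as Map

type Fd = Int

main :: IO ()
main = do
  -- Toy model: the IO manager's callback table, and the set of fds that
  -- were actually handed to epoll successfully.
  table <- newIORef (Map.empty :: Map.Map Fd ())
  epoll <- newIORef ([] :: [Fd])

  -- registerFd-analogue: the fd is added to the table first, then passed to
  -- epoll_ctl. If epoll_ctl throws (e.g. EBADF because a concurrent close
  -- won the race), the cleanup that would drop the table entry is skipped.
  let register fd = do
        modifyIORef' table (Map.insert fd ())
        _ <- throwIO (userError "epoll_ctl: EBADF")  -- simulated failure
        modifyIORef' epoll (fd :)                    -- never reached

  _ <- try (register 42) :: IO (Either SomeException ())

  -- Later the fd number is reused; a closeFdWith-analogue consults the
  -- table, sees a registration, and would issue EPOLL_CTL_DEL for an fd
  -- that epoll never knew about, yielding ENOENT.
  stale   <- Map.member 42 <$> readIORef table
  inEpoll <- elem 42 <$> readIORef epoll
  putStrLn ("table says registered: " ++ show stale)
  putStrLn ("actually in epoll set: " ++ show inEpoll)
```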

This error surfaces at a very inconvenient place, between a series of takeMVars and putMVars, so the matching putMVar is skipped - which explains the lock-up: subsequent calls into the IO manager block forever trying to take those now-empty MVars.
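
To illustrate why a stray exception in that region is fatal, here is a small self-contained sketch (plain base, not the IO manager's actual code) of the hazardous take/put pattern next to an exception-safe alternative:

```haskell
module Main where

import Control.Concurrent.MVar
  (modifyMVar_, newMVar, putMVar, takeMVar, tryTakeMVar)
import Control.Exception (SomeException, throwIO, try)

main :: IO ()
main = do
  -- The hazardous pattern: an exception raised between takeMVar and the
  -- matching putMVar leaves the MVar permanently empty, so every later
  -- taker blocks forever - the lock-up described above.
  lock <- newMVar ()
  _ <- try (do () <- takeMVar lock
               _  <- throwIO (userError "epollControl: does not exist")
               putMVar lock ())
       :: IO (Either SomeException ())
  empty <- tryTakeMVar lock
  print empty        -- Nothing: the state is lost, takeMVar would hang

  -- An exception-safe variant: modifyMVar_ restores the MVar's contents
  -- if the body throws.
  lock' <- newMVar ()
  _ <- try (modifyMVar_ lock' (\_ -> throwIO (userError "boom")))
       :: IO (Either SomeException ())
  stillThere <- tryTakeMVar lock'
  print stillThere   -- Just (): subsequent users are unaffected
```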

I hope that helps!

Environment

  • GHC version used: 9.0.2 (but also observed this in a (closed-source) production application in 8.10.7)

Optional:

  • Operating System: Ubuntu 20.04
  • System Architecture: x86-64
Edited by Fabian Thorand