Fix IO manager state corruption through concurrent closing of file descriptors

If an IO manager backend throws, it will not actually have registered the file descriptor. By that point, however, the IO manager state has already been updated to assume the file descriptor is being tracked, which leads to errors and an eventual deadlock down the line, as documented in issue #21969.

The fix for this is to undo the IO manager state change in case the backend throws (just as we already do when the backend signals that the file type is not supported). The exception then bubbles up to user code.

That way we make sure that

  1. the bookkeeping state of the IO manager is consistent with the actions taken by the backend, even in the presence of unexpected failures, and
  2. the error is not silent but visible to user code, making failures easier to debug.
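
To illustrate the idea, here is a minimal sketch of the rollback pattern. It is not the actual GHC code: `Fd`, `Tracked`, `Backend` and `registerFd` are hypothetical stand-ins, the bookkeeping is reduced to a plain list of descriptors, and asynchronous exceptions and masking are ignored.

```haskell
import Control.Exception (IOException, onException, throwIO, try)
import Control.Monad (unless)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)

-- Hypothetical stand-ins; the real manager tracks far more than a list of fds.
type Fd = Int
type Tracked = IORef [Fd]

-- Hypothetical backend: True means registered, False means the file type
-- is not supported. It may also throw.
type Backend = Fd -> IO Bool

registerFd :: Tracked -> Backend -> Fd -> IO Bool
registerFd tracked backend fd = do
  -- Record the descriptor before asking the backend.
  atomicModifyIORef' tracked (\fds -> (fd : fds, ()))
  let rollback = atomicModifyIORef' tracked (\fds -> (filter (/= fd) fds, ()))
  -- If the backend throws, undo the bookkeeping; onException then rethrows.
  ok <- backend fd `onException` rollback
  -- Also undo the bookkeeping when the file type is not supported.
  unless ok rollback
  pure ok

-- A backend that always fails, standing in for an unexpected backend error.
failingBackend :: Backend
failingBackend _ = throwIO (userError "backend failure")

main :: IO ()
main = do
  tracked <- newIORef []
  r <- try (registerFd tracked failingBackend 3) :: IO (Either IOException Bool)
  fds <- readIORef tracked
  putStrLn (either (const "exception reached caller") show r)  -- exception reached caller
  print (3 `elem` fds)                                         -- False
```

In this sketch the exception still reaches the caller, but the descriptor is no longer listed as tracked afterwards, which corresponds to the two points above.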

I initially intended to submit the reproducer from #21969 as a separate merge request to https://gitlab.haskell.org/ghc/head.hackage, because it relies on the network package and extracting the relevant parts would be rather involved. However, since the reproducer corrupts the IO manager state and the tests in that repository all run in the same process, I'll instead see if I can get rid of the network dependency by replacing sockets with pipes, and then add the test to the GHC repo after all.

Edited by Fabian Thorand
