
winio: 18382 fix non-threaded manager non-termination issue

The issues with the pipes and the non-threaded runtime turn out to not have anything to do with pipes.

Essentially there is a deadlock in the scheduler, caused by a faulty assumption: we assumed that if you issue an I/O action, that action would be the last TSO being executed. This isn't true, because green threads in the non-threaded RTS can be scheduled differently depending on whether the time slice for the action has run out.

The deadlock happens because we:

  1. Issue an I/O operation and block on the I/O
  2. Execution of a different TSO continues on the same capability until it blocks on an MVar that is indirectly waiting on the I/O operation.
  3. The scheduler runs out of work and tries to scavenge work. Because the last TSO is blocked on an MVar, the scheduler concludes that we have deadlocked.
  4. We throw an exception on the main capability in an attempt to force the MVar to unblock.
  5. The exception is thrown on the main thread, so the I/O manager catches it too and proceeds to cancel any I/O request on the main thread.
  6. The cancellations are performed so the operation is no longer pending.
  7. However, the main scheduler loop now continues and calls awaitEvent.

Because of threading non-determinism here one of two things can happen:

8.a) The cancellation reaches the I/O manager before the awaitEvent and we print out <<loop>>.

8.b) The awaitEvent gets processed before the cancellation and we block indefinitely.

The fix for this is quite simple: move the check for pending I/O operations so that it runs before the check for deadlocks.
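
To illustrate why the ordering matters, here is a toy model in Python. It is not the RTS code: `ToyScheduler`, `schedule_step`, `detect_deadlock` and the returned strings are hypothetical names made up for this sketch, and `await_event` only loosely stands in for awaitEvent. It assumes a single outstanding I/O request and one TSO blocked on an MVar that the request will eventually fill.

```python
from dataclasses import dataclass


@dataclass
class ToyScheduler:
    """Hypothetical stand-in for the state of a capability with an empty run queue."""
    pending_io: int = 1       # outstanding I/O requests
    blocked_on_mvar: int = 1  # TSOs blocked on an MVar fed by that I/O

    def await_event(self) -> str:
        # Pretend the I/O completes, which fills the MVar and wakes the TSO.
        self.pending_io -= 1
        self.blocked_on_mvar -= 1
        return "io-completed"

    def detect_deadlock(self):
        # Naive check: any TSO blocked on an MVar with nothing left to run
        # is treated as a deadlock.
        return "deadlock-reported" if self.blocked_on_mvar > 0 else None


def schedule_step(s: ToyScheduler, check_io_first: bool) -> str:
    """One pass of the toy scheduler once it has run out of runnable work."""
    if check_io_first:
        # Fixed ordering: service pending I/O before considering deadlock.
        if s.pending_io > 0:
            return s.await_event()
        return s.detect_deadlock() or "idle"
    # Old ordering: deadlock detection runs while I/O is still pending.
    verdict = s.detect_deadlock()
    if verdict is not None:
        return verdict
    if s.pending_io > 0:
        return s.await_event()
    return "idle"


old = schedule_step(ToyScheduler(), check_io_first=False)
new = schedule_step(ToyScheduler(), check_io_first=True)
print(old, new)
```

With the old ordering the toy scheduler reports a deadlock even though the pending I/O would have unblocked the MVar; with the check moved first, the I/O is serviced and no deadlock is reported.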

Fixes #18382 (closed)

Please backport to all active GHC 9.x branches. This fixes the last issue with winio and moves us one step closer to #20255.

/cc @bgamari @AndreasK

Edited by Tamar Christina
