In STM: Variables only in left branch of orElse can invalidate the right branch transaction
I'm sorry if this is expected behavior; I'm still trying to wrap my head around this. I was surprised to learn that the right branch of an
orElse can be invalidated by changes to the world only visible by the left branch. This might lead to starvation or performance issues for the use-case I was envisioning it for (e.g. several threads trying to take from one of a large set of TMVars in a randomized round-robin fashion).
Here is a really bad example demonstrating current behavior:
import Control.Concurrent import Control.Concurrent.STM import Control.Concurrent.STM.TSem import Control.Monad import System.IO import Debug.Trace main = do hSetBuffering stdout NoBuffering noMansLand <- replicateM 998 $ newTVarIO 0 t0 <- newTVarIO (1::Int) t999 <- newTVarIO (-1) let ts = t0:noMansLand++[t999] done <- atomically $ newTSem 0 forkIO $ atomically $ nestedOrElseMap done ts -- need enough time here for nestedOrElseMap thread above to move past t0 -- in this version, the modifications to t0 force nestedOrElseMap to be restarted forkIO (trace "starting vexing!" $ forever $ atomically $ (modifyTVar' t0 (+1) >> trace "vex" (return ()))) -- in this version nestedOrElseMap causes this transaction to be restarted and never makes progress: --forkIO (atomically (trace "starting vexing!" $ forever $ (modifyTVar' t0 (+1) >> trace "vex" (return ())))) atomically $ waitTSem done putStrLn "No livelock! Did the t0 counter get incremented?: " atomically (readTVar t0) >>= print -- another thread begins modifying head ts after we've moved to the right branch nestedOrElseMap :: TSem -> [TVar Int] -> STM () nestedOrElseMap done ts = trace "nestedOrElseMap starting" $ foldl1 orElse $ map transaction $ zip [(1::Int)..] ts where transaction (cnt,v) = do n <- traceShow cnt $ readTVar v if n < 0 then trace "@" (modifyTVar' v (subtract 1)) >> signalTSem done else retry
I can see it possibly being useful that both branches of an
orElse see the same view of the variables they share, but current behavior seems overzealous in restarting the whole transaction.
Maybe this is an artifact of changes related to #7493 (closed)? Or maybe it's obvious why this behavior has to be the way it is and its just not clicking for me.