STM: Locking approach during transaction validation can cause lack of progress
Summary
The current STM implementation can result in situations where even two read-only transactions running in parallel offer no guarantee of the transactions making progress. (However nothing suggests this behaviour is limited to read-only transactions).
This is one of the STM issues surfaced by #24142
Steps to reproduce
This can reproduced on these files:
stm-starvation.cabal :
cabal-version: 3.0
name: stm-starvation
version: 0.1.0.0
maintainer: Xander van der Goot
build-type: Simple
extra-doc-files: README
common opt
ghc-options:
-Wall -Wpartial-fields -O2 -rtsopts -debug -threaded -Wunused-packages
"-with-rtsopts=-N"
build-depends:
async,
base,
stm,
executable map
import: opt
main-is: Map.hs
hs-source-dirs: app
default-language: Haskell2010
build-depends:
stm-containers,
list-t
Map.hs:
{-# LANGUAGE NumericUnderscores #-}
import Control.Concurrent (ThreadId, myThreadId)
import qualified Control.Concurrent.Async as Async
import Control.Concurrent.STM (STM, atomically)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import GHC.IO (unsafePerformIO)
import qualified ListT
import qualified StmContainers.Map as STMMap
import System.IO (hFlush, stdout)
main :: IO ()
main = do
let l = [(x,x) | x <- [1 :: Integer .. 100_000]]
stmmap <- STMMap.newIO
atomically $ mapM_ (\(x, y) -> STMMap.insert x y stmmap) l
print "Initialized the STM Map."
hFlush stdout
-- Show that a read from a single thread does succeed
readMap stmmap
-- read the stm map from two threads concurrently
foldr1 Async.concurrently_ [readMap stmmap, readMap stmmap]
-- | irrevocable counter to keep track of transaction restart.
restartCounter :: ThreadId -> IORef Integer -> STM ()
restartCounter threadId counter = unsafePerformIO $ do
r <- readIORef counter
print (threadId, r)
hFlush stdout
modifyIORef' counter (+ 1)
return (return ())
readMap :: (Show a, Show b) => STMMap.Map a b -> IO ()
readMap stmmap = do
threadId <- myThreadId
counter <- newIORef (0 :: Integer)
_ <- atomically $ do
restartCounter threadId counter
ListT.toList (STMMap.listT stmmap)
return ()
Just run the program and it will take a unreasonable long time to finish, or might never finish depending on the system it's run on and luck.
Expected behavior
Since read-only transactions don't conflict with each other I would have assumed there is a guarantee of progress.
Analysis:
When a running thread returns to the scheduler it will validate its current STM transaction. This involves locking all the TVars in the transaction optimistically. If locking of any of the tvars fails the validation will fail and we will assume it's due to a data conflict and abort the transaction, forcing it to be restarted.
This is problematic if multiple threads running transactions on the same set of tvars return to the scheduler in parallel. Both threads will try to lock the same set of tvars in parallel, guaranteeing one of them fails. Resulting in one of the two transactions being aborted and restarted.
Since it's dependent on timing which of the two transactions get's forcibly restarted, it's possible that both transactions get restarted repeatedly, in the worst case infinitely often preventing any progress.
This for example might look something like this:
Thread1: run_t-> validate -> run_t -> validate -> run_t -> validate fails -> abort -> restart -> run_t -> ..
Thread2: run_t-> validate fails -> abort -> restart -> run_t -> validate -> run_t -> validate -> run_t -> ..
Thread1: validate -> run_t -> validate -> run_t -> ...
Thread2: -> validate fails -> abort -> restart -> ...
Solutions:
I think the underlying issue is that the current algorithm causes transaction validation to fail if a tvar is only temporarily locked, even if the memory is still consistent with the transactions view of memory.
The solution is to allow us to distinguish between locks failing because of inconsistent memory states, and locks failing for "spurious" reasons.
Environment
- GHC version used: Any
Optional:
- Operating System:
- System Architecture: Any SMP