fixST is a bit wrong

changed milestone to %8.6.1

changed weight to 5

added Tbug Trac import core libraries labels

Wow, good catch. I suppose we should do what fixIO does, since this is clearly wrong. But maybe there's a way to do it with IORef that would be cheaper than MVar? (we'd need benchmarks to check, though)

assigned to @treeowl

I raised this issue on the libraries list. Levent Erkök was first to respond. He recognized that this bug leads to a violation of the MonadFix laws, and surprising behavior, but that it was harmless enough to get away with just documenting it. Then Arseniy Alekseyev came up with a more frightening example, in which this feature of fixST turns safe code using unsafe operations into unsafe code. That was enough to tilt Edward Kmett toward fixing.

Replying to [ticket:15349#comment:156489 simonmar]:

Wow, good catch. I suppose we should do what fixIO does, since this is clearly wrong. But maybe there's a way to do it with IORef that would be cheaper than MVar? (we'd need benchmarks to check, though)

I don't think there is. I believe that runs into the same problem as unsafeFixIO: trouble with multiple threads. In fixST f, f may spark computations (runEval, runPar, etc.) that demand the final result; those should block until the result is available. Just like fixIO, what we really want here is an IVar, and we don't (yet?) have those natively.

The only alternative I see for now is to use noDuplicate# to fix up the implementation, like maybe

fixST :: (a -> ST s a) -> ST s a
fixST k = ST $ \ s ->
    let ans       = liftST (noDuplicate >> k r) s
        STret _ r = ans
    in
    case ans of STret s' x -> (# s', x #)

That implementation is also available for fixIO, of course. I don't have a terribly good sense of how it would compare.

Update: I cannot replicate Arseniy's example with GHC 8.4.2. He says that he's observing it on GHC 8.2.2; but I don't have that version lying around to replicate.

With 8.4.2, Arseniy's example produces 0 with no-eager-blackholing, and <<loop>> with eager-blackholing. (Note that you also have to pass -O2). So it's about the same as with David's original example in terms of brokenness.

So, something got fixed between 8.4.2 and 8.2.2 that the seg-fault disappeared. This doesn't mean Arseniy's example can't be replicated on 8.4.2. I think the next step should be to replicate that on latest GHC and decide from there. (Or, if we can't replicate, figure out what got it fixed.)

Trac metadata

Trac field	Value
CC	simonmar → lerkok, simonmar

I don't have GHC 8.4.2. I tried with 8.4.3 and it does segfault, both with and without -O2.

Trac metadata

Trac field	Value
Differential revisions	- → D4948

I've put up a differential to fix strict ST. Unfortunately, all my attempts thus far to fix lazy ST have been extremely fragile. I don't really understand what's going on well enough to see exactly what the trouble is, let alone how to fix it. My lazy ST test case:

foo :: ST s Int
foo = do
  ref <- strictToLazyST (newSTRef True)
  fixST $ \res -> strictToLazyST $ do
    x <- readSTRef ref
    if x
      then do
        writeSTRef ref False
        return $! (res + 5)
      else return 10

main = print $ runST foo

mentioned in commit 5a49651f

If we can merge this, that would be nice. We don't yet have a fix for lazy ST, and I honestly don't know what that should look like. It ties my brain up in knots.

added 1 deleted label

Let's try to think this lazy thing through a little bit. In the general case, we have something like

runST $ (m >>= \x -> fixST (f x)) >>= g

There are lots of possible ways the dependencies could go. g may demand a value and/or a state token. Depending on that, f x may demand a value and/or a state token.

If f x does not end up demanding a state token, then that means it doesn't mutate anything, so it's safe to duplicate its evaluation (but also perfectly fine to ensure that doesn't happen, of course). On the flip side, the MVar solution can easily fall apart here; if we read a value from an MVar, we have to make sure to run an action to fill it. Furthermore, we can't just stick MVar creation into the state thread, because that will demand a state token that we're not allowed to demand.

If f x demands a state token, then it might perform mutation, and therefore mustn't be duplicated.

I took a deep dive into lazy ST and came up with an absurdly inefficient "reference implementation" that I believe should be extremely correct. How inefficient? The monadic bind creates three MVars and two green threads! I wonder if someone has a good idea about how to turn that into something both correct and efficient. The idea here is to turn each suspended computation into its very own green thread, and to use MVars to communicate between them. One MVar requests a state token, while another is used to transfer one.

{-# language MagicHash, UnboxedTuples, GADTs, RankNTypes, BangPatterns #-}
module Control.Monad.ST.Lazy.Imp where
import qualified Control.Monad.ST as ST
import qualified GHC.ST as ST

import GHC.IO
import GHC.Exts
import Control.Concurrent.MVar
import Control.Monad
import Control.Applicative
import Control.Concurrent

infixl 1 :>>=

data ST s a where
  Pure :: a -> ST s a
  StrictToLazyST :: ST.ST s a -> ST s a
  (:>>=) :: ST s a -> (a -> ST s b) -> ST s b
  FixST :: (a -> ST s a) -> ST s a

strictToLazyST :: ST.ST s a -> ST s a
strictToLazyST = StrictToLazyST

instance Functor (ST s) where
  fmap = liftM
instance Applicative (ST s) where
  pure = Pure
  (<*>) = ap
  liftA2 = liftM2
instance Monad (ST s) where
  (>>=) = (:>>=)

data State s = State (State# s)

-- We don't care about thread IDs
forkIO_ :: IO () -> IO ()
forkIO_ m = void (forkIO m)

run
     -- Request and receive a state token
  :: MVar () -> MVar (State RealWorld)
     -- Wait for a request and provide a state token
  -> MVar () -> MVar (State RealWorld)
  -> ST RealWorld a -> IO a

run !s_in !m_in !s_out !m_out (Pure a) = do
  forkIO_ $ do
    readMVar s_out                  -- If we need the state,
    _ <- tryPutMVar s_in ()         -- request the state
    takeMVar m_in >>= putMVar m_out -- and transfer it
    pure ()
  pure a

run s_in m_in _s_out m_out (StrictToLazyST (ST.ST m)) = do
  putMVar s_in () -- Request the state
  State s <- takeMVar m_in -- Get the state
  case m s of
    (# s', a #) -> do
      putMVar m_out (State s') -- Put the new state
      pure a

-- This is the hard case. We have to 'run' @n@ if we need
-- *either* its state token *or* its value.
run s_in m_in s_out m_out (n :>>= f) = do
  sn_out <- newEmptyMVar
  n_out <- newEmptyMVar
  resv <- newEmptyMVar
  -- run_it gets filled if we need to run @n@, either for its
  -- value or for its state.
  run_it <- newEmptyMVar
  forkIO_ $ readMVar sn_out >> tryPutMVar run_it () >> return ()
  forkIO_ $ do
    readMVar run_it
    res <- run s_in m_in sn_out n_out n
    putMVar resv res
  run sn_out n_out s_out m_out
    (f $ unsafeDupablePerformIO $
           tryPutMVar run_it () >> readMVar resv)

run s_in m_in s_out m_out (FixST f) = do
  resv <- newEmptyMVar
  res <- run s_in m_in s_out m_out (f $ unsafeDupablePerformIO $ readMVar resv)
  putMVar resv res
  pure res

runST :: (forall s. ST s a) -> a
runST st = runRW# $ \s ->
  let ss = State s
  in case unIO
    (do
       s_in <- newEmptyMVar
       m_in <- newMVar ss
       s_out <- newEmptyMVar
       m_out <- newEmptyMVar
       run s_in m_in s_out m_out st) s of
    (# _, a #) -> a

Note: the crazy implementation above never actually needs to takeMVar; readMVar should be sufficient. So IVars would be sufficient. But to get close to something practical, we need to avoid forkIO. Maybe I can do that, but I don't see a way as yet.

removed 1 deleted label

Trac field	Value
Version	8.5
Type	Bug
TypeOfFailure	OtherFailure
Priority	normal
Resolution	Unresolved
Component	Core Libraries
Test case
Differential revisions
BlockedBy
Related
Blocking
CC	simonmar
Operating system
Architecture

fixST is a bit wrong

Child items ...

Activity