When this is compiled with -O -feager-blackholing, it produces a <<loop>> exception as expected. When it's compiled with -O0 or with -fno-eager-blackholing, it prints 15.
Should we reimplement fixST to do something like what fixIO does? Or should we consider this sometimes-lost bottom tolerable?
Trac metadata
Trac field
Value
Version
8.5
Type
Bug
TypeOfFailure
OtherFailure
Priority
normal
Resolution
Unresolved
Component
Core Libraries
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
simonmar
Operating system
Architecture
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information
Wow, good catch. I suppose we should do what fixIO does, since this is clearly wrong. But maybe there's a way to do it with IORef that would be cheaper than MVar? (we'd need benchmarks to check, though)
I raised this issue on the libraries list. Levent Erkök was first to respond. He recognized that this bug leads to a violation of the MonadFix laws, and surprising behavior, but that it was harmless enough to get away with just documenting it. Then Arseniy Alekseyev came up with a more frightening example, in which this feature of fixST turns safe code using unsafe operations into unsafe code. That was enough to tilt Edward Kmett toward fixing.
Wow, good catch. I suppose we should do what fixIO does, since this is clearly wrong. But maybe there's a way to do it with IORef that would be cheaper than MVar? (we'd need benchmarks to check, though)
I don't think there is. I believe that runs into the same problem as unsafeFixIO: trouble with multiple threads. In fixST f, f may spark computations (runEval, runPar, etc.) that demand the final result; those should block until the result is available. Just like fixIO, what we really want here is an IVar, and we don't (yet?) have those natively.
Update: I cannot replicate Arseniy's example with GHC 8.4.2. He says that he's observing it on GHC 8.2.2; but I don't have that version lying around to replicate.
With 8.4.2, Arseniy's example produces 0 with no-eager-blackholing, and <<loop>> with eager-blackholing. (Note that you also have to pass -O2). So it's about the same as with David's original example in terms of brokenness.
So, something got fixed between 8.4.2 and 8.2.2 that the seg-fault disappeared. This doesn't mean Arseniy's example can't be replicated on 8.4.2. I think the next step should be to replicate that on latest GHC and decide from there. (Or, if we can't replicate, figure out what got it fixed.)
I've put up a differential to fix strictST. Unfortunately, all my attempts thus far to fix lazyST have been extremely fragile. I don't really understand what's going on well enough to see exactly what the trouble is, let alone how to fix it. My lazy ST test case:
If we can merge this, that would be nice. We don't yet have a fix for lazy ST, and I honestly don't know what that should look like. It ties my brain up in knots.
Let's try to think this lazy thing through a little bit. In the general case, we have something like
runST$(m>>=\x->fixST(fx))>>=g
There are lots of possible ways the dependencies could go. g may demand a value and/or a state token. Depending on that, f x may demand a value and/or a state token.
If f x does not end up demanding a state token, then that means it doesn't mutate anything, so it's safe to duplicate its evaluation (but also perfectly fine to ensure that doesn't happen, of course). On the flip side, the MVar solution can easily fall apart here; if we read a value from an MVar, we have to make sure to run an action to fill it. Furthermore, we can't just stick MVar creation into the state thread, because that will demand a state token that we're not allowed to demand.
If f x demands a state token, then it might perform mutation, and therefore mustn't be duplicated.
I took a deep dive into lazy ST and came up with an absurdly inefficient "reference implementation" that I believe should be extremely correct. How inefficient? The monadic bind creates three MVars and two green threads! I wonder if someone has a good idea about how to turn that into something both correct and efficient. The idea here is to turn each suspended computation into its very own green thread, and to use MVars to communicate between them. One MVar requests a state token, while another is used to transfer one.
{-# language MagicHash, UnboxedTuples, GADTs, RankNTypes, BangPatterns #-}moduleControl.Monad.ST.Lazy.ImpwhereimportqualifiedControl.Monad.STasSTimportqualifiedGHC.STasSTimportGHC.IOimportGHC.ExtsimportControl.Concurrent.MVarimportControl.MonadimportControl.ApplicativeimportControl.Concurrentinfixl1:>>=dataSTsawherePure::a->STsaStrictToLazyST::ST.STsa->STsa(:>>=)::STsa->(a->STsb)->STsbFixST::(a->STsa)->STsastrictToLazyST::ST.STsa->STsastrictToLazyST=StrictToLazySTinstanceFunctor(STs)wherefmap=liftMinstanceApplicative(STs)wherepure=Pure(<*>)=apliftA2=liftM2instanceMonad(STs)where(>>=)=(:>>=)dataStates=State(State#s)-- We don't care about thread IDsforkIO_::IO()->IO()forkIO_m=void(forkIOm)run-- Request and receive a state token::MVar()->MVar(StateRealWorld)-- Wait for a request and provide a state token->MVar()->MVar(StateRealWorld)->STRealWorlda->IOarun!s_in!m_in!s_out!m_out(Purea)=doforkIO_$doreadMVars_out-- If we need the state,_<-tryPutMVars_in()-- request the statetakeMVarm_in>>=putMVarm_out-- and transfer itpure()purearuns_inm_in_s_outm_out(StrictToLazyST(ST.STm))=doputMVars_in()-- Request the stateStates<-takeMVarm_in-- Get the statecasemsof(#s',a#)->doputMVarm_out(States')-- Put the new statepurea-- This is the hard case. We have to 'run' @n@ if we need-- *either* its state token *or* its value.runs_inm_ins_outm_out(n:>>=f)=dosn_out<-newEmptyMVarn_out<-newEmptyMVarresv<-newEmptyMVar-- run_it gets filled if we need to run @n@, either for its-- value or for its state.run_it<-newEmptyMVarforkIO_$readMVarsn_out>>tryPutMVarrun_it()>>return()forkIO_$doreadMVarrun_itres<-runs_inm_insn_outn_outnputMVarresvresrunsn_outn_outs_outm_out(f$unsafeDupablePerformIO$tryPutMVarrun_it()>>readMVarresv)runs_inm_ins_outm_out(FixSTf)=doresv<-newEmptyMVarres<-runs_inm_ins_outm_out(f$unsafeDupablePerformIO$readMVarresv)putMVarresvrespureresrunST::(foralls.STsa)->arunSTst=runRW#$\s->letss=StatesincaseunIO(dos_in<-newEmptyMVarm_in<-newMVarsss_out<-newEmptyMVarm_out<-newEmptyMVarruns_inm_ins_outm_outst)sof(#_,a#)->a
Note: the crazy implementation above never actually needs to takeMVar; readMVar should be sufficient. So IVars would be sufficient. But to get close to something practical, we need to avoid forkIO. Maybe I can do that, but I don't see a way as yet.