
DmdAnal: Do not only preserve precise exceptions, but also externally visible side-effects

Consider

import Control.Concurrent.MVar

data Box = Box ()

f :: Box -> Int -> (Box -> Int -> Int) -> MVar Int -> IO a
f x@(Box _) n g var = do
  let m = g x n
  putMVar var m -- substitute with any side-effect
  f x (n+1) g var

main :: IO a
main = newEmptyMVar >>= f (Box (error "boom")) 100 (\_ n -> n)

This is a play on #20111. If you compile this piece of code with -O1, main is optimised into an immediate crash, swallowing the side-effecting loop. With -O0, the putMVar writes can be observed from concurrent threads (say).
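To make the unsoundness concrete, here is a hand-written sketch (an approximation, not GHC's literal -O1 Core output) of what the program effectively becomes: since `f` never returns, demand analysis treats it as diverging and considers it safe to evaluate its arguments hyperstrictly, so forcing the `Box` field collapses `main` to the error call and the side-effecting loop vanishes.

```haskell
-- Sketch of the effective result of the unsound optimisation
-- (assumption: hand-written, not GHC's actual output).
main :: IO a
main = error "boom"  -- the putMVar loop has been discarded entirely
```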

Clearly, this optimisation is unsound given any semantics that incorporates multi-threading and assuming that the putMVar write will eventually be flushed to shared memory.

It is also easy to fix, and I did so in ce64b397. But that led to #17653 (closed), where we regressed real-world code that assumed that a side-effecting memory write is strict in its continuation. Of course it can't be: if I replace putMVar with, e.g., a call to writeArray (the particular situation in #17653 (closed) involved a call to writeWord8OffAddr#), I still expect to get an infinite loop that writes to the array, rather than an instantaneous crash without any write.
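For illustration, here is a hedged sketch of that variant, with `writeArray` from `Data.Array.IO` standing in for the side effect (the `Box`/`f` setup repeats the example above; the single-cell array is an arbitrary choice):

```haskell
import Data.Array.IO (IOArray, newArray, writeArray)

data Box = Box ()

f :: Box -> Int -> (Box -> Int -> Int) -> IOArray Int Int -> IO a
f x@(Box _) n g arr = do
  let m = g x n
  writeArray arr 0 m  -- memory write; must not be swallowed by a crash
  f x (n + 1) g arr

main :: IO a
main = do
  arr <- newArray (0, 0) 0  -- single-cell array in place of the MVar
  f (Box (error "boom")) 100 (\_ n -> n) arr
```

The expectation is the same as with putMVar: an infinite loop of array writes, not an eager crash on the error inside the Box.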

Simon writes:

I agree that there is a tension between compiling efficient code and being semantically predictable.

We are in a terrible place here where GHC optimises unsoundly, but it appears that real-world code relies on that unsound optimisation.

Simon also says

It's clearly a bit of a tension, but if we document it carefully (including why -- poster-child examples of an inner loop where we want strictness analysis to work), I think it'll be fine.

I think that we should only do this change when we can monitor its impact on downstream libraries like bytestring, vector or base64-bytestring.
