Skip to content

Breaking out of a loop with control0# produces incorrect behaviour under optimisation

Summary

When using the native delimited continuation primops to implement various things, I found that some sanity tests for my loop operations failed in extremely mysterious ways. break not only broke out of its parent loop, it also ate nearby IO. I've minimised the misbehaviour down to a non-inlining loop that puts forever between prompt# tag and control0# tag.

Steps to reproduce

Run ghc -O Broken.hs -o exe && ./exe.

Broken.hs

{-# LANGUAGE MagicHash, UnboxedTuples, BlockArguments #-}

import Prelude hiding (break)
import GHC.Exts (PromptTag#, newPromptTag#, prompt#, control0#)
import GHC.IO (IO(..), unIO)
import Control.Monad (forever)

main :: IO ()
main = do
  putStrLn "before"
  broken >>= putStrLn
  putStrLn "after"

broken :: IO String
broken = do
  loop \l -> do
    break l "broken"

{-# NOINLINE loop #-}
loop :: (PromptTag# a -> IO ()) -> IO a
loop f = IO \rw0 -> case newPromptTag# rw0 of
  (# rw1, tag #) -> prompt# tag (unIO (forever (f tag))) rw1

break :: PromptTag# a -> a -> IO b
break tag x = IO (control0# tag \_ rw1 -> (# rw1, x #))

Or: ghc -O BrokenST.hs -o exe && ./exe.

BrokenST.hs

{-# LANGUAGE MagicHash, UnboxedTuples, UnliftedNewtypes, DataKinds
           , RoleAnnotations, BlockArguments
#-}

import Prelude hiding (break)
import GHC.Exts
  (State#, ZeroBitType, PromptTag#, newPromptTag#, prompt#, control0#)
import GHC.ST (ST(..), runST)
import Unsafe.Coerce (unsafeCoerce#)
import Control.Monad (forever)

main :: IO ()
main = do
  putStrLn "before"
  putStrLn broken
  putStrLn "after"

broken :: String
broken = runST do
  loop \l -> do
    break l "broken"

{-# NOINLINE loop #-}
loop :: (Tag s a -> ST s ()) -> ST s a
loop f = newTag \tag -> prompt tag do
  forever (f tag)

break :: Tag s a -> a -> ST s b
break tag x = control0 tag \_ -> pure x


-- Drag the primops into ST kicking and screaming

type role Tag nominal representational
newtype Tag s a = Tag (PromptTag# a)

newTag :: (Tag s a -> ST s r) -> ST s r
newTag k = fromPrim \rw -> case newPromptTag# rw of
  (# rw', tag #) -> toPrim (k (Tag tag)) rw'

prompt :: Tag s a -> ST s a -> ST s a
prompt (Tag tag) = primitively (prompt# tag)

control0
  :: Tag s a
  -> ((ST s b -> ST s a) -> ST s a)
  -> ST s b
control0 (Tag tag) withk
  = fromPrim $ control0# tag (toPrim . withk . primitively)

type Prim s a = State# s -> (# State# s, a #)

-- The only unsafe coercions we perform are between ZeroBitTypes.
-- IIUC, this shouldn't ever produce a segfault.
rethreadPrim :: Prim rw1 a -> Prim rw2 a
rethreadPrim primOp = \t -> case primOp (unsafeCoerceZeroBit t) of
  (# t', a #) -> (# unsafeCoerceZeroBit t', a #)
 where
  unsafeCoerceZeroBit :: forall (a :: ZeroBitType) (b :: ZeroBitType). a -> b
  unsafeCoerceZeroBit = unsafeCoerce#

toPrim :: ST s a -> Prim rw a
toPrim = rethreadPrim . unST
 where
  unST (ST prim) = prim

fromPrim :: Prim rw a -> ST s a
fromPrim = ST . rethreadPrim

primitively :: (Prim rw a -> Prim rw b) -> ST s a -> ST s b
primitively hoPrimOp = fromPrim . hoPrimOp . toPrim

Expected behavior

The first reproducer Broken.hs will only putStrLn "before" then exit without reporting an error.

However, it's greatly simplified from its original form wherein the control operations were constrained to an ST-like monad. Previously, it was even more frightening: to translate (as it were) broken would have been a pure String produced by ~runST, yet putStrLn broken would still eat itself and "after" too.

I attempted to reconstruct these circumstances in a second reproducer BrokenST.hs but found something new: instead of silently eating some IO, it segfaults.

In both cases the correct behaviour can be observed by compiling without optimisation or allowing loop to inline.

Environment

  • GHC version used: 9.6.6, 9.8.2, 9.10.1

Optional:

  • Operating System: NixOS
  • System Architecture: x86_64
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information