Representational polymorphism of catch# is unsound
While looking at the failure of UnliftedWeakPtr
in !8597 (closed) I realized that our use of RuntimeRep
polymorphism in operations like catch#
and keepAlive#
is unsound.
Minimal reproducer
The following program segmentation faults on master
with both the NCG and unregisterised backends:
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}
import GHC.IO
import GHC.Exts
main :: IO ()
main = do
r <- IO $ \s0 ->
case catch# io handler s0 of
(# s1, (# n0,n1,n2,n3,n4,n5,n6,n7 #) #) -> (# s1, (n0,n1,n2,n3,n4,n5,n6,n7) #)
print r
print "hello world"
type Result = (# Int, Int, Int, Int, Int, Int, Int, Int #)
{-# NOINLINE io #-}
io :: State# RealWorld -> (# State# RealWorld, Result #)
io s = (# s, (# n,n,n,n,n,n,n,n #) #)
where
n = 1
{-# NOINLINE handler #-}
handler e = undefined
What goes wrong?
To see why this fails, consider the following program:
inner :: State# RealWorld -> (# State# RealWorld, Int, Int #)
inner =
...
(# s, a, b #)
do_it =
... keepAlive# x s inner
Recall that keepAlive#
has the type:
keepAlive# :: forall (a_lev :: Levity) (a :: TYPE (BoxedRep lev))
(r_rep :: RuntimeRep) (r :: TYPE r_rep).
a -> State# RealWorld -> (State# RealWorld -> r) -> r
We believed that the representationally polymorphic result type would be safe (because we merely return the result, we needn't ever bind it) and useful (since it allows the user to, e.g., unbox an Int
result.
However, if we think about the operational behavior of this program, we see that there is trouble lurking. In particular, imagine that as we enter the keepAlive#
call the state of the stack (when compiled unregisterised) is as follows:
┌──────────────────┐ ◄──── Sp
│ inner │
├──────────────────┤
│ K │
└──────────────────┘
where
-
K
represents the enclosing context (which expects to be returned a(# State# RealWorld, Int, Int #)
, which will be represented as twoInt
pointers at the head of the stack) -
inner
is the continuation argument being passed tokeepAlive#
- the first argument,
x
, is passed in `R1
When keepAlive#
calls inner
we will have:
┌──────────────────┐ ◄──── Sp
│ keepAlive_frame │
│ x │
├──────────────────┤
│ K │
└──────────────────┘
Eventually inner
will converge to a result, resulting in the stack:
┌──────────────────┐ ◄──── Sp
│ a │
│ b │
├──────────────────┤
│ keepAlive_frame │
│ x │
├──────────────────┤
│ K │
└──────────────────┘
where a
and b
are inner
's two results.
inner
will then return to keepAlive_frame
's entry code. However, the entry code has no way of knowing how many results are at the head of the stack.