Skip to content

Representational polymorphism of catch# is unsound

While looking at the failure of UnliftedWeakPtr in !8597 (closed) I realized that our use of RuntimeRep polymorphism in operations like catch# and keepAlive# is unsound.

Minimal reproducer

The following program segmentation faults on master with both the NCG and unregisterised backends:

{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}

import GHC.IO
import GHC.Exts

main :: IO ()
main = do
    r <- IO $ \s0 ->
        case catch# io handler s0 of
           (# s1, (# n0,n1,n2,n3,n4,n5,n6,n7 #) #) -> (# s1, (n0,n1,n2,n3,n4,n5,n6,n7) #)
    print r
    print "hello world"

type Result = (# Int, Int, Int, Int, Int, Int, Int, Int #)

{-# NOINLINE io #-}
io :: State# RealWorld -> (# State# RealWorld, Result #)
io s = (# s, (# n,n,n,n,n,n,n,n #) #)
  where
    n = 1

{-# NOINLINE handler #-}
handler e = undefined

What goes wrong?

To see why this fails, consider the following program:

inner :: State# RealWorld -> (# State# RealWorld, Int, Int #)
inner =
    ...
      (# s, a, b #)

do_it =
    ... keepAlive# x s inner

Recall that keepAlive# has the type:

keepAlive# :: forall (a_lev :: Levity) (a :: TYPE (BoxedRep lev))
                     (r_rep :: RuntimeRep) (r :: TYPE r_rep).
              a -> State# RealWorld -> (State# RealWorld -> r) -> r

We believed that the representationally polymorphic result type would be safe (because we merely return the result, we needn't ever bind it) and useful (since it allows the user to, e.g., unbox an Int result.

However, if we think about the operational behavior of this program, we see that there is trouble lurking. In particular, imagine that as we enter the keepAlive# call the state of the stack (when compiled unregisterised) is as follows:

    ┌──────────────────┐  ◄──── Sp
    │ inner            │
    ├──────────────────┤
    │ K                │
    └──────────────────┘

where

  • K represents the enclosing context (which expects to be returned a (# State# RealWorld, Int, Int #), which will be represented as two Int pointers at the head of the stack)
  • inner is the continuation argument being passed to keepAlive#
  • the first argument, x, is passed in `R1

When keepAlive# calls inner we will have:

 ┌──────────────────┐  ◄──── Sp
 │ keepAlive_frame  │
 │ x                │
 ├──────────────────┤
 │ K                │
 └──────────────────┘

Eventually inner will converge to a result, resulting in the stack:

 ┌──────────────────┐  ◄──── Sp
 │ a                │
 │ b                │
 ├──────────────────┤
 │ keepAlive_frame  │
 │ x                │
 ├──────────────────┤
 │ K                │
 └──────────────────┘

where a and b are inner's two results.

inner will then return to keepAlive_frame's entry code. However, the entry code has no way of knowing how many results are at the head of the stack.

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information