Rethinking touch# primop
NOTE: the discussion has been moved on the wiki: https://gitlab.haskell.org/ghc/ghc/wikis/proposal/with-combinator
All GHC has to control object lifetime is:
touch# :: a -> State# RealWorld -> State# RealWorld
This is often useful when passing GC-managed objects (typically pinned byte arrays) to foreign functions through foreign data structures. For instance, the POSIX vectored I/O interface exposes this interface:
struct iovec {
void *iov_base; /* Starting address */
size_t iov_len; /* Number of bytes to transfer */
};
ssize_t readv(int fd, struct iovec iovs[iov_count], int iov_count);
We may, for instance, wish to populate iov_base
with a pinned ByteArray
:
data IoVector = IoVector { iov_base :: Addr, iov_len :: CSize }
instance Storable IoVector
foreign import ccall "readv" c_readv
:: CInt -> Ptr IoVector -> CInt -> IO CSize
doRead :: Fd -> [IoVector] -> IO CSize
doRead fd vecs = withArray vecs $ \vecsPtr -> readv fd vecsPtr (length vecs)
read :: Fd -- ^ file descriptor from which to read
-> CSize -- ^ length to read
-> IO ByteArray
read fd len = do
arr <- newPinnedByteArray 42
result <- doRead fd [IoVector (byteArrayContents arr) len]
touch# arr
However, this is dangerously susceptible to being dropped by the simplifier. For instance, if doRead
were change such that the simplifier concluded that it will fail to return (e.g.
because it is of the form forever ...
), then it is tempted to drop the
continuation containing touch#
. This results in the garbage collector
inappropriately freeing arr
, resulting in catastrophe.
It caused #14346 (closed) (allocaBytes
and allocaBytesAligned
) and #17746 (withForeignPtr
).
Mitigation
A way to mitigate the issue is to ensure that functions using touch#
can't be simplified by using NOINLINE
pragmas.
-
#14346 (closed) has been fixed this way in 8.2 and 8.6 by adding a NOINLINE pragma to
allocaBytes[Aligned]
functions (cf 56590db0). -
#17746: adding a NOINLINE pragma to
withForeignPtr
fixes the issue but the price in performance is very high. An alternative is to rewrite as follows so that only the second field of theForeignPtr
is allocated and kept alive (instead of the wholeForeignPtr
). Sadly it still has a huge impact on performance metrics (withForeignPtr
is used a lot, especially inByteString
implementation).
withForeignPtr fo io
= let !(ForeignPtr addr r) = fo
IO fio = io (Ptr addr)
in IO $ \s -> with# r fio s
{-# NOINLINE with# #-}
with# :: a -> (State# RealWorld -> (# State# RealWorld, b #)) -> State# RealWorld -> (# State# RealWorld, b #)
with# a m s =
case m s of
(# s', r #) -> (# touch# a s', r #)
with#
primop?
Fixing the issue properly with a new To fix the issue it has been proposed (#14375 (closed), !2566 (closed)) to make the with#
function above a primop of this form:
with# :: a -> (State# s -> (# State s, r #)) -> State# s -> (# State# s, r #)
which evaluates the r
, ensuring that the a
remains alive throughout evaluation.
If we rewrite the test
example above with with#
, we get:
test :: IO ()
test = do
arr <- newPinnedByteArray 42
with# arr (unIO (doSomething (byteArrayContents arr)))
This construction the compiler can't mangle as there is no continuation to drop.
The naive implementation of !2566 (closed) is suboptimal as it allocates a closure for the continuation. This is strictly worse than the status quo provided by touch#
which is a no-op that tricks the codeGen into keeping the value alive in a register (or spilled on the stack). #16098 proposes a special code generation for with#
, pushing a new kind of stack frame that keeps a value alive for its scope. !2567 is an implementation of this idea.