Flag for core-to-core pass that adds safe FFI bytearray pinnedness check
Motivation
I recently spent the better part of a day tracking down a mistake I made passing an unpinned byte array to the safe FFI. This kind of mistake is rather difficult to track down, but I have learned my lesson. I'm about to go back and add a flag named assertions to my cabal file that will let me toggle on runtime checks using isByteArrayPinned# and isMutableByteArrayPinned# on anything that uses the safe FFI.
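A minimal sketch of what such a hand-written check might look like. It assumes the cabal flag is wired up to pass -DASSERTIONS through cpp-options; the macro name and the MBA, newPinned, newUnpinned, isPinned and assertPinned helpers are hypothetical names chosen for illustration, not part of any existing library.

```haskell
{-# LANGUAGE CPP #-}
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}

import Control.Exception (ErrorCall (..), throwIO)
import GHC.Exts
  ( Int (I#), MutableByteArray#, RealWorld
  , isMutableByteArrayPinned#, isTrue#
  , newByteArray#, newPinnedByteArray# )
import GHC.IO (IO (..))

-- Boxed wrapper so the unlifted array can be passed around in IO code.
data MBA = MBA (MutableByteArray# RealWorld)

-- Allocate n bytes without requesting pinning.
newUnpinned :: Int -> IO MBA
newUnpinned (I# n) = IO (\s0 -> case newByteArray# n s0 of
  (# s1, arr #) -> (# s1, MBA arr #))

-- Allocate n bytes with pinning explicitly requested.
newPinned :: Int -> IO MBA
newPinned (I# n) = IO (\s0 -> case newPinnedByteArray# n s0 of
  (# s1, arr #) -> (# s1, MBA arr #))

-- True when the runtime guarantees the array will not be moved by the GC.
isPinned :: MBA -> Bool
isPinned (MBA arr) = isTrue# (isMutableByteArrayPinned# arr)

-- Hypothetical check to call right before a safe FFI call.
-- With ASSERTIONS undefined (the assumed default) it compiles to a no-op.
assertPinned :: String -> MBA -> IO ()
#if defined(ASSERTIONS)
assertPinned site buf
  | isPinned buf = pure ()
  | otherwise = throwIO (ErrorCall (site ++ ": unpinned byte array passed to the safe FFI"))
#else
assertPinned _ _ = pure ()
#endif
```

Each safe FFI call site would then be preceded by something like assertPinned "MyModule.send" buf, which is exactly the kind of repetitive edit that is easy to forget.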
Adding these checks is a rather mechanical process, and they would benefit many other library authors. Most people aren't going to add an assertions flag to their libraries because that's extra work, and why bother if nothing appears to be broken? Making matters worse, passing unpinned memory into the safe FFI can cause problems that almost never manifest themselves. If the FFI call completes quickly and without blocking, it's unlikely that the GC will run and relocate something the FFI call is using. Or maybe there's a doubling buffer that grows over time, and by the time the issue is more likely to manifest itself, the buffer is large enough that GHC's runtime treats it as pinned even though pinning was never explicitly requested.
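The doubling-buffer scenario can be observed directly. The sketch below allocates arrays with newByteArray# (so pinning is never requested) and asks the runtime whether it considers them pinned. The helper names unpinnedReportsPinned and report are hypothetical, and the behaviour for large sizes is my understanding of the current RTS (large objects are not moved, and the primop reports them as pinned) rather than a documented guarantee; the exact size threshold is an RTS detail that is not assumed here.

```haskell
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}

import GHC.Exts (Int (I#), isMutableByteArrayPinned#, isTrue#, newByteArray#)
import GHC.IO (IO (..))

-- Allocate n bytes WITHOUT requesting pinning, then ask the runtime
-- whether it nevertheless treats the array as immovable.
unpinnedReportsPinned :: Int -> IO Bool
unpinnedReportsPinned (I# n) = IO (\s0 -> case newByteArray# n s0 of
  (# s1, arr #) -> (# s1, isTrue# (isMutableByteArrayPinned# arr) #))

-- Print the result for one buffer size.
report :: Int -> IO ()
report n = do
  pinned <- unpinnedReportsPinned n
  putStrLn (show n ++ " bytes, allocated unpinned, reported pinned: " ++ show pinned)
```

Running mapM_ report [16, 1024 * 1024] in GHCi shows whether the small and the large buffer are treated differently, which is why a bug of this kind can hide once a buffer has grown.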
Proposal
Introduce a new flag --runtime-assertions that turns on an early core-to-core pass that would perform the following transformation:
case {__pkg_ccall_GC example-lib-api-0.1.0.0
        Int# -> MutableByteArray# RealWorld -> Int# -> MutableByteArray# RealWorld
        -> State# RealWorld -> (# State# RealWorld, Int# #)
     }_aacw w1 b1 w2 b2 s1
of
  { (# s2, w3 #) -> ...
  }
Is turned into:
case isMutableByteArrayPinned# b1 of
  {
    __DEFAULT -> case raiseIO# SOME_ERROR_MESSAGE s1 of s2 {}
    1# ->
      case isMutableByteArrayPinned# b2 of
        {
          __DEFAULT -> case raiseIO# SOME_ERROR_MESSAGE s1 of s2 {}
          1# -> case {__pkg_ccall_GC example-lib-api-0.1.0.0
                        Int# -> MutableByteArray# RealWorld -> Int# -> MutableByteArray# RealWorld
                        -> State# RealWorld -> (# State# RealWorld, Int# #)
                     }_aacw w1 b1 w2 b2 s1
                of
                  { (# s2, w3 #) -> ...
                  }
        }
  }
To handle SOME_ERROR_MESSAGE properly, it would be necessary to introduce a string at the top level mentioning the current module name. It is possible to support checks for things other than safe FFI calls. For example, this same flag could eventually check that array indexing is in bounds. However, I've decided to focus on the use of the safe FFI in this feature request since out-of-bounds array indexing nearly always leads to a crash or segfault anyway.
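For readers who prefer source Haskell to Core, here is a rough hand-written equivalent of the transformed code for a safe call taking two byte arrays. It is only a sketch: memcpy stands in for the library call in the Core dump, and MBA, newPinned, newUnpinned, pinnednessError, bothPinned and checkedMemcpy are hypothetical names. The top-level string plays the role of SOME_ERROR_MESSAGE, and its wording is made up.

```haskell
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}
{-# LANGUAGE UnliftedFFITypes #-}

import Control.Exception (ErrorCall (..), throwIO)
import Foreign.C.Types (CSize (..))
import Foreign.Ptr (Ptr)
import GHC.Exts
  ( Int (I#), MutableByteArray#, RealWorld
  , isMutableByteArrayPinned#, isTrue#
  , newByteArray#, newPinnedByteArray# )
import GHC.IO (IO (..))

-- A safe call that receives two byte arrays (stand-in for the real library call).
foreign import ccall safe "string.h memcpy"
  c_memcpy :: MutableByteArray# RealWorld -> MutableByteArray# RealWorld -> CSize -> IO (Ptr ())

data MBA = MBA (MutableByteArray# RealWorld)

newPinned :: Int -> IO MBA
newPinned (I# n) = IO (\s0 -> case newPinnedByteArray# n s0 of
  (# s1, arr #) -> (# s1, MBA arr #))

newUnpinned :: Int -> IO MBA
newUnpinned (I# n) = IO (\s0 -> case newByteArray# n s0 of
  (# s1, arr #) -> (# s1, MBA arr #))

-- One top-level string per module, naming the module (hypothetical wording).
pinnednessError :: String
pinnednessError = "Main: unpinned byte array passed to a safe FFI call"

-- True only when both arrays are guaranteed not to move.
bothPinned :: MBA -> MBA -> Bool
bothPinned (MBA a) (MBA b) =
  isTrue# (isMutableByteArrayPinned# a) && isTrue# (isMutableByteArrayPinned# b)

-- Check every byte array argument first, and only then make the safe call.
checkedMemcpy :: MBA -> MBA -> Int -> IO ()
checkedMemcpy dst@(MBA d) src@(MBA s) n
  | bothPinned dst src = () <$ c_memcpy d s (fromIntegral n)
  | otherwise = throwIO (ErrorCall pinnednessError)
```

The proposed pass would generate the same shape of check for every safe foreign call automatically, so library authors would not have to write wrappers like checkedMemcpy by hand.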
The goal I have in mind is that, in complicated projects with 100+ transitive dependencies, I could tell cabal to just build all my dependencies with --runtime-assertions (kind of like a profiling build but for validation instead) and then run that version of the binary for a while to make sure that neither I nor any of the authors of my transitive dependencies made any safe FFI bytearray pinnedness mistakes.