The Hidden Dangers Of
Haskell is a very high-level language, and most of the time, programmers don't have to think about memory allocation concerns at all, and the compiler and runtime do a good job of managing these things correctly.
However, there are situations where we do need to meddle with low-level concerns of memory allocation and pointers after all, for example:
- ...when runtime performance is critical
- ...when writing FFI code wrapping C libraries not written with Haskell in mind
- ...when implementing base library functionality or foundational data structures
In such situations, the compiler's assumption of free reign over low-level memory management concerns, and our desire to manipulate pointers and memory allocation more directly, are at odds: for example, taking a pointer to a managed value may lead to that pointer escaping the managed value's lifespan. For example, consider this hypothetical example:
main = do p <- allocaBytes 4 poke p (12345 :: Word32) peek p >>= putStrLn
If we were to implement the above, we would have to decide how to manage the lifecycle of the allocated memory. But none of the options is great:
- We can allocate unmanaged memory through
malloc()and just use that; it will stay alive until the process exits. This is undesirable because we're leaking memory.
- We can allocate unmanaged memory through
malloc()and leave the responsibility for cleaning up to the caller. The existing
mallocBytesfunction does exactly that, but it is often undesirable because it is brittle; we want something that automatically cleans up after itself.
- We can allocate managed memory and thus leave cleanup to the GC. The problem with this is that we're not retaining any references that the GC can follow, so our allocated memory becomes eligible for GC right after we return from
A somewhat obvious solution is to change
allocaBytes to have a different type: instead of
allocaBytes :: Int -> IO (Ptr a), we make it
allocaBytes :: Int -> (Ptr a -> IO b) -> IO b. This is the tried-and-tested bracket pattern, allowing us to run deallocation code after the payload action has finished. A simple implementation might look like this:
allocaBytes n action = do p <- mallocBytes n retval <- action p free p return retval
And this would actually work. However,
malloc isn't very efficient compared to the Haskell heap, so ideally, we would want to have the cake and eat it by allocating from the Haskell heap, while still keeping the memory alive until the action has finished. This means that we need a magical token that tells the GC that the memory we allocated is still alive, and which we can place in a strategic position right after the action.
That magical construct is the
touch# primop, which essentially says "I will need this value to be alive until at least this point". So
allocaBytes can be implemented as the moral equivalent of:
allocaBytes n action = do a <- newByteArray n p <- addressOf a retval <- action p touch# a return retval
Great! However, as evidenced by #14346 (closed), there is a problem. The correctness of the above code hinges on the presence of
touch# (without it, the GC is free to pick up
action p finishes), but GHC's optimizer doesn't actually know this. Specifically, if
action can be proven to never return, then the dead code elimination optimization will happily delete the
return parts, and we're back at square 1.
The first workaround for this was to add a
NOINLINE pragma on
allocaBytes - without inlining
allocaBytes, the dead code elimination stops at the function boundary, and cannot proceed to removing
- From the call site, we can infer that
actiondoesn't terminate, but we cannot step into the implementation of
allocaBytesto continue our analysis.
- From inside
allocaBytes, we cannot tell whether
actionterminates, so we cannot eliminate the
This prevents the problem, but it's not a great solution. By preventing inlining, we may also miss out on other, valid, optimizations, so it's a sledgehammer solution; and it's also one that requires manual diligence - we have to get the inlining pragmas just right for every usage of
touch#. Clearly this isn't ideal.
So we take a different route. Instead of a primop that says "please consider this value alive at this point in time", we provide one that says "please consider this value alive as long as the primop itself is alive". That primop is
with#, and it is the moral equivalent of
with :: a -> IO b -> IO b, meaning: "keep the first argument alive for as long as the second argument runs". And we can rewrite
allocaBytes to the moral equivalent of:
allocaBytes n action = do p <- allocateMemory n with# p action
Now the inferred lifecycle of both
p accurately reflect our intentions. In a way, we have baked the bracket pattern into the
Note that the actual implementations of
touch#, look slightly different, due to the fact that
IO-like primops are implemented in terms of the "secret" underlying representation of
State# RealWorld -> (# State# RealWorld, a #) rather than