Return memory to allocator when shrinking byte arrays
Motivation
In Linux, message-oriented protocols (e.g. datagram sockets, POSIX messages queues) require the user to guess the amount of memory that an individual message will use before receiving it. If you guess too low, the message you receive is truncated, and operating system removes the message from the queue, so you have no opportunity to recover the message. To work around this, it is common to choose numbers much larger than what you expect to receive. On Internet-domain datagram sockets, it's not good practice to send anything larger than around 1500 bytes, but I've seen devices that occasionally send syslog datagrams at upwards of 3500 bytes (usually firewall traffic that includes a lengthy URL a user visited). I really don't want to discard messages I receive, so I have to allocate unnaturally large buffers (MutableByteArray# RealWorld
) for every datagram I receive. The median size of a datagram I receive is around 500 bytes, so I end up allocating at least seven times the memory I actually need. I'm about to start using POSIX message queues in an application, and there, the problem will be even worse, since in that setting I will be expecting an even greater range of message sizes.
Proposal
I propose changing shrinkMutableByteArray#
so that it returns memory to the allocator when possible. When, precisely, is this possible? It is possible when the MutableByteArray#
resides at the end of a block that is currently being allocated into. That is, the free
field of the bdescr
must be the address that immediately succeeds the block. Which block should we check? If the byte array is unpinned, we must check the nursery (at the capability's r.rNursery
). If it is pinned, we must check the capability's pinned_object_block
. In either case, if free
is the successor to the last object in the byte array, I believe that it is sound to move free
backwards and reclaim the space.
Applications
This behavior is useful for:
- Message-oriented protocols. You can allocate a 64KB buffer and get all the unused space back. Moreover, you don't even have to be careful about avoiding allocations after
newByteArray#
. Since a 64KB byte array is always pinned, it's only other allocations into the pinned object block that would prevent reclamation. Allocations into the nursery would not prevent reclamation. - Temporary working areas. If you needed a working area (a byte array that will be discarded, not frozen) for an operation, you could shrink the byte array down to zero bytes at the end, reclaiming all the space except for the two machine words needed for the
header
andbytes
fields ofStgArrBytes
. It would require some amount of effort to make an algorithm compatible with this optimization, since you have to make sure no other allocations into the same block list are performed after the working area, and it would have to be the right kind of algorithm (mergesort comes to mind as a candidate). I'm not sure if anyone would ever use the feature in this way, but the possibility is there.
Implementation
I'm happy to implement this. The implementation doesn't seem like it would be complicated. I would like some feedback on whether or not the "roll back the allocator" idea is actually sound.