Non-moving segment allocation strategy might be leading to fragmentation
Summary
I think the current allocation strategy for the non-moving heap might be to unnecessary fragmentation
Issue
The non-moving heap consists of segments that are aligned and 32K (that's 8 4K blocks).
In order to allocate aligned blocks we currently allocate (2*b - 1)
blocks and find the required aligned part and then immediately de-allocate the rest. See: https://gitlab.haskell.org/ghc/ghc/-/blob/ef3d20f83499cf129b1cacac07906b8d6188fc17/rts/sm/BlockAlloc.c#L570
My worry is that this strategy can lead to a situation where the heap ends up with lots of small free block groups (less than 15 blocks) that cannot be used by the allocator to service non-moving segment requests.
For instance, we could have a situation where every second 8 block group is free on a megablock, but we can't allocate into them because we need 15 to guarantee alignment.
In the past, I've found that adding an extra copying generation to a heap that uses the non-moving GC really reduces fragmentation. I think the reason is that it can use these blocks that are otherwise unusable.
I don't have concrete evidence for this yet. I'll try to gather some more data and report back. My plan is to add some instrumentation that records the sizes of free block groups throughout the lifetime of an application.
Proposed solution
I think the best thing to do here is to claim entire megablocks for the use of the non-moving heap. So it would have a list of megablocks with free space. If a megablock ends up completely free then we de-allocate it, but otherwise it would be held on a list by the non-moving gc.
It would probably be good to make this configurable since it might lead to some overhead, since now no other allocation could go into these blocks.
This also requires a bit more book keeping than the current approach.
The advantage is that we could basically eliminate this sort of fragmentation.
Implementation
If this sounds like a good idea, I'd be happy to implement.