Consider using multiple threads to scavenge *very* large mutable arrays.
Currently a single array is only ever scavenged by a single thread. This showed up in hashtable benchmarks and is discussed a bit here.
Quoting the most relevant part:
Since writes to the hashtable are spread across the array, updating more than ~1% of the table between GCs will mean nearly the entire array will be scanned at every stop-the-world GC from a single thread while the others stall. The problem gets worse as the array size grows.
This means having one huge array will perform a lot worse than several smaller arrays of the same total size. This is a shame. I'm not sure if this would be easy to change, but it might be worthwhile.
Currently this is implemented in scavenge_mut_arr_ptrs_marked. To solve this we would want to distribute scavenging of the array across threads for arrays past a certain size.
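
To make the idea concrete, here is a rough, standalone sketch of one way the work could be split. It is not RTS code: `scav_job`, `scav_worker`, `scavenge_array`, the threshold and the chunk size are all hypothetical names and values, plain pthreads stand in for GC threads, and "scavenging" an element is reduced to counting non-zero entries. Arrays below the threshold are scanned by the calling thread alone; larger ones are cut into fixed-size chunks that threads claim with an atomic counter.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical tuning knobs; real values would have to come from benchmarks. */
#define PAR_SCAV_THRESHOLD   ((size_t)1 << 20)  /* elements */
#define PAR_SCAV_CHUNK       ((size_t)1 << 14)  /* elements per claimed chunk */
#define PAR_SCAV_MAX_THREADS 16

typedef struct {
    const int    *elems;  /* stand-in for the array's pointer fields */
    size_t        n;      /* number of elements */
    atomic_size_t next;   /* start index of the next unclaimed chunk */
    atomic_long   live;   /* non-zero elements seen; only used for checking */
} scav_job;

/* Claim chunks until the array is exhausted, "scavenging" each element. */
static void *scav_worker(void *arg)
{
    scav_job *job = arg;
    for (;;) {
        size_t lo = atomic_fetch_add(&job->next, PAR_SCAV_CHUNK);
        if (lo >= job->n)
            break;
        size_t hi = (job->n - lo > PAR_SCAV_CHUNK) ? lo + PAR_SCAV_CHUNK
                                                   : job->n;
        long local = 0;
        for (size_t i = lo; i < hi; i++)
            if (job->elems[i] != 0)
                local++;
        atomic_fetch_add(&job->live, local);
    }
    return NULL;
}

/* Scan the whole array, using helper threads only past the threshold. */
static long scavenge_array(const int *elems, size_t n, int n_threads)
{
    assert(n_threads >= 1 && n_threads <= PAR_SCAV_MAX_THREADS);

    scav_job job;
    job.elems = elems;
    job.n = n;
    atomic_init(&job.next, 0);
    atomic_init(&job.live, 0);

    pthread_t helpers[PAR_SCAV_MAX_THREADS];
    int spawned = 0;
    if (n >= PAR_SCAV_THRESHOLD) {
        for (int t = 1; t < n_threads; t++)
            if (pthread_create(&helpers[spawned], NULL, scav_worker, &job) == 0)
                spawned++;
    }

    scav_worker(&job);  /* the calling thread always takes part */
    for (int t = 0; t < spawned; t++)
        pthread_join(helpers[t], NULL);
    return atomic_load(&job.live);
}
```

Claiming chunks dynamically rather than giving each thread one fixed slice keeps threads busy when some regions of the array are more expensive to scan than others. In the real collector the chunks would presumably be handed to the existing GC threads rather than to freshly created ones.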