Make the first GC generation in GHC as large as the largest cache by default
This should noticeably improve out-of-the-box performance.
Anyone who needs a different size can still set it with the "-A" RTS option. Machines differ widely, yet since 8.2 the default has been 1MB on every architecture and hardware configuration. In most cases machines with larger caches also have more RAM, and vice versa, so this change should benefit both small and large machines. In most cases it is most efficient to keep short-lived objects in cache. Most modern workstation and server machines also have an L3 cache, which is why I'm asking for the "largest cache size".
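To illustrate the proposed default, here is a rough sketch in Python of how the value could be derived by hand today. It assumes the cache sizes are given as strings such as "32K" or "8192K" (the format Linux exposes under /sys/devices/system/cpu/cpu0/cache/index*/size); the function names parse_cache_size and nursery_flag are hypothetical and not part of GHC.

```python
def parse_cache_size(text):
    """Parse a size string such as '32K' or '8M' into a number of bytes."""
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    text = text.strip().upper()
    if text and text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    # No unit suffix: assume the value is already in bytes.
    return int(text)


def nursery_flag(cache_sizes):
    """Build an '-A' RTS flag sized to the largest cache in the list.

    Assumption: every listed cache is visible to the core running the program.
    """
    largest = max(parse_cache_size(size) for size in cache_sizes)
    # Express the size in kilobytes using the RTS 'k' suffix.
    return "-A%dk" % (largest // 1024)
```

For a machine reporting "32K", "256K" and "8192K", this yields the flag -A8192k, which could be passed as "./prog +RTS -A8192k -RTS", assuming the program was linked with -rtsopts.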
A second idea would be to have two short-lived generations on machines with both second- and third-level caches, each sized to match one of those caches.
For NUMA machines with non-unified caches (such as some unusual and uncommon ARM designs), a common solution could be to set the first generation to the size of the largest cache of the smallest core. This would not be optimal, but it should be close.
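The rule for non-unified caches can be sketched as follows, again in Python. This is only an illustration under assumptions: the mapping from core names to cache sizes in bytes is made-up input, and the function name nursery_bytes_for_mixed_cores is hypothetical.

```python
def nursery_bytes_for_mixed_cores(caches_by_core):
    """Pick the first-generation size for cores with differing caches.

    caches_by_core maps a core name to the sizes (in bytes) of the caches
    that core can use. For each core take its largest cache, then take the
    smallest of those values, so the area fits in cache on every core.
    """
    return min(max(sizes) for sizes in caches_by_core.values())


# Hypothetical example: a "big" core with a 2MB cache, a "little" core with 512KB.
example = {
    "big": [64 * 1024, 2 * 1024 * 1024],
    "little": [32 * 1024, 512 * 1024],
}
chosen = nursery_bytes_for_mixed_cores(example)
```

With the example input, the size chosen is 512KB, the largest cache of the smaller core.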