GHC parallel GC does not scale well on modern many-core machines
I'm testing a small ray tracer on several many-core machines, such as an 88-core x64 machine and a 96-core AArch64 machine (on https://packet.net). The parallel GC seems to have throughput problems when running on more than 24-32 cores.
See this Reddit thread for details: https://www.reddit.com/r/haskell/comments/85vwlq/our_lovely_ghc_parallel_gc_on_96_core_arm/
There you can find an .eventlog file and a PNG screenshot.
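For anyone who wants to check the effect on their own hardware without the ray tracer, here is a minimal sketch of how the GC share of elapsed time could be measured. The workload (`churn`) and the helper `gcShare` are hypothetical stand-ins, not the code from this ticket; only `GHC.Stats` and the RTS flags named in the comments are real. It uses nothing outside `base`.

```haskell
{-# LANGUAGE BangPatterns #-}
-- Hypothetical stand-in workload, NOT the ray tracer from this ticket.
-- Build: ghc -O2 -threaded -rtsopts GcShare.hs
-- Run:   ./GcShare +RTS -N32 -T -RTS   (-T is needed so the stats can be read)
module Main (main) where

import Control.Concurrent (forkIO, getNumCapabilities)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)
import Data.List (foldl')
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- Fraction of elapsed time spent in GC, given GC and mutator nanoseconds.
gcShare :: Integral a => a -> a -> Double
gcShare gcNs mutNs
  | total == 0 = 0
  | otherwise  = fromIntegral gcNs / total
  where
    total = fromIntegral gcNs + fromIntegral mutNs :: Double

-- Allocation-heavy work: the list is used twice, so it is materialised
-- and stays live while it is being summed.
churn :: Int -> Int
churn seed =
  let xs = map (* seed) [1 .. 2000000]
  in foldl' (+) 0 xs + length xs

main :: IO ()
main = do
  caps <- getNumCapabilities
  -- One worker thread per capability, each signalling completion via an MVar.
  dones <- forM [1 .. caps] $ \i -> do
    done <- newEmptyMVar
    _ <- forkIO $ do
      let !r = churn i
      putMVar done r
    pure done
  results <- mapM takeMVar dones
  putStrLn ("checksum: " ++ show (sum results))
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "RTS stats are disabled; rerun with +RTS -T"
    else do
      s <- getRTSStats
      putStrLn ("capabilities: " ++ show caps)
      putStrLn ("collections:  " ++ show (gcs s) ++ " (major: " ++ show (major_gcs s) ++ ")")
      putStrLn ("GC share of elapsed time: "
                ++ show (gcShare (gc_elapsed_ns s) (mutator_elapsed_ns s)))
```

Running this with increasing `-N` values and comparing the reported GC share should show whether the same trend appears. It may also be worth comparing against `+RTS -qg` (parallel GC off) or `+RTS -qn<x>` (fewer GC threads than capabilities), assuming the GHC version in use supports those flags.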
Maybe it's time to resurrect the Concurrent GC project?