gen_workspace alignment may be insufficient on Apple Silicon
Currently in the RTS, gen_workspace is aligned to 64 bytes to prevent false sharing. However, cache line sizes differ across platforms: according to the Apple Silicon CPU Optimization Guide, the cache line size on Apple Silicon is 128 bytes. With only 64-byte alignment, a workspace may share a 128-byte line with neighbouring data, so parallel GC on Apple Silicon may still suffer from unnoticed false sharing. Maybe we should add autoconf logic that writes the target's cache line size to a header, and use that value instead of the hard-coded 64? cc @bgamari
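
To make the proposal concrete, here is a minimal sketch of what the header-driven alignment could look like. Everything in it is an assumption for illustration. `CACHELINE_SIZE` is a hypothetical macro that configure would write into a generated header, the `#ifndef` fallback only guesses 128 for AArch64 Apple targets, and `workspace_sketch` is a stand-in, not the real `gen_workspace` definition.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdalign.h> /* alignas, alignof (C11) */
#include <stddef.h>   /* size_t */

/* Hypothetical macro: assumed to come from a configure-generated header.
 * The fallback below is only a guess used when configure provides nothing. */
#ifndef CACHELINE_SIZE
#  if defined(__APPLE__) && defined(__aarch64__)
#    define CACHELINE_SIZE 128
#  else
#    define CACHELINE_SIZE 64
#  endif
#endif

/* Stand-in for gen_workspace; the fields are illustrative only. */
typedef struct {
    /* Over-aligning the first member raises the alignment of the whole
     * struct, and sizeof is then padded to a multiple of that alignment. */
    alignas(CACHELINE_SIZE) unsigned long scanned;
    unsigned long copied;
} workspace_sketch;

/* Fail the build if the layout ever stops matching the cache line size. */
static_assert(alignof(workspace_sketch) == CACHELINE_SIZE,
              "workspace must be cache-line aligned");
static_assert(sizeof(workspace_sketch) % CACHELINE_SIZE == 0,
              "workspace must occupy whole cache lines");

/* Returns 1 if adjacent array elements can never share a cache line,
 * i.e. their distance is a non-zero multiple of CACHELINE_SIZE. */
static int workspaces_share_no_line(void)
{
    workspace_sketch ws[2];
    size_t gap = (size_t)((char *)&ws[1] - (char *)&ws[0]);
    return gap >= CACHELINE_SIZE && gap % CACHELINE_SIZE == 0;
}
```

The static assertions are the main point of the sketch: if the alignment comes from a configured constant, a compile-time check keeps the struct layout and the constant from drifting apart.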