Relax load_load_barrier for aarch64

Takenobu Tani requested to merge takenobu-hs/ghc:wip/load_barrier_aarch64 into master

This patch relaxes the instruction used by load_load_barrier() on aarch64. The current load_load_barrier() is implemented as a full barrier with dmb sy, which is stronger than needed to order loads against later loads. We can relax it by using dmb ld.
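As a rough illustration of the change described above, the following sketch shows a load-load barrier built on dmb ld and a reader that depends on it. It is not the GHC RTS code: the names load_load_barrier_sketch and read_payload_if_ready are hypothetical, the inline-assembly syntax and the __atomic_thread_fence fallback assume GCC or Clang, and the non-aarch64 branch exists only so the sketch compiles on other targets.

```c
/* Hypothetical sketch; these are not the definitions used in the GHC RTS. */

static inline void load_load_barrier_sketch(void)
{
#if defined(__aarch64__)
    /* dmb ld: loads before the barrier are ordered before loads and
     * stores after it. dmb sy would additionally order earlier stores,
     * which a load-load barrier does not need. */
    __asm__ __volatile__ ("dmb ld" : : : "memory");
#else
    /* Assumption: on other targets, fall back to an acquire fence
     * (GCC/Clang builtin) so the sketch still compiles. */
    __atomic_thread_fence(__ATOMIC_ACQUIRE);
#endif
}

/* Reader side of a flag/payload handoff: the flag must be read before
 * the payload, which is exactly load-load ordering. Returns 1 and fills
 * *out if the flag was set, 0 otherwise. */
static inline int read_payload_if_ready(const volatile int *flag,
                                        const volatile int *payload,
                                        int *out)
{
    if (*flag == 0) {
        return 0;
    }
    load_load_barrier_sketch();
    *out = *payload;
    return 1;
}
```

The writer side would need its own store-store ordering between writing the payload and setting the flag; that is outside the scope of this patch.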

If the current load_load_barrier() is relied on anywhere as a full barrier (load/store to load/store ordering), this patch is not suitable.

See also the Linux kernel's smp_rmb() implementation:

It would probably be better for performance to use dmb ishld rather than dmb ld. However, I can't measure the effect on a real many-core Arm machine.
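To make the dmb ld versus dmb ishld choice concrete, here is a small sketch that selects between the two at compile time. The macro SKETCH_USE_INNER_SHAREABLE and the function names are hypothetical and not part of this patch; the inline assembly and the fallback fence assume GCC or Clang. dmb ishld restricts the barrier to the inner shareable domain, while dmb ld applies to the full system. Whether the narrower domain is actually faster is unmeasured here.

```c
/* Hypothetical sketch; not part of the patch. */

#ifndef SKETCH_USE_INNER_SHAREABLE
#define SKETCH_USE_INNER_SHAREABLE 1  /* assumed default for this sketch */
#endif

static inline void load_barrier_variant(void)
{
#if defined(__aarch64__) && SKETCH_USE_INNER_SHAREABLE
    /* Load barrier limited to the inner shareable domain. */
    __asm__ __volatile__ ("dmb ishld" : : : "memory");
#elif defined(__aarch64__)
    /* Load barrier over the full system, as proposed in this patch. */
    __asm__ __volatile__ ("dmb ld" : : : "memory");
#else
    /* Assumption: acquire fence on other targets so the sketch compiles. */
    __atomic_thread_fence(__ATOMIC_ACQUIRE);
#endif
}

/* Reads *a, then *b, with the barrier keeping the two loads in order. */
static inline int sum_after_barrier(const volatile int *a,
                                    const volatile int *b)
{
    int first = *a;
    load_barrier_variant();
    return first + *b;
}
```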

I've only checked this patch statically; I haven't validated it on a many-core Arm machine.
