SMP primitives broken on power(pc)
I originally noticed this when working on the AIX port (32-bit powerpc), and recently saw it on Linux/powerpc64 as well, which led to talking to Peter Trommler, who already had a suspicion:
Here's, for example, the CAS definition (in <stg/SMP.h>):
StgWord
cas(StgVolatilePtr p, StgWord o, StgWord n)
{
    StgWord result;
    __asm__ __volatile__ (
        "1: ldarx %0, 0, %3\n"
        " cmpd %0, %1\n"
        " bne 2f\n"
        " stdcx. %2, 0, %3\n"
        " bne- 1b\n"
        "2:"
        :"=&r" (result)
        :"r" (o), "r" (n), "r" (p)
        :"cc", "memory"
    );
    return result;
}
The important detail is the lack of any barrier instructions, such as isync at the end. This results in infrequent heap corruption, which in turn causes all sorts of infrequent and hard-to-track-down runtime crashes (including in ghc --make -j), such as, for instance:
internal error: END_TSO_QUEUE object entered!
(GHC version 8.0.0.20160421 for powerpc64_unknown_linux)
Peter already has a patch in the works which simply replaces the atomic powerpc primitives with __sync_* intrinsics, which turn out to be more portable than inline asm. This would result in e.g.
StgWord
cas(StgVolatilePtr p, StgWord o, StgWord n)
{
    return __sync_val_compare_and_swap (p, o, n);
}
which then gets compiled as
000000000000004c <.cas>:
4c: 7c 00 04 ac sync
50: 7d 20 18 a8 ldarx r9,0,r3
54: 7c 29 20 00 cmpd r9,r4
58: 40 c2 00 0c bne- 64 <.cas+0x18>
5c: 7c a0 19 ad stdcx. r5,0,r3
60: 40 c2 ff f0 bne- 50 <.cas+0x4>
64: 4c 00 01 2c isync
68: 7d 23 4b 78 mr r3,r9
6c: 4e 80 00 20 blr
I've already been testing the patch, and it seems to have made all the issues I experienced so far disappear, as well as fixing the concprog01 test, which was also failing infrequently.
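To make the role of the barriers concrete, here is a minimal sketch of a lock word built on the __sync-based cas. The typedefs are stand-ins for the RTS types, and try_lock/unlock are hypothetical helpers invented for illustration, not functions from the RTS. The point is that a successful cas used to take a lock has to act as an acquire barrier (the trailing isync in the disassembly above); otherwise loads inside the critical section may be performed before the lock is actually held.

```c
#include <stdint.h>

/* Stand-ins for the RTS types (assumption: the real ones come from the GHC headers). */
typedef uintptr_t StgWord;
typedef volatile StgWord *StgVolatilePtr;

/* CAS via the GCC builtin: returns the value found at *p and acts as a full barrier. */
static inline StgWord cas(StgVolatilePtr p, StgWord o, StgWord n)
{
    return __sync_val_compare_and_swap(p, o, n);
}

/* Hypothetical helper: lock word is 0 when free, 1 when held.
 * Returns 1 if the lock was taken, 0 if it was already held. */
static inline int try_lock(StgVolatilePtr lock)
{
    return cas(lock, 0, 1) == 0;
}

/* Hypothetical helper: store 0 to the lock word with release semantics. */
static inline void unlock(StgVolatilePtr lock)
{
    __sync_lock_release(lock);
}
```

With the original inline-asm cas, the same try_lock would compile and usually work, but nothing would stop accesses in the critical section from being reordered around the ldarx/stdcx. sequence, which would be consistent with the infrequent corruption described above.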