LLVM 3.2 crash (AVX messes up GHC calling convention)
I stumbled across the problem of LLVM 3.2 builds crashing seemingly at random on my Mac. After quite a bit of investigation I found that the source of the problem was somewhere in the base library (span, to be precise), where it encountered Cmm like the following:
    cp7n:
        if ((Sp + -56) < SpLim) goto cp8c; else goto cp8d;
    ...
    cp8d:
        I64[Sp - 40] = block_cp7g_info;
        _smKL::P64 = P64[R1 + 7];
        _smKO::P64 = P64[R1 + 15];
        _smKP::P64 = P64[R1 + 23];
        _smL0::P64 = P64[R1 + 31];
        R1 = R2;
        P64[Sp - 32] = _smKL::P64;
        P64[Sp - 24] = _smKO::P64;
        P64[Sp - 16] = _smKP::P64;
        P64[Sp - 8] = _smL0::P64;
        Sp = Sp - 40;
        if (R1 & 7 != 0) goto up8R; else goto cp7h;
    up8R:
        call block_cp7g_info(R1) args: 0, res: 0, upd: 0;
    ...
which leads LLVM 3.2 to produce the following assembly:
    _smL1_info:                             ## @smL1_info
    ## BB#0:                                ## %cp7n
            pushq   %rbp
            movq    %rsp, %rbp
            movq    %r14, %rax
            movq    %rbp, %rcx
            leaq    -56(%rcx), %rdx
            cmpq    %r15, %rdx
            jae     LBB160_1
            ...
    LBB160_1:                               ## %cp8d
            leaq    _cp7g_info(%rip), %rdx
            movq    %rdx, -40(%rcx)
            vmovups 7(%rbx), %ymm0
            vmovups %ymm0, -32(%rcx)
            addq    $-40, %rcx
            testb   $7, %al
            je      LBB160_4
    ## BB#2:                                ## %up8R
            movq    %rcx, %rbp
            movq    %rax, %rbx
            popq    %rbp
            vzeroupper
            jmp     _cp7g_info              ## TAILCALL
So here LLVM has figured out that it can use vmovups to move 4 words at the same time. However, there is a puzzling side effect: all of a sudden we have a pushq %rbp at the start of the function with a matching popq %rbp at the very end. Since GHC's calling convention keeps Sp in %rbp, the pop overwrites the stack pointer update (movq %rcx, %rbp) and - unsurprisingly - causes the program to crash rather quickly.
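To make the clobbering concrete, here is a toy Python model of the relevant instructions. It is only a sketch: the addresses are made up, only %rbp, %rsp and %rcx are tracked, the new Sp is modelled directly as the entry Sp minus 40, and the variant without the push/pop pair is an assumption about what correct code would look like.

```python
# Toy model of the prologue/epilogue discussed above.
# All addresses are made-up illustrative values.

ENTRY_SP = 0x1000   # hypothetical Haskell stack pointer (GHC keeps Sp in %rbp)
ENTRY_RSP = 0x8000  # hypothetical C stack pointer


def sp_seen_by_callee(frame_pointer_saved: bool) -> int:
    """Return the value of %rbp at the tail call to _cp7g_info."""
    regs = {"rbp": ENTRY_SP, "rsp": ENTRY_RSP, "rcx": 0}
    memory = {}

    if frame_pointer_saved:
        # pushq %rbp: save the incoming %rbp (the Haskell Sp) on the C stack
        regs["rsp"] -= 8
        memory[regs["rsp"]] = regs["rbp"]

    # Cmm "Sp = Sp - 40": the new Sp is computed in %rcx ...
    regs["rcx"] = ENTRY_SP - 40
    # ... and copied back with "movq %rcx, %rbp"
    regs["rbp"] = regs["rcx"]

    if frame_pointer_saved:
        # popq %rbp: restores the saved value and discards the update
        regs["rbp"] = memory[regs["rsp"]]
        regs["rsp"] += 8

    return regs["rbp"]
```

In this model the callee sees the updated Sp only when the push/pop pair is absent; with the pair present it sees the stale entry value, which matches the crash described above.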
My interpretation is that LLVM 3.2 erroneously thinks that AVX instructions are incompatible with frame pointer elimination. The reasoning is that this is exactly the kind of code LLVM generates if we disable this "optimisation" (--disable-fp-elim). Furthermore, disabling AVX instructions (-mattr=-avx) fixes the problem - LLVM falls back to less efficient non-AVX moves, with the pushq %rbp vanishing as well. Finally, this bug seems to happen exactly with LLVM 3.2, with 3.3 upwards generating correct code.
My proposed fix would be to add -mattr=-avx to the llc command line by default for LLVM 3.2. This issue might be related to #7694 (closed).
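As a rough illustration of the proposed rule (not GHC's actual driver code), the version check could look like the Python sketch below. The function name extra_llc_flags and the (major, minor) tuple representation are hypothetical; only the flag itself and the "exactly 3.2" condition come from the report.

```python
from typing import List, Tuple


def extra_llc_flags(llvm_version: Tuple[int, int]) -> List[str]:
    """Hypothetical helper: extra llc flags for a (major, minor) LLVM version."""
    # The bug was observed with exactly LLVM 3.2; 3.3 upwards generates
    # correct code, so no other version gets the workaround.
    if llvm_version == (3, 2):
        return ["-mattr=-avx"]
    return []
```

Until something like this is the default, the same flag can presumably be passed by hand through GHC's -optlc option, for example -optlc-mattr=-avx.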