GHC and LLVM don't agree on what to do with byteSwap16#
Consider this test case (taken from here and lightly modified to work on big/little endian machines):
{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE CPP #-}
module Main
( main -- :: IO ()
) where
#include "ghcconfig.h"
import GHC.Prim
import GHC.Word
data T = T !Addr#
t :: T
#ifndef WORDS_BIGENDIAN
t = T "\xcf\xb1"#
#else
t = T "\xb1\xcf"#
#endif
grabWord16 :: T -> Word64
grabWord16 (T addr#) = W64# (byteSwap16# (indexWord16OffAddr# addr# 0#))
trip :: Int
trip = fromIntegral (grabWord16 t)
main :: IO ()
main = print trip
With GHC 7.10.3 using the NCG, the results given are correct:
$ ghc --version
The Glorious Glasgow Haskell Compilation System, version 7.10.3
$ ghc -Wall -fforce-recomp -O2 Issue67.hs && ./Issue67
[1 of 1] Compiling Main ( Issue67.hs, Issue67.o )
Linking Issue67 ...
53169
This also is the same on GHC 8.0.1 using the NCG, on both PowerPC and AMD64 as well. This answer is correct: 53169
is 0xCFB1
in hex, so the byteSwap16#
primitive correctly works to decode the swapped-endian number.
However, the story is not the same with GHC 7.10.3+LLVM 3.5, or GHC 8.0.1+LLVM 3.7:
$ ghc --version
The Glorious Glasgow Haskell Compilation System, version 7.10.3
$ llc --version | head -2
LLVM (http://llvm.org/):
LLVM version 3.5.2
$ ghc -Wall -fforce-recomp -O2 Issue67.hs -fllvm && ./Issue67
[1 of 1] Compiling Main ( Issue67.hs, Issue67.o )
Linking Issue67 ...
-12367
Note:
-12367 == (fromIntegral (53169 :: Word16) :: Int16)
The relevant snippet looks like this at the CMM level (GHC 7.10.3):
==================== Output Cmm ====================
[section "data" {
Main.main2_closure:
const Main.main2_info;
const 0;
const 0;
const 0;
},
section "readonly" {
c3rq_str:
I8[] [207,177]
},
section "readonly" {
c3rr_str:
I8[] [207,177]
},
Main.main2_entry() // [R1]
{ info_tbl: [(c3ru,
label: Main.main2_info
rep:HeapRep static { Thunk }),
(c3rD,
label: block_c3rD_info
rep:StackRep [])]
stack_info: arg_space: 8 updfr_space: Just 8
}
{offset
c3ru:
...
c3ro:
I64[Sp - 16] = stg_bh_upd_frame_info;
I64[Sp - 8] = _c3rn::I64;
(_c3rw::I64) = call MO_BSwap W16(%MO_UU_Conv_W16_W64(I16[c3rr_str]));
I64[Sp - 24] = c3rD;
R4 = GHC.Types.[]_closure+1;
R3 = _c3rw::I64;
R2 = 0;
Sp = Sp - 24;
call GHC.Show.$wshowSignedInt_info(R4,
R3,
R2) returns to c3rD, args: 8, res: 8, upd: 24;
...
Pre-optimized LLVM basic block:
c3rB:
%ln3sc = ptrtoint i8* @stg_bh_upd_frame_info to i64
%ln3sb = load i64** %Sp_Var
%ln3sd = getelementptr inbounds i64* %ln3sb, i32 -2
store i64 %ln3sc, i64* %ln3sd, !tbaa !1
%ln3sf = load i64* %lc3rA
%ln3se = load i64** %Sp_Var
%ln3sg = getelementptr inbounds i64* %ln3se, i32 -1
store i64 %ln3sf, i64* %ln3sg, !tbaa !1
%ln3sh = ptrtoint %c3rE_str_struct* @c3rE_str$def to i64
%ln3si = inttoptr i64 %ln3sh to i16*
%ln3sj = load i16* %ln3si, !tbaa !5
%ln3sk = zext i16 %ln3sj to i64
%ln3sl = trunc i64 %ln3sk to i16
%ln3sm = call ccc i16 (i16)* @llvm.bswap.i16( i16 %ln3sl )
%ln3sn = sext i16 %ln3sm to i64
store i64 %ln3sn, i64* %lc3rJ
%ln3sp = ptrtoint void (i64*, i64*, i64*, i64, i64, i64, i64, i64, i64, i64)* @c3rQ_info$def to i64
%ln3so = load i64** %Sp_Var
%ln3sq = getelementptr inbounds i64* %ln3so, i32 -3
store i64 %ln3sp, i64* %ln3sq, !tbaa !1
%ln3sr = ptrtoint i8* @ghczmprim_GHCziTypes_ZMZN_closure to i64
%ln3ss = add i64 %ln3sr, 1
store i64 %ln3ss, i64* %R4_Var
%ln3st = load i64* %lc3rJ
store i64 %ln3st, i64* %R3_Var
store i64 0, i64* %R2_Var
%ln3su = load i64** %Sp_Var
%ln3sv = getelementptr inbounds i64* %ln3su, i32 -3
%ln3sw = ptrtoint i64* %ln3sv to i64
%ln3sx = inttoptr i64 %ln3sw to i64*
store i64* %ln3sx, i64** %Sp_Var
%ln3sy = bitcast i8* @base_GHCziShow_zdwshowSignedInt_info to void (i64*, i64*, i64*, i64, i64, i64, i64, i64, i64, i64)*
Post-optimized block (opt --enable-tbaa=true -O2 out-llvm-orig.ll -o out-llvm.bc
):
c3rB: ; preds = %c3rU
%ln3s8 = ptrtoint i8* %ln3s7 to i64
%ln3sd = getelementptr inbounds i64* %Sp_Arg, i64 -2
store i64 ptrtoint (i8* @stg_bh_upd_frame_info to i64), i64* %ln3sd, align 8, !tbaa !5
%ln3sg = getelementptr inbounds i64* %Sp_Arg, i64 -1
store i64 %ln3s8, i64* %ln3sg, align 8, !tbaa !5
store i64 ptrtoint (void (i64*, i64*, i64*, i64, i64, i64, i64, i64, i64, i64)* @"c3rQ_info$def" to i64), i64* %ln3rZ, align 8, !tbaa !5
tail call cc10 void bitcast (i8* @base_GHCziShow_zdwshowSignedInt_info to void (i64*, i64*, i64*, i64, i64, i64, i64, i64, i64, i64)*)(i64* %Base_Arg, i64* %ln3rZ, i64* %Hp_Arg, i64 %R1_Arg, i64 0, i64 -12367, i64 add (i64 ptrtoint (i8* @ghczmprim_GHCziTypes_ZMZN_closure to i64), i64 1), i64 undef, i64 undef, i64 %SpLim_Arg) #0
ret void
Folds it right into a constant!
I haven't spent time diagnosing this much further, yet.
Trac metadata
Trac field | Value |
---|---|
Version | 8.0.1 |
Type | Bug |
TypeOfFailure | OtherFailure |
Priority | high |
Resolution | Unresolved |
Component | Compiler (LLVM) |
Test case | |
Differential revisions | |
BlockedBy | |
Related | |
Blocking | |
CC | |
Operating system | |
Architecture |