GHC and LLVM don't agree on what to do with byteSwap16#

Consider this test case (taken from here and lightly modified to work on both big- and little-endian machines):

{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE MagicHash    #-}
{-# LANGUAGE CPP          #-}
module Main
  ( main -- :: IO ()
  ) where

#include "ghcconfig.h"

import           GHC.Prim
import           GHC.Word

data T = T !Addr#

t :: T
#ifndef WORDS_BIGENDIAN
t = T "\xcf\xb1"#
#else
t = T "\xb1\xcf"#
#endif

grabWord16 :: T -> Word64
grabWord16 (T addr#) = W64# (byteSwap16# (indexWord16OffAddr# addr# 0#))

trip :: Int
trip = fromIntegral (grabWord16 t)

main :: IO ()
main = print trip

With GHC 7.10.3 using the native code generator (NCG), the result is correct:

$ ghc --version
The Glorious Glasgow Haskell Compilation System, version 7.10.3
$ ghc -Wall -fforce-recomp -O2 Issue67.hs && ./Issue67
[1 of 1] Compiling Main             ( Issue67.hs, Issue67.o )
Linking Issue67 ...
53169

GHC 8.0.1 with the NCG gives the same result, on both PowerPC and AMD64. This answer is correct: 53169 is 0xCFB1 in hex, so the byteSwap16# primitive correctly decodes the swapped-endian number.
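
As a quick sanity check of that arithmetic, here is a minimal sketch using the boxed `byteSwap16` from `Data.Word` rather than the `byteSwap16#` primop. The names `loadedLE` and `expected` are made up for illustration, and the sketch assumes the little-endian case, where a 16-bit load of the bytes `0xCF, 0xB1` yields `0xB1CF`:

```haskell
import Data.Word (Word16, byteSwap16)
import Numeric (showHex)

-- Hypothetical name: the value a little-endian 16-bit load would
-- produce for the in-memory bytes [0xCF, 0xB1].
loadedLE :: Word16
loadedLE = 0xB1CF

-- Swapping the two bytes should give 0xCFB1.
expected :: Word16
expected = byteSwap16 loadedLE

main :: IO ()
main = do
  print expected                  -- prints 53169
  putStrLn (showHex expected "")  -- prints cfb1
```

This only checks the expected value; it does not exercise the primop or either code generator.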

However, the story is not the same with GHC 7.10.3+LLVM 3.5, or GHC 8.0.1+LLVM 3.7:

$ ghc --version
The Glorious Glasgow Haskell Compilation System, version 7.10.3
$ llc --version | head -2
LLVM (http://llvm.org/):
  LLVM version 3.5.2
$ ghc -Wall -fforce-recomp -O2 Issue67.hs -fllvm && ./Issue67
[1 of 1] Compiling Main             ( Issue67.hs, Issue67.o )
Linking Issue67 ...
-12367

Note that the wrong result is the correct value reinterpreted as a signed 16-bit integer:

-12367 == (fromIntegral (53169 :: Word16) :: Int16)

The relevant snippet looks like this at the Cmm level (GHC 7.10.3):

==================== Output Cmm ====================
[section "data" {
     Main.main2_closure:
         const Main.main2_info;
         const 0;
         const 0;
         const 0;
 },
 section "readonly" {
     c3rq_str:
         I8[] [207,177]
 },
 section "readonly" {
     c3rr_str:
         I8[] [207,177]
 },
 Main.main2_entry() //  [R1]
         { info_tbl: [(c3ru,
                       label: Main.main2_info
                       rep:HeapRep static { Thunk }),
                      (c3rD,
                       label: block_c3rD_info
                       rep:StackRep [])]
           stack_info: arg_space: 8 updfr_space: Just 8
         }
     {offset
       c3ru:
           ...
       c3ro:
           I64[Sp - 16] = stg_bh_upd_frame_info;
           I64[Sp - 8] = _c3rn::I64;
           (_c3rw::I64) = call MO_BSwap W16(%MO_UU_Conv_W16_W64(I16[c3rr_str]));
           I64[Sp - 24] = c3rD;
           R4 = GHC.Types.[]_closure+1;
           R3 = _c3rw::I64;
           R2 = 0;
           Sp = Sp - 24;
           call GHC.Show.$wshowSignedInt_info(R4,
                                              R3,
                                              R2) returns to c3rD, args: 8, res: 8, upd: 24;
...

Pre-optimized LLVM basic block:

c3rB:
  %ln3sc = ptrtoint i8* @stg_bh_upd_frame_info to i64
  %ln3sb = load i64** %Sp_Var
  %ln3sd = getelementptr inbounds i64* %ln3sb, i32 -2
  store i64 %ln3sc, i64* %ln3sd, !tbaa !1
  %ln3sf = load i64* %lc3rA
  %ln3se = load i64** %Sp_Var
  %ln3sg = getelementptr inbounds i64* %ln3se, i32 -1
  store i64 %ln3sf, i64* %ln3sg, !tbaa !1
  %ln3sh = ptrtoint %c3rE_str_struct* @c3rE_str$def to i64
  %ln3si = inttoptr i64 %ln3sh to i16*
  %ln3sj = load i16* %ln3si, !tbaa !5
  %ln3sk = zext i16 %ln3sj to i64
  %ln3sl = trunc i64 %ln3sk to i16
  %ln3sm = call ccc i16 (i16)* @llvm.bswap.i16( i16 %ln3sl )
  %ln3sn = sext i16 %ln3sm to i64
  store i64 %ln3sn, i64* %lc3rJ
  %ln3sp = ptrtoint void (i64*, i64*, i64*, i64, i64, i64, i64, i64, i64, i64)* @c3rQ_info$def to i64
  %ln3so = load i64** %Sp_Var
  %ln3sq = getelementptr inbounds i64* %ln3so, i32 -3
  store i64 %ln3sp, i64* %ln3sq, !tbaa !1
  %ln3sr = ptrtoint i8* @ghczmprim_GHCziTypes_ZMZN_closure to i64
  %ln3ss = add i64 %ln3sr, 1
  store i64 %ln3ss, i64* %R4_Var
  %ln3st = load i64* %lc3rJ
  store i64 %ln3st, i64* %R3_Var
  store i64 0, i64* %R2_Var
  %ln3su = load i64** %Sp_Var
  %ln3sv = getelementptr inbounds i64* %ln3su, i32 -3
  %ln3sw = ptrtoint i64* %ln3sv to i64
  %ln3sx = inttoptr i64 %ln3sw to i64*
  store i64* %ln3sx, i64** %Sp_Var
  %ln3sy = bitcast i8* @base_GHCziShow_zdwshowSignedInt_info to void (i64*, i64*, i64*, i64, i64, i64, i64, i64, i64, i64)*
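
One way to read the block above: the argument is truncated to `i16`, passed to `llvm.bswap.i16`, and the result is widened back to `i64` with `sext`. The following sketch models that sequence in plain Haskell and compares it with a zero-extending variant. The names `viaSext` and `viaZext` are hypothetical, and this is only a model of the arithmetic, not a confirmed diagnosis of the backend:

```haskell
import Data.Int (Int16, Int64)
import Data.Word (Word16, Word64, byteSwap16)

-- Models the sequence in the block above:
-- trunc i64 -> i16, llvm.bswap.i16, then sext i16 -> i64.
viaSext :: Word64 -> Int64
viaSext w =
  let narrowed = fromIntegral w :: Word16          -- trunc
      swapped  = byteSwap16 narrowed               -- bswap
  in fromIntegral (fromIntegral swapped :: Int16)  -- sext

-- Hypothetical alternative for comparison: same steps,
-- but widening with zero extension instead.
viaZext :: Word64 -> Int64
viaZext w =
  let narrowed = fromIntegral w :: Word16          -- trunc
      swapped  = byteSwap16 narrowed               -- bswap
  in fromIntegral swapped                          -- zext

main :: IO ()
main = do
  print (viaSext 0xB1CF)  -- prints -12367
  print (viaZext 0xB1CF)  -- prints 53169
```

Under this model, sign extension reproduces the -12367 seen with -fllvm, while zero extension reproduces the 53169 that the NCG prints.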

Post-optimized block (opt --enable-tbaa=true -O2 out-llvm-orig.ll -o out-llvm.bc):

c3rB:                                             ; preds = %c3rU
  %ln3s8 = ptrtoint i8* %ln3s7 to i64
  %ln3sd = getelementptr inbounds i64* %Sp_Arg, i64 -2
  store i64 ptrtoint (i8* @stg_bh_upd_frame_info to i64), i64* %ln3sd, align 8, !tbaa !5
  %ln3sg = getelementptr inbounds i64* %Sp_Arg, i64 -1
  store i64 %ln3s8, i64* %ln3sg, align 8, !tbaa !5
  store i64 ptrtoint (void (i64*, i64*, i64*, i64, i64, i64, i64, i64, i64, i64)* @"c3rQ_info$def" to i64), i64* %ln3rZ, align 8, !tbaa !5
  tail call cc10 void bitcast (i8* @base_GHCziShow_zdwshowSignedInt_info to void (i64*, i64*, i64*, i64, i64, i64, i64, i64, i64, i64)*)(i64* %Base_Arg, i64* %ln3rZ, i64* %Hp_Arg, i64 %R1_Arg, i64 0, i64 -12367, i64 add (i64 ptrtoint (i8* @ghczmprim_GHCziTypes_ZMZN_closure to i64), i64 1), i64 undef, i64 undef, i64 %SpLim_Arg) #0
  ret void

LLVM folds the whole computation right into a constant: the i64 -12367 argument of the tail call!

I haven't diagnosed this any further yet.

Trac metadata
Trac field              Value
Version                 8.0.1
Type                    Bug
TypeOfFailure           OtherFailure
Priority                high
Resolution              Unresolved
Component               Compiler (LLVM)
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture