Skip to content

GHC optimizes Addr# literals to the same

{-# LANGUAGE MagicHash #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE Strict #-}
{-# OPTIONS_GHC -ddump-simpl #-}

import Data.Semigroup
import Data.ByteString.Builder
import Data.ByteString.Internal (accursedUnutterablePerformIO)
import Data.ByteString.Lazy (putStr)
import Data.ByteString.Short (ShortByteString)
import Data.ByteString.Short.Internal (createFromPtr)
import GHC.Exts

data DataA = DataA
  { val1 :: Int,
    val2 :: Int
  }
  deriving (Eq, Show)

toEncodingInt :: Int -> Builder
toEncodingInt = intDec

toEncoding :: DataA -> Builder
toEncoding = let
    xn = unsafePackLenLiteral 7 "\"val1\":"#
    yn = unsafePackLenLiteral 7 "\"val2\":"#
  in \ value_af1X -> case value_af1X of
           DataA x y -> "{" <> shortByteString xn <> toEncodingInt x <> "," <> shortByteString yn <> toEncodingInt y <> "}"

unsafePackLenLiteral :: Int -> Addr# -> ShortByteString
unsafePackLenLiteral len addr# =
    accursedUnutterablePerformIO $ createFromPtr (Ptr addr#) len

main :: IO ()
main =
    Data.ByteString.Lazy.putStr $ toLazyByteString $ toEncoding $ DataA 1 2

When runghcd or compiled without optimizations, this program prints {"val1":1,"val2":2} as one would expect. However when optimized it prints {"val1":1,"val1":2} (val1 twice!).

Why?

Some observations:

  • Removing {-# LANGUAGE Strict #-} makes program work as expected (Also making bindings lazy by adding tilde ~ in front).
  • Changing either literal to be of different length makes program work as expected too. (E.g. changing to yn = unsafePackLenLiteral 8 "\"val2x\":"#).
  • Using unsafePerformIO instead of accursedUnutterablePerformIO works also.

I'd like to understand whether this is a GHC bug, or I'm doing something wrong, What? Why Strict triggers wrong behaviour?).

That is important as bytestring provides unsafePackageLenLiteral implemented as above, using accursedUnutterablePerformIO. cc @Bodigrim

Originally reported in https://github.com/haskell/aeson/issues/967

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information