Skip to content

GHC inlining primitive string literals can affect program output

First noted in #11292 (closed), this program, when compiled with -O1 or higher:

{-# LANGUAGE MagicHash #-}
module Main (main) where

import GHC.Exts (Addr#, isTrue#)
import GHC.Prim (eqAddr#)

data A = A { runA :: Addr# }

a :: A
a = A "a"#

main :: IO ()
main = print (isTrue# (eqAddr# (runA a) (runA a)))

will result in the following after inlining:

Main.main2 =
  case GHC.Prim.tagToEnum#
         @ GHC.Types.Bool (GHC.Prim.eqAddr# "a"# "a"#)
  of _ [Occ=Dead] {
    GHC.Types.False -> GHC.Show.shows26;
    GHC.Types.True -> GHC.Show.shows24
  }

As a result, there are two of the same string constant with different addresses, which causes eqAddr# to return False. If compiled without optimizations, "a"# is not inlined, and as a result, eqAddr# returns True.

Two questions:

  1. Is this okay semantics-wise? Or is this a necessary risk when working with primitive string literals, and should programmers judiciously use {-# NOINLINE #-} with them?
  2. Is this okay from a code duplication standpoint? As Reid Barton noted in #11292 (closed), "a"# is duplicated due to inlining. In this example, not much is duplicated, but if it were a longer string constant, that could result in a noticeable increase in the object file size.
Trac metadata
Trac field Value
Version 7.10.3
Type Bug
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component Compiler
Test case
Differential revisions
BlockedBy
Related #11292 (closed)
Blocking
CC
Operating system
Architecture
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information