
Unlifted string literals are atomic in STG but non-trivial in Core

Consider

f :: Maybe a -> Int
f x = case x of Nothing -> 0

This leads to the following unoptimised STG,

Lib.f :: forall a. GHC.Maybe.Maybe a -> GHC.Types.Int
[GblId, Arity=1, Unf=OtherCon []] =
    \r [x_sC7]
        case x_sC7 of {
          GHC.Maybe.Nothing -> GHC.Types.I# [0#];
          GHC.Maybe.Just _ [Occ=Dead] ->
              Control.Exception.Base.patError "test.hs:10:7-28|case"#;
        };

which apparently is a well-formed STG program, so the Addr# literal "test.hs:10:7-28|case"# is deemed atomic because it occurs in argument position. That makes sense to me, as it's quite easy to generate code for it. (Although now that I think about it, such a literal clearly has a referential identity unless we deduplicate in code gen or the linker. Do we?)
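One way to probe the referential-identity question is to compare the addresses of two occurrences of the same literal. The following is only a sketch: the names occ1, occ2, sameAddress and contents are made up for illustration. Whether sameAddress comes out True presumably depends on the GHC version, the optimisation level (CSE may common up the two bindings) and the linker, so no expected output is claimed for it.

```haskell
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Ptr (..), unpackCString#)

-- Two separate occurrences of the same primitive string literal.
-- (Hypothetical names, chosen for this sketch only.)
occ1, occ2 :: Ptr ()
occ1 = Ptr "test.hs:10:7-28|case"#
occ2 = Ptr "test.hs:10:7-28|case"#

-- Do both occurrences live at the same address? The answer is not
-- fixed by the language; it depends on the compiler and the linker.
sameAddress :: Bool
sameAddress = occ1 == occ2

-- The bytes behind the address, read as a NUL-terminated string.
contents :: Ptr () -> String
contents (Ptr addr) = unpackCString# addr

main :: IO ()
main = do
  putStrLn ("same contents: " ++ show (contents occ1 == contents occ2))
  putStrLn ("same address:  " ++ show sameAddress)
```

The contents always agree; only the address comparison can tell us whether the two occurrences were shared.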

On the other hand, the literal "test.hs:10:7-28|case"# is not deemed trivial in Core, for reasons explained in the haddock of litIsTrivial, caused by #12757 (closed) (apparently fixed in 967dd5c9). The ultimate reason appears to be that we do not want to duplicate the literal; exprIsDupable should return False for it. But exprIsDupable doesn't even care about litIsTrivial; it calls litIsDupable instead! And yet Note [exprIsTrivial] says that exprIsTrivial "is true of expressions we are unconditionally happy to duplicate".

There's a deep confusion here, IMO: Is exprIsTrivial supposed to govern duplicability (hence exprIsTrivial e ==> exprIsDupable e) or not?

To me, exprIsTrivial used to serve as a proxy for what is later considered an atomic arg in STG. I suppose the implication exprIsTrivial e ==> stgExprIsAtomic (coreExprToStg e) still holds up, but this little inconsistency makes my life a bit uncomfortable in !10088 (closed), so I think we should sort this out.
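To make the two implications concrete, here is a toy model. None of these definitions are GHC's: Expr, isTrivial, isDupable and isAtomicArg are hypothetical stand-ins, written under the assumption that string literals are non-trivial and non-dupable in Core but atomic in STG, as described above.

```haskell
-- Toy model only; these are NOT the definitions used in GHC.
data Lit = LitInt Integer | LitString String
  deriving (Eq, Show)

data Expr
  = Var String
  | Lit Lit
  | App Expr Expr
  deriving (Eq, Show)

-- Stand-in for Core triviality: variables and non-string literals.
isTrivial :: Expr -> Bool
isTrivial (Var _) = True
isTrivial (Lit (LitString _)) = False
isTrivial (Lit _) = True
isTrivial (App _ _) = False

-- Stand-in for dupability: like triviality, plus applications of
-- dupable functions to trivial arguments. String literals are excluded.
isDupable :: Expr -> Bool
isDupable (Var _) = True
isDupable (Lit (LitString _)) = False
isDupable (Lit _) = True
isDupable (App f a) = isDupable f && isTrivial a

-- Stand-in for STG atomicity: variables and *all* literals.
isAtomicArg :: Expr -> Bool
isAtomicArg (Var _) = True
isAtomicArg (Lit _) = True
isAtomicArg (App _ _) = False

implies :: Bool -> Bool -> Bool
implies a b = not a || b

samples :: [Expr]
samples =
  [ Var "x"
  , Lit (LitInt 0)
  , Lit (LitString "test.hs:10:7-28|case")
  , App (Var "f") (Var "x")
  ]

trivialImpliesDupable, trivialImpliesAtomic, atomicImpliesTrivial :: Bool
trivialImpliesDupable = all (\e -> isTrivial e `implies` isDupable e) samples
trivialImpliesAtomic = all (\e -> isTrivial e `implies` isAtomicArg e) samples
atomicImpliesTrivial = all (\e -> isAtomicArg e `implies` isTrivial e) samples

main :: IO ()
main = do
  print trivialImpliesDupable -- True
  print trivialImpliesAtomic  -- True
  print atomicImpliesTrivial  -- False: the string literal is the counterexample
```

In this model both forward implications hold on the samples, and the string literal is the one place where atomic and trivial come apart.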

I have a few mutually exclusive ideas:

  1. Make sure to deduplicate the produced strings in the linker ==> No address confusion / code bloat as in #12757 (closed) and we can treat string literals as trivial and duplicable
  2. Keep the status quo and document it
  3. Consider string literals non-atomic in STG (requiring tweaks to CorePrep) and disallow non-toplevel LitString in STG, thus making sure that we don't accidentally duplicate literals in a future STG transformation. Maybe we are prone to duplicating string literals even today?
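As a rough illustration of what idea 3 could amount to, here is a toy pass over a made-up mini-language (not GHC's STG; Arg, Expr, LitEnv and floatExpr are all hypothetical). It replaces every string-literal argument by a reference to a top-level binder and shares one binder per distinct literal, so that no literal remains in argument position.

```haskell
import Data.List (mapAccumL)
import qualified Data.Map.Strict as Map

-- Made-up mini-language; only the shape loosely resembles STG.
data Arg = ArgVar String | ArgStr String
  deriving (Eq, Show)

data Expr
  = App String [Arg]             -- function applied to atomic args
  | Case String [(String, Expr)] -- scrutinee variable, (constructor, rhs)
  deriving (Eq, Show)

-- Maps literal contents to the top-level binder that holds them.
type LitEnv = Map.Map String String

-- Replace a literal argument by a variable, allocating a fresh
-- top-level binder only the first time a given literal is seen.
-- (Toy naming scheme: no attempt is made to avoid name clashes.)
floatArg :: LitEnv -> Arg -> (LitEnv, Arg)
floatArg env (ArgStr s) =
  case Map.lookup s env of
    Just v -> (env, ArgVar v)
    Nothing ->
      let v = "lit_" ++ show (Map.size env)
       in (Map.insert s v env, ArgVar v)
floatArg env a = (env, a)

floatExpr :: LitEnv -> Expr -> (LitEnv, Expr)
floatExpr env (App f args) =
  let (env', args') = mapAccumL floatArg env args
   in (env', App f args')
floatExpr env (Case x alts) =
  let (env', alts') = mapAccumL floatAlt env alts
   in (env', Case x alts')
  where
    floatAlt e (con, rhs) =
      let (e', rhs') = floatExpr e rhs
       in (e', (con, rhs'))

-- Two branches that mention the same literal.
example :: Expr
example =
  Case "x"
    [ ("Nothing", App "patError" [ArgStr "test.hs:10:7-28|case"])
    , ("Just", App "patError" [ArgStr "test.hs:10:7-28|case"])
    ]

main :: IO ()
main = do
  let (env, e) = floatExpr Map.empty example
  print (Map.toList env) -- one binder, lit_0, for the shared literal
  print e                -- both branches now pass ArgVar "lit_0"
```

A later transformation that duplicates a branch then copies only a variable reference, never the literal itself.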

Opinions?

Edited by Sebastian Graf