Unlifted string literals are atomic in Stg but non-trivial in Core
Consider
f :: Maybe a -> Int
f x = case x of Nothing -> 0
leads to unoptimised STG
Lib.f :: forall a. GHC.Maybe.Maybe a -> GHC.Types.Int
[GblId, Arity=1, Unf=OtherCon []] =
\r [x_sC7]
case x_sC7 of {
GHC.Maybe.Nothing -> GHC.Types.I# [0#];
GHC.Maybe.Just _ [Occ=Dead] ->
Control.Exception.Base.patError "test.hs:10:7-28|case"#;
};
which apparently is a well-formed STG program, so the Addr# literal "test.hs:10:7-28|case"# is deemed atomic because it occurs in argument position. Makes sense to me, as it's quite easy to generate code for it. (Although now that I think about it, such a literal clearly has a referential identity unless we deduplicate in code gen/the linker. Do we??)
On the other hand, the literal "test.hs:10:7-28|case"# is not deemed trivial in Core, for reasons explained in the haddock of litIsTrivial, caused by #12757 (closed) (apparently fixed in 967dd5c9). The ultimate reason appears to be that we do not want to duplicate the literal; exprIsDupable should return False for it. But exprIsDupable doesn't even care about litIsTrivial, it calls litIsDupable instead! Although, Note [exprIsTrivial] says "is true of expressions we are unconditionally happy to duplicate".
There's a deep confusion here, IMO: Is exprIsTrivial supposed to govern duplicability (hence exprIsTrivial e ==> exprIsDupable e) or not?
To me, exprIsTrivial used to serve as a proxy for what is later considered an atomic arg in STG. I suppose the implication exprIsTrivial e ==> stgExprIsAtomic (coreExprToStg e) still holds up, but this little inconsistency makes my life a bit uncomfortable in !10088 (closed), so I think we should sort this out.
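To make the mismatch concrete, here is a toy model of the three predicates. It is only a sketch: the Lit and Expr types, isStgAtom, implies and samples are hypothetical stand-ins, not GHC's definitions, and the toy exprIsDupable ignores the size bound the real one applies. The literal cases merely encode the status quo as described above (string literals are neither trivial nor dupable in Core, yet accepted as STG arguments).

```haskell
-- Toy model only; none of these are GHC's actual definitions.
data Lit = LitInt Integer | LitStr String
  deriving (Eq, Show)

data Expr
  = Var String
  | Lit Lit
  | App Expr Expr
  deriving (Eq, Show)

-- Status quo as described above: string literals are not trivial in Core...
litIsTrivial :: Lit -> Bool
litIsTrivial (LitStr _) = False
litIsTrivial (LitInt _) = True

-- ...and not dupable either, but via a separate predicate.
litIsDupable :: Lit -> Bool
litIsDupable (LitStr _) = False
litIsDupable (LitInt _) = True

exprIsTrivial :: Expr -> Bool
exprIsTrivial (Var _)   = True
exprIsTrivial (Lit l)   = litIsTrivial l
exprIsTrivial (App _ _) = False

-- Simplification: no size bound on applications in this toy version.
exprIsDupable :: Expr -> Bool
exprIsDupable (Var _)   = True
exprIsDupable (Lit l)   = litIsDupable l
exprIsDupable (App f a) = exprIsDupable f && exprIsDupable a

-- Hypothetical stand-in for "allowed in argument position in STG":
-- variables and every literal, string literals included.
isStgAtom :: Expr -> Bool
isStgAtom (Var _)   = True
isStgAtom (Lit _)   = True
isStgAtom (App _ _) = False

implies :: Bool -> Bool -> Bool
implies p q = not p || q

samples :: [Expr]
samples =
  [ Var "x"
  , Lit (LitInt 0)
  , Lit (LitStr "test.hs:10:7-28|case")
  , App (Var "f") (Var "x")
  ]

trivialImpliesDupable, trivialImpliesAtomic, atomicImpliesTrivial :: Bool
trivialImpliesDupable = all (\e -> exprIsTrivial e `implies` exprIsDupable e) samples
trivialImpliesAtomic  = all (\e -> exprIsTrivial e `implies` isStgAtom e) samples
atomicImpliesTrivial  = all (\e -> isStgAtom e `implies` exprIsTrivial e) samples
```

On these samples the two forward implications hold, while atomicImpliesTrivial fails precisely because of the string literal: it is an STG atom without being trivial in Core, which is the inconsistency in question.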
I have a few mutually exclusive ideas:
- Make sure to deduplicate the produced strings in the linker ==> No address confusion / code bloat as in #12757 (closed) and we can treat string literals as trivial and duplicable
- Keep the status quo and document it
- Consider string literals non-atomic in STG (requiring tweaks to CorePrep), disallow non-toplevel LitString in STG, thus making sure that we don't accidentally duplicate literals in a future STG transformation. Maybe we are prone to duplicating string literals even today??
Opinions?