Clarify and document the CBV-convention
I'd like to clarify the role of
CbvMarks
-
OtherCon []
unfoldings - Strictness signatures
- Strict data constructors
for a strict function. Relevant tickets
- #21472 (closed), !8148 (closed)
- #14626 (closed)
- #14677 (closed)
- #15155 (closed)
- #21496 (closed)
- #14626 (closed): this relates to wether a top-level constructor is exported properly-tagged.
- #15696 (closed): how untagged pointers can end up in strict constructor fields
- #16559 (closed)
- #14677 (closed)
- #20749
- #22475 (closed)
The olden days
Until a few months ago, even if a function was strict, it could not guarantee that the caller would pass it an evaluated argument, or that (even if it did) that value would be properly tagged.
For example, reverse
is strict in its argument, but we might have
a call
f A xs = []
f B xs = reverse xs
f C xs = reverse (take 2 xs)
Now f
is not strict in xs
(see the A
case), and our long-standing convention is that
f
can just pass xs
on to reverse
in the B
case. But in the C
case we will use
call-by-value to avoid building a thunk, thus
f C xs = case take 2 xs of xs' -> reverse xs'
Bottom line: reverse
cannot guarantee to have an evaluated argument. It might
be passed a thunk.
Passing properly tagged pointers
See
-
Note [Tag Inference]
in GHC.Stg.InferTags -
Note [How untagged pointers can end up in strict fields]
in GHC.Stg.InferTags, and #15696 (closed) -
Note [Strict Worker Ids]
in GHC.Types.Id.Info -
Note [Attaching CBV Marks to ids]
in GHC.Core.Tidy
In the work on tag inference (pointers above), we allow a function to establish a new calling convention: that it must be given a properly tagged pointer (to a value) as its argument. Let's call this the CBV-convention for a particular argument. Notice that
- A function is marked strict in an argument if the function itself is sure to evaluate the argument: strictness says something about the function's semantics.
- A function that uses a CBV-convention for an argument requires that argument to be passed as a properly-tagged value: it is a requirement on the caller.
Strict data constructors (or more precisely the worker for a strict data constructor) have, and must have, a CBV convention for their strict arguments.
It is possible for a strict data constructor not to have a properly-tagged
argument; see Note [How untagged pointers can end up in strict fields]
in GHC.Stg.InferTags, and #15696 (closed). To establish the invariant that struid data constructors always have properly-tagged arguments, the code generator adds extra evals around a strict data constructor, when necessary. This is done in GHC.Stg.InferTags.Rewrite.rewriteTopBinds
.
OtherCon []
unfoldings
What does a OtherCon[] unfolding mean? It means at every use site of the binder, the binder will represent a value in WHNF.
This means it's very tempting to give workers a OtherCon unfolding for lambda binders which represent strict fields.
Since it seems reasonable to assume the field has been evaluated before constructor allocation.
This would allow a seq
on that argument to be dropped.
This is an attractive idea -- after all, the caller should be guaranteeing that it is called by value! But it proves to be unsustainable, as we will see.
However that seq
might be the only thing that makes that function strict.
e.g.
f x = x `seq` Just x
If we decide to give f
a CBV-convention, we might transform to
f x{-Unf=OtherCon []-} = x `seq` Just x
and then drop the seq
, giving
f x{-Unf=OtherCon []-} = Just x
That is fine provide all callers actually use call-by-value. But #21496 (closed) highlights that this is a very problematic assumption and #21460 established that this indeed can go wrong in practice!
So we decided to remove OtherCon
from all lambda binders.
This means the OtherCon
invariant than rather becomes:
For a binder with an OtherCon
unfolding at every use site the binder will represent a value in WHNF.
We may not give lambda binders a OtherCon
unfolding because it's impractical to uphold this invariant across function calls. See the proposals below for details.
Why we need CbvMarks at all?
Why do we need CbvMark
s on any functions at all? Here's an example:
data T = MkT ![Int] Bool
f (MkT xs b) = MkT xs <big-expression>
Just before demand analysis we have this:
f t = case t of MkT xs b -> $WMkT xs <big>
(Remember, we have not yet inlined data constructor wrappers.) Then demand analysis does w/w:
$wf xs b = $WMkT xs <big>
f t = case t of MkT xs b -> $wf xs b
Now we inline that $WMkT
:
$wf xs b = case xs of xs' -> MkT xs <big>
Now, alas, we have no way to eliminate that case xs
, even though
that argument of $wf
came straight out of the incoming MkT
.
Sad. And this really happens in code that takes apart and re-assembles
data structures with strict fields.
So we want to make $wf
into a CBV-convention function (like the worker
MkT
itself), so that
- Inside
$wf
we know that we'll be passed a non-bottom, properly tagged value, so we can make thecase xs
a no-op at runtime. - The callers of
$wf
are responsible for guaranteeing that claim.
Something similar happens in SpecConstr. Suppose we have
foo :: T -> blah
foo t = rhs
baz = ...foo (MkT p q)...
Now SpecConstr specialise foo
, thus
$sfoo xs b = let t = MkT xs b in rhs
RULE forall xs b. foo (MkT xs b) = $sfoo xs b
Here again, the xs
argument to $sfoo
comes straight from a MkT
,
and we don't want to do extra evals on it in the body of $sfoo
.
(I can't quite cook up a specific example here, but I'd like to see one. Note [SpecConstr and evaluated unfoldings]
doesn't give an actual example.)
CbvMarks and StrictWorkerIds
Currently the notion of a CBV-parameter is very squishy. Note [Attaching CBV Marks to ids]
in GHC.Core.Tidy says:
- Before tidy, function arguments which have a call-by-value semantics are identified
by having an
OtherCon[]
unfolding. - During tidy, we transform this information into CBV (call-by-value) marks. The marks themselves then are put onto the function id itself.
So we have two quite different ways to signal CBV-parameters.
When the simplifier inlines a call to a function with a CBV-parameter (either imported, in which case it'll be signalled in the CbvMarks, or local, in which case it'll be signalled in the unfolding), it must be careful not to lose the "eval". I'm far from certain that we are careful.
In some way this can lead to a crash: see #21472 (closed). Sebastian gives a reproducer here.
Putting OtherCon
unfoldings on a lambda binder is extremely flaky, even if the call site
is passing something that comes directly from a strict constructor field.
Suppose
data T = MkT ![Int] Bool
f :: T -> T
f (MkT x b) = MkT x True
So (assuming we do a w/w split) we will end up with
$wf x* = $WMkT x True
f t = case t of MkT x _ -> $wf x
where x*
means "has an OtherCon []
unfolding, and $WMkT
is the wrapper for MkT
that does the eval.
(I'm assuming here that strict constructors have wrappers, as today.)
Now we inline $WMkT
and optimise, dropping the evals (because of that flaky OtherCon []
:
$wf x* = MkT x True
Now consider this call to f
:
f ($WMkT ex eb)
{inline $WMkT} --> case ex of x -> f (MkT x eb)
{inline f} --> case ex of x -> case MkT x eb of MkT x _ -> $wf x
{case of known} --> case ex of x -> $wf x
{eval-of-strict} --> $wf ex
{inline $wf} --> MkT ex True
We have lost the eval. It will get put back the the code generator, but all our faffing with OtherCon []
bought us nothing.
Proposals
So much for the backgound. Here is our proposed new design. NB: parts of it are the same as the current design, but we re-state it here (signalling that it's the status quo) for completeness.
Strict data constructors: no change
We propose to make no change in the way that strict data constructors are handled. Consider this example:
data T = MkT ![Int] Bool
Currently we do this:
-
Make a wrapper
$WMkT x y = case x of x' -> MkT x' y
-
MkT
's worker (the data constructor itself) is lazy, not strict, in all its arguments. The evals are done by the wrapper; the worker is passive. This is important; if we don't do this, the simplifier will drop thecase x
in$WMkT
becausex'
is used strictly byMkT
. -
GHC.CoreToStg.Prep can use eta-expansion to ensure that all strict data constructor applications are either saturated, or applied to no arguments at all (
maybeSaturate
). (Side note:Note [CorePrep Overview]
point (1) claims that Prep saturates all data constructor and primop applications, but that isn't true.) -
GHC.CoreToStg.Prep also adds a binding
MkT = \x y. MkT x y
so that the "no arguments at all" case has something to call (see
mkDataConWorkers
). -
In the code generator (or, rather, late STG), for every call to the worker
MkT
, we wrap an eval for the first argument if we can't statically see that the first argument is evaluated and properly tagged. This is done byGHC.Stg.InferTags.Rewrite.rewriteTopBinds
.
While there is no change here, it does mean that we have a somewhat-subtle Core invariant that we should document (#20749):
- In every Core program, it should be the case that replacing every constructor
application
(MkT xs b)
withcase xs of xs' -> MkT xs' b
does not change the meaning of the program; that is,xs
is non-bottom. (In general, add acase
for each strict field of the constructor.)
We establish the invariant by only introducing MkT
via a call to its
wrapper $WMkT
, which evaluates the argument. That ensures that the
strict field is non-bottom; and all further optimisation should
maintain that semantics.
Although the invariant is easy to establish, it is (sadly) not possible for Lint
to check it. In principle, a buggy Core-to-Core pass could break the invariant,
by introducing a call to the worker MkT
that does not satisfy the invariant.
That is a shortcoming of the plan described here.
An attractive-seeming alternative would be to make the constructor worker strict in its strict argument(s). It's attractive because it might seem that the evals in the wrapper are mutually redundant with the evals added by the code generator. But this approach had serious problems; see "Failed idea: no wrappers for strict data constructors" below.
OtherCon []
unfoldings
Proposal 0: Proposal: Never put an OtherCon []
unfolding on a binder. Indeed only let-binders have an Unfolding; no other binders have an unfolding at all. Rather OtherCon []
is added by the simplifier to occurrences, when in a
scope where that occurrence is konwn to be evaluated. Example:
data TD = A | B | C
case x of y
{ A -> ...(case y of { A -> e1; _ -> e2 })...
; _ -> ...(y `seq` blah)...
- In the
A
branch,y
's occurrences have an unfolding ofA
; that allows us to simplify the innercase y
. - In the
_
branch,y
's occurrences have an unfoldin ofOtherCon [A]
saying thaty
is notA
; that allows us to drop theseq
.
We use this for strict data constructors too:
data T = MkT ![Int] | A | B | C
foo x = case x of y
MkT p -> rhs1
_ -> rhs2
Here,
- In the
_
branch,y
gets anOtherCon [MkT]
unfolding. - In the
MkT
branch,y
gets an unfolding ofMkT p
; andp
gets anOtherCon []
unfolding.
But no binder has an OtherCon []
unfolding; they are purely for occurrences.
Proposal 1: CBV arguments
-
Completely get rid of
StrictWorkerId
andCbvMarks
in the simplifier. -
In the code generator, establish the following calling convention: for "CBV-Ids", every strict, lifted argument is passed as an evaluated, properly-tagged pointer. See below for what a "CBV-Id" is.
-
Whatever we do, we must ensure that the decisions we make about top-level functions are exposed in the interface file, regardless of optimisation level. If a function uses CBV, the caller must honour that calling convention. It's not optional! See #21676
-
GHC.Stg.InferTags.Rewrite.rewriteTopBinds
ensures that all calls meet this calling convention. Supposef
is a CBV-Id, with arity 2 and CbvMarks[MarkedCbv, NotMarkedCbv]
. Then the rewriter- Adds an eval to a saturated call
(f x y)
, thus(case x of x2 -> f x2 y)
unless the rewriter can prove thatx
is properly tagged. - Eta-expand unsaturated calls to
f
; thusf
becomes(\xy. case x of x2 -> f x2 y)
. - Calls of non-CBV-Ids are unaffected, as are CBV-Ids that have no strict boxed arguments.
- Adds an eval to a saturated call
-
When generating code for
-
case x of { p1 -> e1 ... }
: no need to eval ifx
is known to be evaluated and properly tagged -
case x of y -> rhs
: do nothing at all ifx
is known to be evaluated and properly tagged
-
-
I am no longer sure of the role of the tag inference pass. (To discuss.)
-
What Ids should be "CBV-Ids"?. NB: this is a free choice. Everything will work correctly if we say "all" or "none" or anything in between.
One possiblility is "all Ids are CBV-Ids". That is simple. But Andreas (in the dim and distant past) tried (something like) this and found that he got a lot of code bloat from eta-expansion of CBV-Ids.
So instead we propose these Ids as CBV-Ids:
- Constructor workers: this one is actually vital to maintain the invariant that the argument of a strict constructor worker is properly tagged.
- Join points; these are always saturated, so no eta-expansion worries
- Worker-like Ids: workers from worker/wrappr, specialised functions from SpecConstr. Again always saturated --- at least when they are born.
And that's about it. It's simple, catches all the cases that we are currently worried about.
The second bullet of "when generating code" effectively drops redundant cases altogether; but we only
do it at this late stage, not driven by flaky OtherCon []
unfoldings.
Proposal 2: absent fillers for strict fields
Now that we have removed OtherCon
from lambda binders (see "Proposal: CBV arguments" above), we risk producing code which fails with absentError
panics at times.
This is because W/W currently uses OtherCon
as a way to determine that a binder represents a strict field. See
eee498bf and Note [Absent fillers]
2):
Obviously this approach is no longer possible after OtherCon
is no longer present on these binders! Here is the proposal:
In worker/wrapper, use non-bottom absent fillers for absent, banged fields.
Despite the current note in W/W saying this is really nasty it turned out to be not hard at all.
Taking the code from Note [Absent fillers]
this means for this code:
data T = MkT [Int] [Int] ![Int] -- NB: last field is strict
f :: T -> Int# -> blah
f ps w = case ps of MkT xs ys zs -> <body mentioning xs>
Then f gets a strictness sig of <S(L,A,A)>. But we thread through the bangs of the fields so we make a worker $wf
-- RHS size: {terms: 25, types: 21, coercions: 0, joins: 0/2}
$wf_s14m [InlPrag=NOINLINE] :: [Int] -> Int#
[LclId[StrictWorker([])], Arity=1, Str=<SL>]
$wf_s14m
= \ (ww_s14f [Dmd=SL] :: [Int]) ->
case let {
ps_awV [Dmd=S!P(SL,A,A)] :: T
[LclId]
ps_awV
= M.MkT ww_s14f
(GHC.Prim.Panic.absentError @[Int] ...)
(RUBBISH(LiftedRep) @[Int])
} in
let {
w_awW [Occ=Dead, Dmd=A] :: Int#
[LclId]
w_awW = RUBBISH('IntRep) @Int# } in
case ps_awV of
{ MkT xs_awX [Dmd=SL] ys_awY [Dmd=A] zs_awZ [Dmd=A] ->
case GHC.List.$wlenAcc @Int xs_awX 0# of ww1_a14b { __DEFAULT ->
GHC.Types.I# ww1_a14b
}
}
of ww_s14k
{ __DEFAULT ->
case ww_s14k of wild_00 { I# ww_s14n -> ww_s14n }
}
The trick here is that when we make the filler for zs
we check if there was a bang on this field
and if so we create RUBBISH(LiftedRep) @[Int]
instead of an absentError like for the lazy field ys
This means we can still drop the original zs
early potentially freeing memory retained by it and we don't have to pass it as an argument either.
The downside of this approach is if absence analysis is actually wrong then zs
won't fall over with a panic and instead we get segfaults/incorrect results.
I will implement this as part of !8148 (closed)
seq
s in the worker
Proposal 3a: add Consider
f :: T -> Maybe T
f t = case t of MkT xs b -> Just (MkT xs (g b))
Strictness analysis will worker/wrapper to
$wf xs b = let t = MkT xs b in case t of MkT xs b -> Just (MkT xs (g b))
f t = case t of MkT xs b -> $wf xs b
We optimise $wf
to
$wf xs b = Just (MkT xs <blah>)
but alas now
-
$wf
is no longer strict inxs
(rememberMkT
is not strict; and in any case theJust
makes it lazy). - The code generator will add an eval for
xs
around the call toMkT
.
Proposal: during w/w, add an eval, in the worker, for used arguments of strict data constructors:
$wf xs b = case xs of xs' ->
let t = MkT xs' b in
case t of MkT xs b -> Just (MkT xs <blah>)
Now after optimising we get
$wf xs b = case xs of xs' -> Just (MkT xs' <blah>)
Now $wf
is visibly strict in xs
, so:
- the code generator will not add an eval around
MkT
, since there is an enclosing eval - the code generator will make
$wf
into a CBV-Id; and hence will drop thecase xs
in$wf
because the caller will guarantee it is evaluated. See "Proposal: CBV arguments".
seq
s in the specialised function
Proposal 3b: SpecConstr: We should add extra evals for strict arguments in SpecConstr, just as in worker/wrapper.
f t = ....(case t of MkT xs b -> blah)....
Suppose SpecConstr decides to make the following RULE
RULE forall xs b. f (MkT xs b) = $sf xs b
Then the specialised function should have the extra evals:
$sf xs b = case xs of xs' ->
let t = MkT xs b in
...(case t of MkT xs b -> blah)....
(This is sound becuase in the call f (MkT xs b)
we know that xs
is non-bottom.)
This $sf
optimises to
$sf xs b = case xs of xs -> ...(blah)....
and again any uses of xs
in blah
can see that xs
is evaluated. Again, $sf
is strict
so (see "Proposal: CBV arguments") we'll use a CBV convention, and end up dropping the case
.
seq
s in join points arising from case alternatives
Proposal 3c: Eaxctly the same deal for join points: add extra evals for strict arguments. Suppose we have
case e of MkT xs y -> rhs
and we want to make a join point:
join $j xs y = case xs of xs -> rhs in
case e of MkT xs y -> $j xs y
Failed ideas
Failed idea: no wrappers for strict data constructors
data T = MkT ![Int]
At one point I proposed, for
- Not have a wrapper
$WMkT
at all - Mark the worker
MkT
as strict in its first argument. More precisely, in demand analysis changedmdTransformDataConSig
to add a 1A demand on strict fields.
Having a wrapper to add evals seems redundant, becuause the code generator will add them anyway. But it isn't! Suppose we do the case-of-known-constructor transformation:
case MkT x of MkT p -> True ==> True
If MkT
is strict in x
, then the LHS is strict in x
; but the RHS is not.
We'd have to change the transformation to add evals. Similarly case-to-let
case MkT x of r -> Just r ==> let r = MkT x in Just r
Again, the LHS is strict in x
but the RHS is not.
There is similar knock-on effect on exprIsHNF
, exprOkForSpeculation
etc.
In the end, the no-wrapper simplicity seems well outweighed by the extre complexity in multiple other transformations. The fact that constructors are lazy is rather deeply built into GHC!
absentError
Proposal: Give constructor workers always at least a used demand on their strict fields
Alternative Taking the code from Note [Absent fillers]
this means for this code:
data T = MkT [Int] [Int] ![Int] -- NB: last field is strict
f :: T -> Int# -> blah
f ps w = case ps of MkT xs ys zs -> <body mentioning xs>
Then f gets a strictness sig of <S(L,A,L)>.
Since the strict field will no longer be absent we don't generate a filler at all and instead just use the actual field content. This means we will produce a worker like this:
-- RHS size: {terms: 25, types: 21, coercions: 0, joins: 0/2}
$wf_s14m [InlPrag=NOINLINE] :: [Int] -> [Int] -> Int#
[LclId[StrictWorker([])], Arity=2, Str=<SL><L>]
$wf_s14m
= \ (ww_s14f [Dmd=SL] :: [Int])
(ww_zs [Dmd=L] :: [Int]) ->
case let {
ps_awV [Dmd=S!P(SL,A,A)] :: T
[LclId]
ps_awV
= M.MkT ww_s14f
(GHC.Prim.Panic.absentError ...)
ww_zs
} in
let {
w_awW [Occ=Dead, Dmd=A] :: Int#
[LclId]
w_awW = RUBBISH('IntRep) @Int# } in
case ps_awV of
{ MkT xs_awX [Dmd=SL] ys_awY [Dmd=A] zs_awZ [Dmd=A] ->
case GHC.List.$wlenAcc @Int xs_awX 0# of ww1_a14b { __DEFAULT ->
GHC.Types.I# ww1_a14b
}
}
of ww_s14k
{ __DEFAULT ->
case ww_s14k of wild_00 { I# ww_s14n -> ww_s14n }
}
After some reflection I think this isn't a great idea because:
- We will retain zs longer.
- We pass more arguments to the worker for no good reason.
- It makes the strictness signature less precise.
The upside of this option is that if absence analysis is wrong we are more likely to actually get an absentError instead of a segfault.
If we really want to ensure we get an absentError
in case absence error on a strict field goes wrong then we can implement the alternative absentError
proposal in the future.