Better sharing of join points
Here is a diagnosis of a major perf regression in text
, discussed in this comment of #19557 (closed)
Diagnosis: join-point inlining
Suppose we have this (I know this isn't the way Stream is actually defined!):
data Stream a = MkStream (T a -> T a) !Size
and the definition
f x = let step = step-fun[x]
in MkStream step (case blah of { A -> s1; B -> s2 })
where step-fun[x]
is a big blob of code mentioning x.
Since MkStream
is strict we'll get this:
f x = let step = step-fun[x]
in join $j s = MkStream step s
in case blah of
A -> $j s1
B -> $j s2
And since the RHS of $j is small, we inline it:
f x = let step = step-fun[x]
in case blah of
A -> MkStream step s1
B -> MkStream step s2
Now suppose we have
g x = case f x of
Stream ustep usize -> unstream-code[ustep,usize]
where unstream-code
is a big blob of code mentioning step
and size
.
We hope to inline f
and eliminate the Stream
:
g x = let step = step-fun[x]
in case (case blah of { A -> MkStream step s1; B -> MkStream step s2 }) of
Stream ustep usize -> unstream-code[ustep,usize]
Again case-of-case fires making a join point for the big unstream-code
:
g x = let step = step-fun[x]
in join $j ustep usize = unstream-code[ustep,usize]
in case blah of
A -> $j step s1
B -> $j step s2
We have successfully cancelled the case on Stream
with the Stream
constructor.
But DISASTER we have lambda-abstracted over step
. In the body of $j
,
ustep
is always step
, but GHC doesn't see that, and hence can't inline
step = step-fun[x]
into $j
. This completely messes up the benefits of fusion.
This is precisely what happens in the repro case for this ticket. The fact that it happens in GHC 9.0 but not in 8.10 is something of a fluke. All the transformations above are perfectly reasonable, and ones that 8.10 will do.
What do to about it
Consider this transformation, above:
join $j s = MkStream step s
in case blah of
A -> $j s1
B -> $j s2
===> inline $j, since RHS is small
case blah of
A -> MkStream step s1
B -> MkStream step s2
This looks innocuous enough, but suppose these two expressions are consumed by
case <hole> of MkStream a b -> rhs[a,b]
If we put the two above expressions in the hole we'll get
join $j2 a b = rhs[a,b]
in join $j s = case MkStream step s of MkStream a b -> $j2 a b
in case blah of
A -> $j s1
B -> $j s2
and
join $j2 a b = rhs[a,b]
in case blah of
A -> case MkStream step s1 of MkStream a b -> $j2 a b
B -> case MkStream step s2 of MkStream a b -> $j2 a b
Crucially, in the first one the join-point we created to avoid duplicating rhs
is used
only once, so we can inline it thus:
join $j s = rhs[step,2]
in case blah of
A -> $j s1
B -> $j s2
and lo! we can now inline step
into rhs
. Not so with the second, where $j2
appears twice.
So one idea is this:
- In
preInlineUnconditionally
only inline a join point if- It occurs exactly once, or
- Its RHS is trivial, or
- Its RHS is a jump to another join point
Creating join points
We also need to take care when creating join points. E.g.
case (case blah of { p1 -> rhs1; ...; pn -> rhsn}) of
(a,b) -> MkStream step[a] s[b]
In Simplify.mkDupableAlt
we current refrain from generating a join point if the case alternative is small. But here we want to generate
join $j a b = MkStream step[a] s[b]
in
case blah of
p1 -> case rhs1 of (a,b) -> $j a b
...
pn -> case rhsn of (a,b) -> $j a b
for the reasons above.