Skip to content

Better sharing of join points

Here is a diagnosis of a major perf regression in text, discussed in this comment of #19557 (closed)

Diagnosis: join-point inlining

Suppose we have this (I know this isn't the way Stream is actually defined!):

data Stream a = MkStream (T a -> T a) !Size

and the definition

f x = let step = step-fun[x]
      in MkStream step (case blah of { A -> s1; B -> s2 })

where step-fun[x] is a big blob of code mentioning x.

Since MkStream is strict we'll get this:

f x = let step = step-fun[x]
      in join $j s = MkStream step s
      in case blah of
            A -> $j s1
            B -> $j s2

And since the RHS of $j is small, we inline it:

f x = let step = step-fun[x]
      in case blah of
            A -> MkStream step s1
            B -> MkStream step s2

Now suppose we have

g x = case f x of
        Stream ustep usize -> unstream-code[ustep,usize]

where unstream-code is a big blob of code mentioning step and size. We hope to inline f and eliminate the Stream:

g x = let step = step-fun[x]
      in case (case blah of { A -> MkStream step s1; B -> MkStream step s2 }) of
        Stream ustep usize -> unstream-code[ustep,usize]

Again case-of-case fires making a join point for the big unstream-code:

g x = let step = step-fun[x]
      in join $j ustep usize = unstream-code[ustep,usize]
      in case blah of
         A -> $j step s1
         B -> $j step s2

We have successfully cancelled the case on Stream with the Stream constructor. But DISASTER we have lambda-abstracted over step. In the body of $j, ustep is always step, but GHC doesn't see that, and hence can't inline step = step-fun[x] into $j. This completely messes up the benefits of fusion.

This is precisely what happens in the repro case for this ticket. The fact that it happens in GHC 9.0 but not in 8.10 is something of a fluke. All the transformations above are perfectly reasonable, and ones that 8.10 will do.

What do to about it

Consider this transformation, above:

      join $j s = MkStream step s
      in case blah of
            A -> $j s1
            B -> $j s2

===>  inline $j, since RHS is small

      case blah of
            A -> MkStream step s1
            B -> MkStream step s2

This looks innocuous enough, but suppose these two expressions are consumed by

  case <hole> of MkStream a b -> rhs[a,b]

If we put the two above expressions in the hole we'll get

      join $j2 a b = rhs[a,b]
      in join $j s = case MkStream step s of MkStream a b -> $j2 a b
      in case blah of
            A -> $j s1
            B -> $j s2
and
      join $j2 a b = rhs[a,b]
      in case blah of
            A -> case MkStream step s1 of MkStream a b -> $j2 a b
            B -> case MkStream step s2 of MkStream a b -> $j2 a b

Crucially, in the first one the join-point we created to avoid duplicating rhs is used only once, so we can inline it thus:

      join $j s = rhs[step,2]
      in case blah of
            A -> $j s1
            B -> $j s2

and lo! we can now inline step into rhs. Not so with the second, where $j2 appears twice.

So one idea is this:

  • In preInlineUnconditionally only inline a join point if
    • It occurs exactly once, or
    • Its RHS is trivial, or
    • Its RHS is a jump to another join point

Creating join points

We also need to take care when creating join points. E.g.

case (case blah of { p1 -> rhs1; ...; pn -> rhsn}) of
   (a,b)  -> MkStream step[a] s[b]

In Simplify.mkDupableAlt we current refrain from generating a join point if the case alternative is small. But here we want to generate

join $j a b = MkStream step[a] s[b]
in 
case blah of 
   p1 -> case rhs1 of (a,b) -> $j a b
   ...
   pn -> case rhsn  of (a,b) -> $j a b

for the reasons above.

Edited by Simon Peyton Jones
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information