Skip to content

Do more inlining into non-recursive join points

In pursuing #14068, Joachim says found this code:

let {                                           
  ds5 :: [[Int]]                                
  ds5 = case ==# x1 ww1 of {                      
        __DEFAULT -> go1 (+# x1 1#);            
        1# -> n                                 
      } } in                                    
join {                                          
  lvl6 :: [[Int]]                               
  lvl6 = (ds4 : y) : ds5} in     
joinrec {                                       
  safe :: Int -> Int -> [Int] -> [[Int]] 
  safe (x2 :: Int) (d :: Int) (ds6 :: [Int])    
    = case ds6 of {                             
        [] -> jump lvl6;                        
        : q l ->                                
          case x2 of wild6 { I# x3 ->           
          case q of { I# y1 ->                  
          case /=# x3 y1 of {                   
            __DEFAULT -> ds5;                   
            1# ->                               
              case d of { I# y2 ->              
              case /=# x3 (+# y1 y2) of {       
                __DEFAULT -> ds5;               
                1# ->                           
                  case /=# x3 (-# y1 y2) of {   
                    __DEFAULT -> ds5;           
                    1# ->                       
                      jump safe                 
                        wild6 (I# (+# y2 1#)) l 
                  }}}}}}
      }; } in                                   
jump safe ds4 lvl y; } in ...

This is terrible! lvl6 is a join point, but ds5 is not. And it's easy to fix: we should simply float ds5 into lvl6's rhs.

  • *Backgound**. In general, if GHC sees this
-- Version A
x = f x : g y

it'll float out the (f x) and (g y) parts thus (B):

-- Version B
a1 = f x
a2 = g y
x = a1 : a2

Reasons:

  1. In the scope of this binding we might have case x of (a:b) -> rhs, and the new form allows us to eliminate the case.
  2. Version A builds a thunk (which, when eval'd) builds thunks for (f x) and (g y) and returns a cons; whereas Version B builds the two thunks ahead of time and allocates a cons directly. If x gets evaluated this is a win, by avoiding an extra thunk. We measured this in our let-floating paper, and B is much better on average.

But if x is a join point, all this is wrong wrong wrong:

  1. A join point is never scrutinised by a nested case, so there is no benefit in floating.
  2. A join point is not implemented by a thunk, so there is no thunk alloc/update to avoid.

In fact, for join points we want to turn Version B into Version A!

Specifically:

  • In Simplify, do not call prepareRhs on join RHSs; nor do floating (see tick LetFloatFromLet) from the RHS of a join. In fact this seems to be already the case.

  • In OccurAnal, do not use rhsCtxt for the RHS of a non-recursive join point (see occAnalNonRecRhs). Setting this context makes a1 and a2 get "inside-lam" occurrences (see occAnalApp), and that in turn stops a1 and a2 getting inlined straight back into x's RHS in Version B. But for join points we **want** them to be inlined, to get back to Version A.

    For a recursive join point we probably do not want to inline a1 and a2 because then they'd be allocated each time round the loop. But in fact we can't have a recursive join point whose RHS is just a cons, so it doesn't really arise. The point is: we only need to take care with rhsCtxt for non-recursive join points.

Bottom line: just that one fix to OccurAnal should do the job.

Trac metadata
Trac field Value
Version 8.2.1
Type Bug
TypeOfFailure OtherFailure
Priority normal
Resolution Unresolved
Component Compiler
Test case
Differential revisions
BlockedBy
Related
Blocking
CC
Operating system
Architecture
Edited by Joachim Breitner
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information