Do more inlining into non-recursive join points
In pursuing #14068, Joachim says found this code:
let {
ds5 :: [[Int]]
ds5 = case ==# x1 ww1 of {
__DEFAULT -> go1 (+# x1 1#);
1# -> n
} } in
join {
lvl6 :: [[Int]]
lvl6 = (ds4 : y) : ds5} in
joinrec {
safe :: Int -> Int -> [Int] -> [[Int]]
safe (x2 :: Int) (d :: Int) (ds6 :: [Int])
= case ds6 of {
[] -> jump lvl6;
: q l ->
case x2 of wild6 { I# x3 ->
case q of { I# y1 ->
case /=# x3 y1 of {
__DEFAULT -> ds5;
1# ->
case d of { I# y2 ->
case /=# x3 (+# y1 y2) of {
__DEFAULT -> ds5;
1# ->
case /=# x3 (-# y1 y2) of {
__DEFAULT -> ds5;
1# ->
jump safe
wild6 (I# (+# y2 1#)) l
}}}}}}
}; } in
jump safe ds4 lvl y; } in ...
This is terrible! lvl6 is a join point, but ds5 is not. And it's easy to fix: we should simply float ds5 into lvl6's rhs.
- *Backgound**. In general, if GHC sees this
-- Version A
x = f x : g y
it'll float out the (f x) and (g y) parts thus (B):
-- Version B
a1 = f x
a2 = g y
x = a1 : a2
Reasons:
- In the scope of this binding we might have
case x of (a:b) -> rhs, and the new form allows us to eliminate the case. - Version A builds a thunk (which, when eval'd) builds thunks for
(f x)and(g y)and returns a cons; whereas Version B builds the two thunks ahead of time and allocates a cons directly. Ifxgets evaluated this is a win, by avoiding an extra thunk. We measured this in our let-floating paper, and B is much better on average.
But if x is a join point, all this is wrong wrong wrong:
- A join point is never scrutinised by a nested case, so there is no benefit in floating.
- A join point is not implemented by a thunk, so there is no thunk alloc/update to avoid.
In fact, for join points we want to turn Version B into Version A!
Specifically:
-
In
Simplify, do not callprepareRhson join RHSs; nor do floating (seetick LetFloatFromLet) from the RHS of a join. In fact this seems to be already the case. -
In
OccurAnal, do not userhsCtxtfor the RHS of a non-recursive join point (seeoccAnalNonRecRhs). Setting this context makesa1anda2get "inside-lam" occurrences (seeoccAnalApp), and that in turn stopsa1anda2getting inlined straight back intox's RHS in Version B. But for join points we **want** them to be inlined, to get back to Version A.For a recursive join point we probably do not want to inline
a1anda2because then they'd be allocated each time round the loop. But in fact we can't have a recursive join point whose RHS is just a cons, so it doesn't really arise. The point is: we only need to take care withrhsCtxtfor non-recursive join points.
Bottom line: just that one fix to OccurAnal should do the job.
Trac metadata
| Trac field | Value |
|---|---|
| Version | 8.2.1 |
| Type | Bug |
| TypeOfFailure | OtherFailure |
| Priority | normal |
| Resolution | Unresolved |
| Component | Compiler |
| Test case | |
| Differential revisions | |
| BlockedBy | |
| Related | |
| Blocking | |
| CC | |
| Operating system | |
| Architecture |