Better analysis of loop exits

Problem

Consider this code, which comes from bcp_generator in nofib/imaginary/paraffins, just before Phase=0 simplification.

(\a b. case a of                )
(        [] -> b                ) a (go xs)
(        (a:as) -> foldr k  b as)

This will beta-reduce to

let b = go xs
in case a of
     [] -> b
     (a:as) -> foldr k b as

Now because of Note [Inline small things to avoid creating a thunk] in GHC.Core.Opt.Simplify.Utils, we'll inline that b to give

case a of
  [] -> go xs
  (a:as) -> foldr k (go xs) as

We'd duplicated the (go xs) into two branches, but it's small, and we thereby avoid allocating it on the [] path.

But suppose we take the original expression and simplify the lambda first, inlining foldr. (The "make rules win" MR !8897 (merged) has precisely this effect.):

(\a b. case a of                                    )
(        [] -> b                                    ) a (go xs)
(        (a:as) -> letrec go vs = case vs of        )
(                                    [] -> b        )
(                                    (v:vs) -> ...  )
(                   in go as                        )

Now b occurs under a lambda, so Note [Inline small things to avoid creating a thunk] doesn't work; so we end up allocating b on both paths, rather than just in the (a:as) case. Although the saving is in the cold path, in the paraffins benchmark, that somehow meant that a letrec didn't turn into a joinrec -- I'm not quite sure why.

It turned out (again see !8897 (merged)) that a similar small regression in nofib/spectral/cryptarithm2 had precisely the same cause.

Opportunity

The optimiser should not be so fragile. Given:

  let exit = blah   -- Unnecessary allocation!
  in letrec go vs = case vs of
                     []     -> exit
                     (v:vs) -> ...   -- go is not necessarily tail-called
  in ...(go xs)...  -- Not necessarily a tail call!

we wan to spot that exit is evaluated only on the exit of the loop, and inline exit at its occurrence site:

  letrec go vs = case vs of
                     []     -> blah
                     (v:vs) -> ...
  in ...(go xs)...  -- Not necessarily a tail call!

This is higly reminiscent of what the Exitification pass does -- except that go is not a join point.

Idea 1. Surely demand analysis shoud spot that exit is evaluated at most once -- and that's why it can be inlined under a lambda. @sgraf812

Idea 2. Occurrence anaysis already identifies join points. Perhaps it could be modified to identify exits lithe one above, and switch off the "inside lambda" flag on their OccInfo.

The recursive calls don't have to be join points, but they must be saturarated, and no more than one in each branch of a case.
The calls in the body of the letrec don't have to be join points, but again they must be saturated, and no more than one in each branch of a case.

The occurrence of exit must be in tail position. E.g. this is no good:

let xit = blah
in letrec go xs = case xs of
                     [] -> e1
                     (x:xs) -> xit + go xs

Nota bene:

FloatOut might float blah out again if we aren't careful!
If we had Idea 2, maybe that would strictly subsume the entire exitification pass?

Edited Dec 07, 2022 by Simon Peyton Jones

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information