DmdAnal: A lazy Demand's SubDemand should be relative to at least one evaluation
This is take 2 on #18885 (closed), which I'll close in favor of this new summary with the insights we gained since then.
Motivation 1: Nested product strictness
Consider

```haskell
module Lib (g) where

f :: (Bool, Bool) -> (Bool, Bool)
f pr = (case pr of (a,b) -> a /= b, True)
{-# NOINLINE f #-}

g :: Int -> (Bool, Bool)
g x = let y = let z = odd x in (z,z) in f y
```
What we would like here is to case-bind `z` instead of let-binding it, simply because we know that if `y` is passed to `f` as formal parameter `pr` and then evaluated in the body of `f`, then `f` will also evaluate both components of `pr`.

Alas, note that `f` is lazy in `pr`, with signature `<ML>`, because `pr` is only evaluated when the first component of the result is forced. So demand analysis lacks the expressive power to propagate that `z` is actually used strictly (assuming an evaluation of its binding context within `y`).
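Both halves of that claim can be checked at runtime. Here is a small self-contained harness (my own example, not from the ticket): `f` is lazy in `pr` overall, yet forcing the first component of its result forces both components of `pr`.

```haskell
import Control.Exception (SomeException, evaluate, try)

f :: (Bool, Bool) -> (Bool, Bool)
f pr = (case pr of (a, b) -> a /= b, True)
{-# NOINLINE f #-}

main :: IO ()
main = do
  -- f is lazy in pr: the second result component never touches pr.
  print (snd (f undefined))  -- True
  -- But forcing the first result component forces both components of pr,
  -- because (/=) on Bool is strict in both arguments:
  r <- try (evaluate (fst (f (True, undefined)))) :: IO (Either SomeException Bool)
  print (either (const "second pair component was forced") show r)
```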
Proposed solution
We could give `f` a demand signature `<MP(1P,1P)>` that says "`f` uses its argument `pr` lazily but at most once (`M`), but assuming at least one such evaluation, it will evaluate both components of the pair strictly." Then we can put a sub-demand of `P(1P,1P)` on the body of `y`, seeing that `z` is indeed used strictly.
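In source Haskell, the transformation we would then be entitled to looks as follows. This is a hand-written sketch of the desired outcome, not compiler output; note the bang pattern, since a source-level `case` with a variable pattern is lazy, unlike Core's `case`.

```haskell
{-# LANGUAGE BangPatterns #-}

f :: (Bool, Bool) -> (Bool, Bool)
f pr = (case pr of (a, b) -> a /= b, True)
{-# NOINLINE f #-}

g, g' :: Int -> (Bool, Bool)
-- today: z is let-bound, i.e. allocated as a thunk
g  x = let y = let z = odd x in (z, z) in f y
-- desired: z is case-bound, evaluated as soon as y's contents are
g' x = let y = case odd x of !z -> (z, z) in f y

main :: IO ()
main = print (g 3 == g' 3 && g 4 == g' 4)
```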
Motivation 2: Eta reduction
In #20040 (closed), we are trying to assess whether it's sound to eta-reduce an argument `ifoldl' (\x -> f x) ...` to `ifoldl' f`. That depends on how `ifoldl'` uses its argument. Here's a simpler version of the particular definition of `ifoldl'`:

```haskell
myfoldl :: (a -> b -> a) -> a -> [b] -> a
myfoldl f z [] = z
myfoldl f !z (x:xs) = myfoldl (\x -> f x) (f z x) xs
```

Today, GHC won't manage to eta-reduce `\x -> f x` in the recursive call to `myfoldl`.
A sufficient condition to enable eta-reduction would be if `myfoldl` put demand `SCS(C1(L))` on `f`, meaning that "`f` is evaluated strictly (possibly multiple times) and is also called strictly with two arguments". Unfortunately, `f` is not used strictly, because of the base case. Thus, the demand signature of `myfoldl` says `LCL(C1(L))`, and it's unsound to eta-reduce based on this signature. The following example demonstrates that:

```haskell
blah :: (Int -> Int -> Int) -> Int -> Int
blah f 0 = 0
blah f 1 = f `seq` 1
blah f x = f (x+1) (x+2) + f (x+3) (x+4)
{-# NOINLINE blah #-}

main = print $ blah (\x -> undefined x) 1
```

`blah` puts demand `LCL(C1(L))` on `f`. But if we eta-reduce `\x -> undefined x` at `blah`'s call site, the program will crash instead of printing `1`.
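The observable difference boils down to `seq`: a lambda such as `\x -> undefined x` is already a value (in WHNF), while its eta-reduced form `undefined` is not. A minimal runnable illustration of just that point (my own example, using only `base`):

```haskell
import Control.Exception (SomeException, evaluate, try)

main :: IO ()
main = do
  -- The lambda is in WHNF, so seq succeeds:
  print ((\x -> undefined x) `seq` (1 :: Int))
  -- The eta-reduced version bottoms under seq:
  r <- try (evaluate ((undefined :: Int -> Int) `seq` (1 :: Int)))
         :: IO (Either SomeException Int)
  print (either (const "crashed") show r)
```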
Proposed solution
The idea is similar to what I propose for nested product demands: differentiate between `myfoldl` and `blah` by

- Giving `f` demand `LCS(C1(L))` in `myfoldl`, meaning "`f` is evaluated 0, 1 or many times (`L`), but if it is evaluated, then it is also called (possibly multiple times, so `S`), and every time it is called in such a way, it is immediately applied to another argument (`1`)."
- Giving `f` demand `LCL(C1(L))` in `blah` (as before), meaning "`f` is evaluated 0, 1 or many times (`L`), but if it is evaluated, then it might be called 0, 1 or many times (`L`), and every time it is called in such a way, it is immediately applied to another argument (`1`)."

NB: Assuming at least one evaluation, `f` might not be called at all, 0 times, in the second case. That is the important distinction that we are trying to make! At the moment, we leave this expressiveness on the table.
General idea
The general idea for the change discussed here is that whenever we have a `Demand` `n :* sd`, and the `Card`inality `n` has a lower bound of 0 (e.g., the whole demand is a lazy demand), then we can assume that the `SubDemand` `sd` actually approximates over (the subset of) all strict evaluation contexts.

Today, a lazy outer demand "pulls down" the immediately nested lower cardinality bounds in `sd` to 0 ("lazifying" them), needlessly destroying information.
Note that when the lower bound is 1, `sd` still applies absolutely, lest we risk regressions. See "Regression tests" below for examples. That makes for a complicated implementation.
Relative vs. Absolute
In #18885 (closed), we say that `sd` applies relative to the assumed strict demand if and only if `n` is lazy.

But the mess in #18885 (closed) also shows that usage demands (e.g., upper bounds) still apply absolutely, independently of lower bounds. So an upper bound of `1` somewhere means that the thing is evaluated at most once wrt. the whole demand. (The exception is `SubDemand`s inside call demands, which always apply relative to exactly one call, corresponding to the notion that lambda bodies are freshly instantiated on each call.) I think the relative/absolute tension will pose quite a few challenges for the implementation.
This ticket is for figuring out the "why" (in fact, I hope it just did) and the "how", in particular without losing expressive power in other scenarios and thus risking regressions.
Regression tests
#18885 (closed) painfully brought up examples that showcase what doesn't work.
Requirement: upper bounds in `sd` in call demands `Cn(sd)` apply relatively

We already have relative call demands; e.g., `SCS(C1(L))` means "eval'd multiple times and called multiple times, and always with two arguments", denoting a use of `f` in an expression like `f 1 2 + f 3 4`. We can use this information to eta-expand `f` to arity 2, for example.

Note that this doesn't imply anything for whether the lower bound (strictness) of `sd` should apply relatively or absolutely.
Requirement: Absolute product (usage) demands
Why are product demands absolute in the first place? The reason is single-entry information. Consider

```haskell
h p 0 = 0
h p y = fst p + snd p + y

blurg x =
  let a = sum [0..x]
      b = sum [1..x]
  in h (a,b) x
```

Today, `h` has demand signature `<LP(ML,ML)><1A>`. This is enough for us to employ call-by-name for the single-entry thunks `a` and `b` at `h`'s call site!
If we had relative product demands (NB: not only strictness, also usage), we'd get the same signature, but exploiting it in the same way would be unsound! Example:
```haskell
h2 p 0 = 0
h2 p y = fst p + fst p + snd p + snd p

blurg2 x =
  let a = sum [0..x]
      b = sum [1..x]
  in h2 (a,b) x
```
Assuming relative product demands, we'd give `h2` the same signature `<LP(ML,ML)><1A>`, because each eval of `p` will evaluate each component at most once. But `a` and `b` are actually evaluated (at most) twice each! So the signature actually means something different, and we thus can't assume call-by-name for `a` and `b`. The `M`/`L` distinction in nested product usage suddenly has become worthless.

Conclusion: We want to stick to absolute product usage.
`plusDmd` must `lub` in lazy demands
Consider
```haskell
i :: Bool -> Bool -> (Int, Int) -> Int
i b b' x = (if b then fst x else 3) + (if b' then snd x else 4)
```
`i` must not get a signature that puts demand `LP(1L,1L)` on the pair `x`, because not all evals of `x` will also use both components strictly; a counter-example is the call site `i True False (1,undefined)`. Although it will evaluate the pair and `1`, it won't evaluate `undefined`. `LP(ML,ML)` is the sound choice here. That implies the following equations for `plusDmd`:

- `MP(1L,A) + MP(A,1L) = LP(ML,ML)` (so we `plusCard` the outer demands and then have to `lub` the lower bounds and `plus` the upper bounds on the nested ones)

whereas if the outer `M`s were strict demands `1`, we still want

- `1P(1L,A) + 1P(A,1L) = SP(1L,1L)` (so `plus` all the way, because both nested demands describe all traces)
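The two equations can be made concrete with a toy executable model of this `plusDmd`. This is my own simplified representation, not GHC's `GHC.Types.Demand`: cardinalities are intervals of bounds, a product demand is one level deep with the components' sub-demands dropped, and the nested combination switches between "`lub` the lower bounds, `plus` the upper bounds" and "`plus` all the way" depending on whether the outer cards are lazy.

```haskell
-- Toy model of the proposed plusDmd, one product level deep.
data Bound = B0 | B1 | BN deriving (Eq, Ord, Show)  -- 0, 1, many

data Card = Card { lo :: Bound, hi :: Bound } deriving (Eq, Show)

cA, c1, cM, cL, cS :: Card
cA = Card B0 B0  -- A: absent
c1 = Card B1 B1  -- 1: exactly once
cM = Card B0 B1  -- M: at most once (lazy)
cL = Card B0 BN  -- L: 0, 1 or many times (lazy)
cS = Card B1 BN  -- S: at least once (strict)

plusBound :: Bound -> Bound -> Bound  -- saturating addition
plusBound B0 b  = b
plusBound b  B0 = b
plusBound _  _  = BN

plusCard :: Card -> Card -> Card      -- both uses happen, one after another
plusCard a b = Card (min B1 (plusBound (lo a) (lo b)))  -- lower bounds cap at 1
                    (plusBound (hi a) (hi b))

isLazy :: Card -> Bool
isLazy c = lo c == B0

-- A product demand: outer card plus cards for the two components.
type PairDmd = (Card, (Card, Card))

plusPairDmd :: PairDmd -> PairDmd -> PairDmd
plusPairDmd (n1, (a1, b1)) (n2, (a2, b2)) =
  (plusCard n1 n2, (comb a1 a2, comb b1 b2))
  where
    comb x y
      | isLazy n1 || isLazy n2
      -- lazy outer card: lub the nested lower bounds, plus the upper bounds
      = Card (min (lo x) (lo y)) (plusBound (hi x) (hi y))
      | otherwise
      = plusCard x y  -- strict outer cards: plus all the way

main :: IO ()
main = do
  -- MP(1L,A) + MP(A,1L) = LP(ML,ML)
  print (plusPairDmd (cM, (c1, cA)) (cM, (cA, c1)) == (cL, (cM, cM)))
  -- 1P(1L,A) + 1P(A,1L) = SP(1L,1L)
  print (plusPairDmd (c1, (c1, cA)) (c1, (cA, c1)) == (cS, (c1, c1)))
```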
"General idea" above says that we want relative treatment of a lazy `Demand`'s nested `SubDemand`s. Can we relax the "lazy" qualifier, thus simplifying our idea? The difference in `plusDmd` hints at the answer: no. Example:
```haskell
fst' :: (a,b) -> a
fst' (x,_) = x
{-# NOINLINE fst' #-}

j p = p `seq` fst' p
```
Today, `j` puts demand `SP(1L,A)` on `p`. We want to see that `j` is strict in the first component of `p`. If we gave the involved strict demands a relative treatment, we'd have to say `SP(ML,A)`, because not every eval of `p` will evaluate the first component.
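A quick runnable check of what `SP(1L,A)` claims (my own harness, with a self-contained `j`): the second component is never touched, but forcing `j`'s result demands the first one.

```haskell
import Control.Exception (SomeException, evaluate, try)

j :: (Int, b) -> Int
j p = p `seq` fst p

main :: IO ()
main = do
  -- A on the second component: undefined there is never demanded.
  print (j (1, undefined))  -- 1
  -- 1L on the first component: it is demanded whenever j's result is forced.
  r <- try (evaluate (j (undefined, 2))) :: IO (Either SomeException Int)
  print (either (const "first component was demanded") show r)
```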