Make CPR not lose sharing by returning an unboxed sum
Consider
data T
= MkT
{ a :: Int
, b :: Int
, c :: Int
, d :: Int
}
ones :: T
ones = MkT 1 1 1 1
twos :: T
twos = MkT 2 2 2 2
fun :: Int -> T
fun x
| even x = ones
| otherwise = twos
{-# NOINLINE fun #-}
We give fun
the CPR property today. That makes worker/wrapper return fun
's result unboxed:
fun
= \ (w_s1xD :: Int) ->
case w_s1xD of { GHC.Types.I# ww1_s1xG ->
case Lib.$wfun ww1_s1xG of
{ (# ww3_s1xM, ww4_s1xN, ww5_s1xO, ww6_s1xP #) ->
Lib.MkT ww3_s1xM ww4_s1xN ww5_s1xO ww6_s1xP
}
}
But now we changed something that allocates O(1) space into something that allocates O(k) space, where k
is the number of call sites! E.g., we lost the sharing of ones
and twos
, which the programmer might have written on purpose.
See also #13331 (comment 325602), #19190, #19276, where this and similar problems come up in practice.
I argue that we shouldn't unbox ones
and twos
above. Does that mean we shouldn't give it the CPR property at all? Not so fast! Consider
fun2 :: Int -> T
fun2 42 = ones
fun2 x = MkT x x x x
Here it would be terrible not to return the fields of MkT
, assuming that most of the call sites probably won't pass 42! We are punishing the common case in favor of the rare. In general, it's hard to estimate whether we are seeing the common case or not, let alone making all kinds of analyses aware of it. Thus I propose to transform as follows instead:
$wfun2 :: Int# -> (# T | (# Int, Int, Int, Int #) #)
$wfun2 42# = (# ones | #)
$wfun2 x = (# | (# I# x, I# x, I# x, I# x #) #)
fun2 :: Int -> T
fun2 (I# x) = case $wfun2 x of
(# t | #) -> t
(# | (# a, b, c, d #) #) -> MkT a b c d
Thus, no sharing is lost! (NB: If ones
and twos
are bound outside the function body, the wrapper can reference them directly, a case of #19240. In that case, we can omit the T
field here.)
Since there are unboxed sums involved, this looks a bit like Sum CPR (#14259), and that's indeed right, I think. In Sum CPR, the CPR lattice becomes a powerset lattice. But saying TopCpr
means we are doomed and will get bad results. But why not still return the constructor fields in the cases we would construct it? Another example:
fun3 :: Int -> [T] -> T
fun3 n ts
| n < length ts = ts !! n
| otherwise = T n n n n
We currently don't give fun3
the CPR property, because we can't cancel away the allocation of the boxes in ts
. And neither should we! But what if n
is often larger than length ts
? Then we always heap-allocate the T
. We could very well get rid of the allocation by doing
$wfun3 :: Int# -> [T] -> (# T | (# Int, Int, Int, Int #) #)
$wfun3 n ts
| n < length ts = (# ts !! n | #) -- modulo int unwrapping etc.
| otherwise = (# | (# I# n, I# n, I# n, I# n #) #)
fun3 :: Int -> [T] -> T
fun3 (I# x) ts = case $wfun2 x ts of
(# t | #) -> t
(# | (# a, b, c, d #) #) -> MkT a b c d
All without ever wasting a box in [T]
!