CallArity regression in NoFib's simple
Originally reported in #19001 (comment 334105), but further investigation reveals that the situation is a bit more complicated than the thread suggests. simple
defines a function polynomial
. Here it goes:
polynomial :: (Array (Int, Int) (Array (Int, Int) Double)) -> Int -> (Array Int Double) -> (Array Int Double) -> Double -> Double -> Double
polynomial g degree rho_table theta_table rho_value theta_value =
let {
table_search table value =
let {
(low, high) = bounds table ;
search_down i = if value > (table!(i - 1)) then
i else search_down (i - 1)
} in if value > (table!high) then
high + 1 else if value <= (table!low) then
low else search_down high ;
rho_index = table_search rho_table rho_value ;
theta_index = table_search theta_table theta_value ;
a = g!(rho_index, theta_index)
} in foldl (+) 0 [ ((a!(i, j)) * rho_value ^ (i::Int))
* theta_value ^ j | i <- [0..degree],
j <- [0..degree] ]
The gist here is that we fail to eta-expand the left fold's go
function. It's a case we would think we could catch with CallArity or DmdAnal (after fixing #14816). This is the output of -ddump-call-arity
:
polynomial
= \ (g_awx :: Array (Int, Int) (Array (Int, Int) Double))
(degree_awy :: Int)
... and 4 other args ... ->
case degree_awy of { GHC.Types.I# y_a1R8 ->
case GHC.Prim.># 0# y_a1R8 of {
__DEFAULT ->
letrec {
go3_a1Rg [Occ=LoopBreaker] :: GHC.Prim.Int# -> Double -> Double
[LclId, Arity=1, CallArity=1]
go3_a1Rg
= \ (x_a1Rh :: GHC.Prim.Int#) ->
let {
n_X1 :: Double -> Double
n_X1
= case GHC.Prim.==# x_a1Rh y_a1R8 of {
__DEFAULT -> go3_a1Rg (GHC.Prim.+# x_a1Rh 1#);
1# -> id @Double
} } in
case GHC.Prim.># 0# y_a1R8 of {
__DEFAULT ->
letrec {
go3_X3 [Occ=LoopBreaker] :: GHC.Prim.Int# -> Double -> Double
go3_X3
= \ (x_X4 :: GHC.Prim.Int#) (v_a1Pa [OS=OneShot] :: Double) ->
let {
karg_s1Wv :: Double
karg_s1Wv = <irrelevant> } in
case GHC.Prim.==# x_X4 y_a1R8 of {
__DEFAULT -> go3_X3 (GHC.Prim.+# x_X4 1#) karg_s1Wv;
1# -> n_X1 karg_s1Wv
}; } in
go3_X3 0#;
1# -> n_X1
}; } in
go3_a1Rg 0# lvl_s1Un;
1# -> lvl_s1Un
}
}
Note that go3_a1Rg
has CallArity 1. That means we don't eta-expand it and subsequent demand analysis will fail to unbox basically every binding (elided) in that Double -> Double
loop.
The code properly eta-expands on 8.10, because we figure out CallArity
1 for n_X1
. So why doesn't CallArity eta-expand today? I don't know. The only difference in 8.10 in the Core that CallArity sees is that n
has a different unique; X2g5
, not the generic local unique X1
. So there is the slight possiblity that this is a shadowing issue, but it's hard to tell. It could also be related to one of the bug fixes in which we made CallArity more conservative?
So this is a regression in the effectiveness of CallArity.