Skip to content

CallArity regression in NoFib's simple

Originally reported in #19001 (comment 334105), but further investigation reveals that the situation is a bit more complicated than the thread suggests. simple defines a function polynomial. Here it goes:

polynomial ::  (Array (Int, Int) (Array (Int, Int) Double)) -> Int -> (Array Int Double) -> (Array Int Double) -> Double -> Double -> Double
polynomial g degree rho_table theta_table rho_value theta_value =

   let {
       table_search table value =
          let {
              (low, high) = bounds table ;
              search_down i = if value > (table!(i - 1)) then
                                i else search_down (i - 1)
              } in if value > (table!high) then
                     high + 1 else if value <= (table!low) then
                                     low else search_down high ;
       rho_index = table_search rho_table rho_value ;
       theta_index = table_search theta_table theta_value ;
       a = g!(rho_index, theta_index)
       } in foldl (+) 0 [ ((a!(i, j)) * rho_value ^ (i::Int))
                          * theta_value ^ j | i <- [0..degree],
                                              j <- [0..degree] ]

The gist here is that we fail to eta-expand the left fold's go function. It's a case we would think we could catch with CallArity or DmdAnal (after fixing #14816). This is the output of -ddump-call-arity:

polynomial
  = \ (g_awx :: Array (Int, Int) (Array (Int, Int) Double))
      (degree_awy :: Int)
      ... and 4 other args ... ->
      case degree_awy of { GHC.Types.I# y_a1R8 ->
      case GHC.Prim.># 0# y_a1R8 of {
        __DEFAULT ->
          letrec {
            go3_a1Rg [Occ=LoopBreaker] :: GHC.Prim.Int# -> Double -> Double
            [LclId, Arity=1, CallArity=1]
            go3_a1Rg
              = \ (x_a1Rh :: GHC.Prim.Int#) ->
                  let {
                    n_X1 :: Double -> Double
                    n_X1
                      = case GHC.Prim.==# x_a1Rh y_a1R8 of {
                          __DEFAULT -> go3_a1Rg (GHC.Prim.+# x_a1Rh 1#);
                          1# -> id @Double
                        } } in
                  case GHC.Prim.># 0# y_a1R8 of {
                    __DEFAULT ->
                      letrec {
                        go3_X3 [Occ=LoopBreaker] :: GHC.Prim.Int# -> Double -> Double
                        go3_X3
                          = \ (x_X4 :: GHC.Prim.Int#) (v_a1Pa [OS=OneShot] :: Double) ->
                              let {
                                karg_s1Wv :: Double
                                karg_s1Wv = <irrelevant> } in
                              case GHC.Prim.==# x_X4 y_a1R8 of {
                                __DEFAULT -> go3_X3 (GHC.Prim.+# x_X4 1#) karg_s1Wv;
                                1# -> n_X1 karg_s1Wv
                              }; } in
                      go3_X3 0#;
                    1# -> n_X1
                  }; } in
          go3_a1Rg 0# lvl_s1Un;
        1# -> lvl_s1Un
      }
      }

Note that go3_a1Rg has CallArity 1. That means we don't eta-expand it and subsequent demand analysis will fail to unbox basically every binding (elided) in that Double -> Double loop.

The code properly eta-expands on 8.10, because we figure out CallArity 1 for n_X1. So why doesn't CallArity eta-expand today? I don't know. The only difference in 8.10 in the Core that CallArity sees is that n has a different unique; X2g5, not the generic local unique X1. So there is the slight possiblity that this is a shadowing issue, but it's hard to tell. It could also be related to one of the bug fixes in which we made CallArity more conservative?

So this is a regression in the effectiveness of CallArity.

Edited by Sebastian Graf
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information