Skip to content

Fun with the demand analyser

Max writes: I've been trying to speed up the supercompiler, and in the process I've noticed some infelicities in the demand analyser that might be worth looking at. One is #5949 (closed). The other is:

Infelicity 2: a case where demand summarisation hurts us

I found practical examples where summarising the demand transfomer of a function as a single strictness signature hurt us. The examples were code like:

h :: (Int, Int) -> Int -> (Int, Int)
h x y = if y > 10
         then x
         else h (case h x 0 of (y1, y2) -> (y2, y1)) (y + 1)

If, at the innermost use of h, we re-analyse h in the demand C(C(U(LL))) rather than just unleashing the vanilla DmdSig computed from the demand C(C(S)) then we can unbox the pair argument. Currently GHC just gives h the DmdType SU(L) which doesn't eliminate the allocation of the pair at all.

This situation occurs in practice with code like:

c :: M.Map Int Int -> (Int, Int)
c m = M.foldrWithKey (\k v (a, b) -> if k + v > 2 then (a, b) else (b,
a)) (0, 1) m

The difference from my earlier example d is that I'm using the version of foldrWithKey that doesn't call seq. Current GHC generates this inner loop:

Rec {
CPR2.c_go2 [Occ=LoopBreaker]
  :: (GHC.Types.Int, GHC.Types.Int)
     -> Data.Map.Base.Map GHC.Types.Int GHC.Types.Int
     -> (GHC.Types.Int, GHC.Types.Int)
[GblId,
 Arity=2,
 Caf=NoCafRefs,
 Str=DmdType U(L*)S,
 Unf=OtherCon []]
CPR2.c_go2 =
  \ (z_spW :: (GHC.Types.Int, GHC.Types.Int))
    (ds_spU :: Data.Map.Base.Map GHC.Types.Int GHC.Types.Int) ->
    case ds_spU of _ {
      Data.Map.Base.Tip -> z_spW;
      Data.Map.Base.Bin rb_sqq kx_sq2 x_sq9 l_sqj r_sq5 ->
        case kx_sq2 of _ { GHC.Types.I# x1_sqc ->
        case CPR2.c_go2 z_spW r_sq5 of wild2_sqk { (a_sqh, b_sqg) ->
        case x_sq9 of _ { GHC.Types.I# y_sqd ->
        case GHC.Prim.+# x1_sqc y_sqd of sat_sqp { __DEFAULT ->
        case GHC.Prim.># sat_sqp 2 of _ {
          GHC.Types.False ->
            let {
              sat_sqo :: (GHC.Types.Int, GHC.Types.Int)
              [LclId]
              sat_sqo = (b_sqg, a_sqh) } in
            CPR2.c_go2 sat_sqo l_sqj;
          GHC.Types.True -> CPR2.c_go2 wild2_sqk l_sqj
        }
        }
        }
        }
        }
    }
end Rec }

I implemented this but the overhead of reanalysing a function at each occurrence site is prohibitive (with my changes paraffins took 2.5s to compile instead of 0.3s). It does fix the problem though.

A scheme like in "stricterness more relevant" but with let-binding that is polymorphic in the use-site demand might be able to detect the extra strictness here. I think Stefan Holdermanns has a paper describing such a system. But this is anyway much harder to fix than my first infelicity.

Edited by Simon Peyton Jones
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information