    3bab222c
    DmdAnal: Implement Boxity Analysis (#19871) · 3bab222c
    Sebastian Graf authored and Marge Bot committed
    This patch fixes some excessive reboxing of `DynFlags` in
    `GHC.HsToCore.Match.Literal.warnAboutOverflowedLit` (which was the topic
    of #19407) by introducing a Boxity analysis to GHC, done as part of demand
    analysis. This lets demand analysis accurately capture the ad-hoc unboxing
    decisions that were previously made in worker/wrapper, so that the boxity
    info can propagate through demand signatures.
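
    For illustration only, here is a minimal sketch of the kind of code where
    reboxing can arise. It is not the code from #19407: `Config`, `describe` and
    `check` are hypothetical names, with `Config` standing in for a large product
    such as `DynFlags`. `check` is strict in `cfg` and reads one field, but one
    branch passes the whole record on to an opaque callee. If unboxing were
    decided from strictness alone, a worker taking the fields separately would
    have to rebuild the record before that call. Whether GHC actually unboxes
    here depends on the compiler version and optimisation flags.

```haskell
module Main where

-- Hypothetical record standing in for a large product type such as DynFlags.
data Config = Config { verbosity :: !Int, limit :: !Int }

-- Hypothetical callee that needs the whole record in boxed form.
-- NOINLINE keeps the call opaque, so the caller cannot see which fields it uses.
describe :: Config -> String
describe cfg = "limit=" ++ show (limit cfg)
{-# NOINLINE describe #-}

-- Strict in cfg (it always evaluates `limit cfg`), but the first branch
-- also uses cfg boxed. Unboxing cfg here would force that branch to
-- reallocate a Config before calling describe.
check :: Config -> Int -> String
check cfg n
  | n > limit cfg = describe cfg
  | otherwise     = "ok"

main :: IO ()
main = do
  let cfg = Config 1 10
  putStrLn (check cfg 5)
  putStrLn (check cfg 50)
```

    Comparing the Core from `-ddump-simpl` or the signatures from
    `-ddump-str-signatures` before and after this patch is one way to see
    whether the argument stays boxed.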
    
    See the new `Note [Boxity analysis]`. The actual fix for #19407 is described in
    `Note [No lazy, Unboxed demand in demand signature]`, but
    `Note [Finalising boxity for demand signature]` is probably a better entry-point.
    
    To support the fix for #19407, I had to change (what was)
    `Note [Add demands for strict constructors]` a bit
    (now `Note [Unboxing evaluated arguments]`). In particular, we now take care of
    it in `finaliseBoxity` (which is only called from demand analysis) instead of
    `wantToUnboxArg`.
    
    I also had to resurrect `Note [Product demands for function body]` and rename
    it to `Note [Unboxed demand on function bodies returning small products]` to
    avoid huge regressions in `join004` and `join007`, thereby fixing #4267 again.
    See the updated Note for details.
    
    A nice side-effect is that the worker/wrapper transformation no longer needs to
    look at strictness info and other bits such as `InsideInlineableFun` flags
    (needed for `Note [Do not unbox class dictionaries]`) at all. It simply collects
    boxity info from argument demands and interprets them with a severely simplified
    `wantToUnboxArg`. All the smartness is in `finaliseBoxity`, which could be moved
    to DmdAnal completely, if it weren't for the call to `dubiousDataConInstArgTys`,
    which would be awkward to export.
    
    I spent some time figuring out why `T16197` failed prior to my
    amendments to `Note [Unboxing evaluated arguments]`. Having figured it
    out, I minimised it a bit and added `T16197b`, which simply compares computed
    strictness signatures and thus should be far simpler to eyeball.
    
    The 12% ghc/alloc regression in T11545 is due to the additional `Boxity`
    field in `Poly` and `Prod`, which results in more allocation during `lubSubDmd`
    and `plusSubDmd`. I made sure in the ticky profiles that the number of calls
    to those functions stayed the same. We can bear such an increase here, as we
    recently improved it by 68% (in b760c1f7).
    T18698* regress slightly because there is more unboxing of dictionaries
    happening and that causes Lint (mostly) to allocate more.
    
    Fixes #19871, #19407, #4267, #16859, #18907 and #13331.
    
    Metric Increase:
        T11545
        T18698a
        T18698b
    
    Metric Decrease:
        T12425
        T16577
        T18223
        T18282
        T4267
        T9961