DmdAnal: Implement Boxity Analysis (#19871)

This patch fixes some abundant reboxing of `DynFlags` in `GHC.HsToCore.Match.Literal.warnAboutOverflowedLit` (which was the topic of #19407) by introducing a Boxity analysis to GHC, done as part of demand analysis. The ad-hoc unboxing decisions previously made in worker/wrapper can now be captured accurately in demand analysis, where the boxity info can propagate through demand signatures.

See the new `Note [Boxity analysis]`. The actual fix for #19407 is described in `Note [No lazy, Unboxed demand in demand signature]`, but `Note [Finalising boxity for demand signature]` is probably a better entry point.

To support the fix for #19407, I had to change (what was) `Note [Add demands for strict constructors]` a bit (now `Note [Unboxing evaluated arguments]`). In particular, we now take care of it in `finaliseBoxity` (which is only called from demand analysis) instead of `wantToUnboxArg`.

I also had to resurrect `Note [Product demands for function body]` and rename it to `Note [Unboxed demand on function bodies returning small products]` to avoid huge regressions in `join004` and `join007`, thereby fixing #4267 again. See the updated Note for details.

A nice side effect is that the worker/wrapper transformation no longer needs to look at strictness info or at other bits such as `InsideInlineableFun` flags (needed for `Note [Do not unbox class dictionaries]`) at all. It simply collects boxity info from argument demands and interprets it with a severely simplified `wantToUnboxArg`. All the smartness is in `finaliseBoxity`, which could be moved to DmdAnal completely if it weren't for the call to `dubiousDataConInstArgTys`, which would be awkward to export.

I spent some time figuring out why `T16197` failed prior to my amendments to `Note [Unboxing evaluated arguments]`. Having figured it out, I minimised it a bit and added `T16197b`, which simply compares computed strictness signatures and thus should be far simpler to eyeball.
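
For readers unfamiliar with the reboxing problem, the following is a minimal sketch of its shape. It is not the code from #19407: `Config`, `report` and `check` are hypothetical names, with `Config` standing in for a large record such as `DynFlags`. Whether a particular GHC version actually unboxes `cfg` here depends on the compiler version and optimisation flags.

```haskell
module Main where

-- Hypothetical stand-in for a large environment record such as DynFlags.
data Config = Config { verbosity :: Int, limit :: Int }

-- Lazy in its Config argument: cfg is inspected in only one branch,
-- so this callee has to receive the record boxed.
report :: Config -> Int -> Int
report cfg n = if n > 0 then limit cfg else n
{-# NOINLINE report #-}

-- Strict in cfg (it always scrutinises `verbosity cfg`), yet it also
-- passes cfg on, boxed, to `report`. Unboxing cfg based on strictness
-- alone would force the worker to rebuild the record before that call.
check :: Config -> Int -> Int
check cfg n
  | verbosity cfg > 0 = report cfg n
  | otherwise         = n
{-# NOINLINE check #-}

main :: IO ()
main = do
  let cfg = Config { verbosity = 1, limit = 10 }
  print (check cfg 5)  -- verbosity > 0 and 5 > 0, so this prints 10
  print (check cfg 0)  -- report returns n unchanged, so this prints 0
```

Compiling with `ghc -O -ddump-simpl` shows whether a worker for `check` takes the fields of `Config` separately and reconstructs the record before calling `report`.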

The 12% ghc/alloc regression in T11545 is due to the additional `Boxity` field in `Poly` and `Prod`, which results in more allocation during `lubSubDmd` and `plusSubDmd`. I checked in the ticky profiles that the number of calls to those functions stayed the same. We can bear such an increase here, as we recently improved this metric by 68% (in b760c1f7).

T18698* regress slightly because more dictionaries are being unboxed, which causes (mostly) Lint to allocate more.

Fixes #19871, #19407, #4267, #16859, #18907 and #13331.

Metric Increase:
    T11545
    T18698a
    T18698b

Metric Decrease:
    T12425
    T16577
    T18223
    T18282
    T4267
    T9961

Changed files:
- compiler/GHC/Builtin/PrimOps.hs (1 addition, 1 deletion)
- compiler/GHC/Core/Make.hs (0 additions, 1 deletion)
- compiler/GHC/Core/Opt/CprAnal.hs (14 additions, 22 deletions)
- compiler/GHC/Core/Opt/DmdAnal.hs (171 additions, 100 deletions)
- compiler/GHC/Core/Opt/Pipeline.hs (1 addition, 0 deletions)
- compiler/GHC/Core/Opt/SpecConstr.hs (1 addition, 1 deletion)
- compiler/GHC/Core/Opt/WorkWrap.hs (1 addition, 4 deletions)
- compiler/GHC/Core/Opt/WorkWrap/Utils.hs (225 additions, 160 deletions)
- compiler/GHC/Driver/Session.hs (6 additions, 0 deletions)
- compiler/GHC/Types/Basic.hs (6 additions, 0 deletions)
- compiler/GHC/Types/Demand.hs (528 additions, 227 deletions)
- compiler/GHC/Types/Id/Make.hs (6 additions, 3 deletions)
- compiler/GHC/Utils/Misc.hs (17 additions, 6 deletions)
- docs/users_guide/using-optimisation.rst (14 additions, 0 deletions)
- testsuite/tests/arityanal/should_compile/Arity04.stderr (8 additions, 8 deletions)
- testsuite/tests/arityanal/should_compile/Arity11.stderr (1 addition, 1 deletion)
- testsuite/tests/arityanal/should_compile/Arity14.stderr (1 addition, 1 deletion)
- testsuite/tests/arityanal/should_compile/T18793.stderr (17 additions, 21 deletions)
- testsuite/tests/cpranal/should_compile/T18109.hs (2 additions, 2 deletions)
- testsuite/tests/cpranal/should_compile/T18109.stderr (18 additions, 18 deletions)