
DmdAnal: Don't unbox recursive data types (#11545)

Sebastian Graf requested to merge wip/dmdanal-rec-datacon into master

As Note [Demand analysis for recursive data constructors] describes, we now refrain from unboxing recursive data type arguments, for two reasons:

  1. Relating to run/alloc perf: Similar to Note [CPR for recursive data constructors], it seldom improves run/alloc performance if we just unbox a finite number of layers of a potentially huge data structure.
  2. Relating to ghc/alloc perf: Inductive definitions on single-product recursive data types like the one in T11545 diverge and will accumulate very deep demand signatures before any other abort mechanism in Demand analysis is triggered. That causes a lot of unnecessary churn in Demand analysis when ultimately we will never make use of any nested strictness information anyway.

Conclusion: Discard nested demand and boxity information on such recursive types with the help of Note [Detecting recursive data constructors].
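
To make the idea of "detecting recursive data constructors" more concrete, here is a minimal sketch over a toy model of data declarations. It is not GHC's implementation and does not reproduce the algorithm from the Note; all names (`TyEnv`, `isRecursiveTyCon`, `exampleEnv`) are hypothetical, and type arguments are ignored entirely. It only illustrates the basic check: does unfolding the field types of a type constructor ever lead back to that type constructor?

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical, simplified model: a type constructor is just a name.
type TyConName = String

-- Each data constructor is modelled as the list of type constructors
-- that occur as its field types (type arguments are ignored).
type DataConFields = [TyConName]

-- A toy environment mapping a type constructor to its data constructors.
type TyEnv = [(TyConName, [DataConFields])]

-- Does unfolding the fields of 'root' ever reach 'root' again?
-- Types missing from the environment (e.g. primitives such as Int)
-- are treated as non-recursive leaves.
isRecursiveTyCon :: TyEnv -> TyConName -> Bool
isRecursiveTyCon env root = go [] (fieldsOf root)
  where
    fieldsOf tc = concat (fromMaybe [] (lookup tc env))

    go _ [] = False
    go seen (tc : rest)
      | tc == root     = True
      | tc `elem` seen = go seen rest
      | otherwise      = go (tc : seen) (fieldsOf tc ++ rest)

-- Illustrative declarations only; these are not taken from T11545.
exampleEnv :: TyEnv
exampleEnv =
  [ ("Nat",  [["Nat"]])          -- data Nat  = S Nat  (single product, recursive)
  , ("Pair", [["Int", "Int"]])   -- data Pair = MkPair Int Int
  , ("Even", [["Odd"]])          -- data Even = E Odd  (mutually recursive)
  , ("Odd",  [["Even"]])         -- data Odd  = O Even
  ]
```

In this toy model, `Nat` and `Even` are classified as recursive and `Pair` is not. An analysis following the conclusion above would keep nested demand and boxity information only for types in the second group.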

I also implemented GHC.Types.Unique.MemoFun.memoiseUniqueFun in order to avoid the overhead of repeated calls to GHC.Core.Opt.WorkWrap.Utils.isRecDataCon. It's nice and simple and guards against some smaller regressions in T9233 and T16577.
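
For readers unfamiliar with the technique, the following is a minimal sketch of memoising a function over integer-like keys. It is not the code of `memoiseUniqueFun` and makes no claim about its signature. `Key`, `Memo`, `newMemo`, `lookupMemo` and `cachedKeys` are hypothetical names, and a plain `Int` stands in for a `Unique`. The cache is threaded explicitly, so the sketch stays pure.

```haskell
import qualified Data.IntMap.Strict as IntMap

-- Hypothetical stand-in for a Unique: just an Int key.
type Key = Int

-- A memo table paired with the function whose results it caches.
data Memo a = Memo
  { memoFun   :: Key -> a
  , memoCache :: IntMap.IntMap a
  }

-- Start with an empty cache.
newMemo :: (Key -> a) -> Memo a
newMemo f = Memo f IntMap.empty

-- Look up a key, computing and caching the result on a miss.
-- Returns the result together with the (possibly extended) table.
lookupMemo :: Key -> Memo a -> (a, Memo a)
lookupMemo k memo =
  case IntMap.lookup k (memoCache memo) of
    Just v  -> (v, memo)
    Nothing ->
      let v = memoFun memo k
      in (v, memo { memoCache = IntMap.insert k v (memoCache memo) })

-- Keys whose results are currently cached, in ascending order.
cachedKeys :: Memo a -> [Key]
cachedKeys = IntMap.keys . memoCache
```

With an expensive predicate as `memoFun`, repeated queries for the same key hit the cache instead of recomputing the answer, which is the kind of overhead the paragraph above describes avoiding.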

ghc/alloc performance-wise, this patch is a very clear win:

                               Test    Metric      Old value      New value Change
---------------------------------------------------------------------------------------
                LargeRecord(normal) ghc/alloc  6,141,071,720  6,099,871,216  -0.7%
MultiLayerModulesTH_OneShot(normal) ghc/alloc  2,740,973,040  2,705,146,640  -1.3%
                     T11545(normal) ghc/alloc    945,475,492     85,768,928 -90.9% GOOD
                     T13056(optasm) ghc/alloc    370,245,880    326,980,632 -11.7% GOOD
                     T18304(normal) ghc/alloc     90,933,944     76,998,064 -15.3% GOOD
                     T9872a(normal) ghc/alloc  1,800,576,840  1,792,348,760  -0.5%
                     T9872b(normal) ghc/alloc  2,086,492,432  2,073,991,848  -0.6%
                     T9872c(normal) ghc/alloc  1,750,491,240  1,737,797,832  -0.7%
       TcPlugin_RewritePerf(normal) ghc/alloc  2,286,813,400  2,270,957,896  -0.7%

                          geo. mean                                          -2.9%

There is no noteworthy change in run/alloc.

NoFib results show slight wins, too:

--------------------------------------------------------------------------------
        Program         Allocs    Instrs
--------------------------------------------------------------------------------
    constraints          -1.9%     -1.4%
          fasta          -3.6%     -2.7%
reverse-complem          -0.3%     -0.9%
       treejoin          -0.0%     -0.3%
--------------------------------------------------------------------------------
            Min          -3.6%     -2.7%
            Max          +0.1%     +0.1%
 Geometric Mean          -0.1%     -0.1%
Edited by Sebastian Graf
