DmdAnal: Don't unbox recursive data types (#11545)

As Note [Demand analysis for recursive data constructors] describes, we now
refrain from unboxing recursive data type arguments, for two reasons:

  - Relating to run/alloc perf: Similar to
    Note [CPR for recursive data constructors], it seldom improves run/alloc
    performance if we just unbox a finite number of layers of a potentially
    huge data structure.
  - Relating to ghc/alloc perf: Inductive definitions on single-product
    recursive data types like the one in T11545 will (diverge, and) have
    very deep demand signatures before any other abortion mechanism in
    Demand analysis is triggered. That leads to great and unnecessary churn
    in Demand analysis when ultimately we will never make use of any nested
    strictness information anyway.

Conclusion: Discard nested demand and boxity information on such recursive
types with the help of Note [Detecting recursive data constructors].
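
For illustration only, here is a minimal sketch of the kind of type and
definition meant by "single-product recursive data type". The names Stream,
sumAll, sumTo and nats are hypothetical and are not taken from T11545 or from
GHC itself; the sketch makes no claim about the exact demand signature GHC
infers for them.

```haskell
-- Hypothetical single-product recursive type: exactly one constructor,
-- and one of its fields mentions the type itself.
data Stream = Cons Int Stream

-- Inductive definition with no base case. Every value of Stream is
-- infinite (or bottom), so this diverges on every input. It is defined
-- here only to show the shape of such a definition and is never called.
sumAll :: Stream -> Int
sumAll (Cons x xs) = x + sumAll xs

-- Terminating variant that looks at only the first n layers.
sumTo :: Int -> Stream -> Int
sumTo 0 _           = 0
sumTo n (Cons x xs) = x + sumTo (n - 1) xs

-- Infinite stream k, k+1, k+2, ...
nats :: Int -> Stream
nats k = Cons k (nats (k + 1))

main :: IO ()
main = print (sumTo 4 (nats 0))  -- 0 + 1 + 2 + 3
```

Unboxing the outermost Cons of the argument would still leave every further
layer boxed, which is the run/alloc argument made above.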

I also implemented GHC.Types.Unique.MemoFun.memoiseUniqueFun in order to
avoid the overhead of repeated calls to
GHC.Core.Opt.WorkWrap.Utils.isRecDataCon. It's nice and simple and guards
against some smaller regressions in T9233 and T16577.
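
The general idea of memoising a pure function behind a mutable cache can be
sketched as follows. This is not the code of memoiseUniqueFun: memoiseOn,
slowSquare and memoSquare are hypothetical names, and the sketch assumes an
Ord-keyed Data.Map (from the containers package) rather than whatever
structure the real helper uses.

```haskell
import Data.IORef (newIORef, readIORef, atomicModifyIORef')
import qualified Data.Map.Strict as Map
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical helper: wrap a pure function so that each distinct key is
-- computed at most once per cache. The cache is created when 'memoiseOn f'
-- is evaluated, so the result must be shared (e.g. bound at top level).
memoiseOn :: Ord k => (k -> a) -> k -> a
memoiseOn f = unsafePerformIO $ do
  ref <- newIORef Map.empty
  pure $ \k -> unsafePerformIO $ do
    cache <- readIORef ref
    case Map.lookup k cache of
      Just v  -> pure v
      Nothing -> do
        let v = f k
        -- Map.Strict.insert forces v to WHNF before storing it.
        -- A concurrent caller may recompute v, but never sees a wrong value.
        atomicModifyIORef' ref (\m -> (Map.insert k v m, ()))
        pure v
{-# NOINLINE memoiseOn #-}

-- Stand-in for an expensive pure query.
slowSquare :: Int -> Int
slowSquare n = sum (replicate n n)

-- Shared, memoised version; NOINLINE keeps a single cache alive.
memoSquare :: Int -> Int
memoSquare = memoiseOn slowSquare
{-# NOINLINE memoSquare #-}

main :: IO ()
main = mapM_ (print . memoSquare) [3, 3, 4]
```

The wrapper is only safe because the wrapped function is pure: a cache hit and
a recomputation are indistinguishable to the caller.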
ghc/alloc performance-wise, this patch is a very clear win:

Test                                Metric     Baseline value       New value  Change
-------------------------------------------------------------------------------------
LargeRecord(normal)                 ghc/alloc   6,141,071,720   6,099,871,216   -0.7%
MultiLayerModulesTH_OneShot(normal) ghc/alloc   2,740,973,040   2,705,146,640   -1.3%
T11545(normal)                      ghc/alloc     945,475,492      85,768,928  -90.9% GOOD
T13056(optasm)                      ghc/alloc     370,245,880     326,980,632  -11.7% GOOD
T18304(normal)                      ghc/alloc      90,933,944      76,998,064  -15.3% GOOD
T9872a(normal)                      ghc/alloc   1,800,576,840   1,792,348,760   -0.5%
T9872b(normal)                      ghc/alloc   2,086,492,432   2,073,991,848   -0.6%
T9872c(normal)                      ghc/alloc   1,750,491,240   1,737,797,832   -0.7%
TcPlugin_RewritePerf(normal)        ghc/alloc   2,286,813,400   2,270,957,896   -0.7%
geo. mean                                                                       -2.9%

No noteworthy change in run/alloc either.
NoFib results show slight wins, too:

--------------------------------------------------------------------------------
        Program           Allocs    Instrs
--------------------------------------------------------------------------------
    constraints            -1.9%     -1.4%
          fasta            -3.6%     -2.7%
reverse-complem            -0.3%     -0.9%
       treejoin            -0.0%     -0.3%
--------------------------------------------------------------------------------
            Min            -3.6%     -2.7%
            Max            +0.1%     +0.1%
 Geometric Mean            -0.1%     -0.1%