Draft: Experiment: directed coercions + zapping
I'm putting up this MR to get help investigating the impact of zapping directed coercions in the coercion optimiser.
The flags -fkeep-dcoercions and -fzap-dcoercions can be used to choose how the coercion optimiser handles directed coercions. (With this patch, directed coercions are zapped by default, unless -dcore-lint is enabled.)
Some allocation test results:
Test                          Metric          Baseline   With zapping   Change
CoOpt_Singletons(normal)      ghc/alloc    984,469,440  1,992,914,088  +102.4%  BAD
LargeRecord(normal)           ghc/alloc  6,140,072,220  3,327,980,336   -45.8%  GOOD
T12227(normal)                ghc/alloc    479,779,040    293,555,520   -38.8%  GOOD
T13386(normal)                ghc/alloc    887,913,392     40,795,392   -95.4%  GOOD
T15703(normal)                ghc/alloc    532,192,308    390,545,584   -26.6%  GOOD
T16577(normal)                ghc/alloc  7,588,794,744  7,961,537,664    +4.9%  BAD
T18223(normal)                ghc/alloc  1,119,964,672  1,142,598,536    +2.0%  BAD
T5030(normal)                 ghc/alloc    356,224,592    103,302,512   -71.0%  GOOD
T5642(normal)                 ghc/alloc    471,947,420    493,938,064    +4.7%  BAD
T8095(normal)                 ghc/alloc  3,257,820,628    186,612,624   -94.3%  GOOD
T9630(normal)                 ghc/alloc  1,551,652,176  1,585,077,800    +2.2%  BAD
T9872a(normal)                ghc/alloc  1,787,159,152  1,928,660,896    +7.9%  BAD
T9872b(normal)                ghc/alloc  2,082,232,304  2,214,371,392    +6.3%  BAD
T9872b_defer(normal)          ghc/alloc  3,153,955,252  2,282,802,240   -27.6%  GOOD
T9872c(normal)                ghc/alloc  1,728,876,064  1,877,569,608    +8.6%  BAD
T9872d(normal)                ghc/alloc    449,285,296    425,046,152    -5.4%  GOOD
TcPlugin_RewritePerf(normal)  ghc/alloc  2,273,422,028  2,415,191,296    +6.2%  BAD
geo. mean                                                                -7.7%
This is with zapping of directed coercions in the coercion optimiser.
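For anyone wanting to re-check the percentage column, here is a minimal sketch that recomputes it for three rows of the table above. It reads the first figure as the baseline and the second as the result with this patch, which is an assumption about the column order. The helper name `percent_change` is hypothetical and not part of the GHC test driver. The geo. mean row is left out because it presumably averages over more tests than the ones listed.

```python
def percent_change(baseline: int, new: int) -> float:
    """Relative change of `new` against `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# (baseline, new) ghc/alloc figures copied from three rows of the table.
# Assumption: first figure is the baseline, second is with zapping enabled.
rows = {
    "CoOpt_Singletons": (984_469_440, 1_992_914_088),
    "LargeRecord": (6_140_072_220, 3_327_980_336),
    "T13386": (887_913_392, 40_795_392),
}

for name, (baseline, new) in rows.items():
    # Format with an explicit sign and one decimal, like the table does.
    print(f"{name}: {percent_change(baseline, new):+.1f}%")
```

Under that reading, the three rows come out at +102.4%, -45.8% and -95.4%, which matches the reported values.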
Some test cases are still better off with other ways of handling directed coercions. For example:
- CoOpt_Singletons is better off if we hydrate directed coercions: allocations return to the baseline. Note that zapping is also quite bad for compile time here: it takes about 4.5x longer to compile than if we hydrate.
- LargeRecord is better off if we keep directed coercions (neither hydrate nor zap): allocations drop to around 2,400,000,000 (so about -60% instead of -45%).