!7844: Draft: Experiment with zapping in the rewriter without directed coercions · Merge requests · Glasgow Haskell Compiler / GHC

sheaf requested to merge sheaf/ghc:zap-coercions into master Mar 23, 2022

In this MR we start off rewriting with a zapped reduction and then propagate the zapping using the coercion combinators.

Here are the performance numbers:

| test (ghc/alloc) | HEAD | dcoercion+zap | zap |
| --- | ---: | ---: | ---: |
| CoOpt_Singletons | 984,469,440 | 1,992,914,088 | 1,843,420,584 |
| LargeRecord | 6,140,072,220 | 3,327,980,336 | 2,191,354,760 |
| T12227 | 479,779,040 | 293,555,520 | 267,812,024 |
| T13386 | 887,913,392 | 40,795,392 | 46,132,320 |
| T15703 | 532,192,308 | 390,545,584 | 1,148,686,400 |
| T16577 | 7,588,794,744 | 7,961,537,664 | 7,841,667,048 |
| T18223 | 1,119,964,672 | 1,142,598,536 | 1,132,690,904 |
| T5030 | 356,224,592 | 103,302,512 | 105,683,712 |
| T5642 | 471,947,420 | 493,938,064 | 490,058,416 |
| T8095 | 3,257,820,628 | 186,612,624 | 526,866,304 |
| T9630 | 1,551,652,176 | 1,585,077,800 | 1,610,447,440 |
| T9872a | 1,787,159,152 | 1,928,660,896 | 578,369,176 |
| T9872b | 2,082,232,304 | 2,214,371,392 | 761,973,416 |
| T9872b_defer | 3,153,955,252 | 2,282,802,240 | 4,174,971,640 |
| T9872c | 1,728,876,064 | 1,877,569,608 | 560,414,352 |
| T9872d | 449,285,296 | 425,046,152 | 834,352,192 |

I think the only positive result here is that this patch does significantly better than !7787 on the LargeRecord test. The results on T9872{a,b,c} look good, but as T9872b_defer and T9872d show, when we actually make use of the coercions, we end up much worse off.
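To make these comparisons easier to read off, here is a small sketch that turns a few rows of the table into percentage changes. The helper names (`rows`, `pctChange`, `report`) are made up for illustration, and reading the dcoercion+zap column as the !7787 result is an assumption.

```haskell
import Text.Printf (printf)

-- Allocation figures (bytes) copied from the table above:
-- (test, HEAD, dcoercion+zap, zap).
rows :: [(String, Integer, Integer, Integer)]
rows =
  [ ("LargeRecord",  6140072220, 3327980336, 2191354760)
  , ("T9872b",       2082232304, 2214371392,  761973416)
  , ("T9872b_defer", 3153955252, 2282802240, 4174971640)
  , ("T9872d",        449285296,  425046152,  834352192)
  ]

-- Percentage change of `new` relative to `base`.
-- Negative means fewer allocations than the baseline.
pctChange :: Integer -> Integer -> Double
pctChange base new =
  100 * (fromIntegral new - fromIntegral base) / fromIntegral base

-- Format with an explicit sign and one decimal place.
fmt :: Double -> String
fmt = printf "%+.1f"

-- One line per test: zap against HEAD, and zap against dcoercion+zap
-- (assumed here to correspond to !7787).
report :: [String]
report =
  [ name ++ ": zap vs HEAD " ++ fmt (pctChange base z)
         ++ "%, zap vs dcoercion+zap " ++ fmt (pctChange dz z) ++ "%"
  | (name, base, dz, z) <- rows
  ]
```

In GHCi, `mapM_ putStrLn report` prints the lines.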

One problem with this approach, in which combinators such as mkTransCo are changed to zap coercions, is that it forces our hand: we have to zap everything aggressively. For example, we don't use TyConAppCo r tc [Refl arg1, Refl arg2, Zapped small_lhs small_rhs] in place of Zapped big_lhs big_rhs, because if we did, a composition of several such coercions would accumulate more and more types and coercions instead of zapping the whole thing.
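The trade-off can be sketched with a toy model. This is not GHC's `Coercion` type: `Ty`, `Co`, `mkTransKeep`, `mkTransZap` and the size functions are hypothetical stand-ins, roles are dropped, and "size" just counts constructors. It only illustrates why a transitivity combinator that keeps structured pieces grows with the length of the chain, while one that zaps to the endpoints does not.

```haskell
-- Toy types: a type constructor applied to arguments.
data Ty = TyCon String [Ty]
  deriving (Eq, Show)

-- Toy coercions (hypothetical; GHC's real constructors carry more data).
data Co
  = Refl Ty
  | TyConAppCo String [Co]  -- structural: keeps one coercion per argument
  | TransCo Co Co           -- explicit transitivity node
  | Zapped Ty Ty            -- remembers only the two endpoint types
  deriving (Eq, Show)

-- Left-hand and right-hand types of a coercion.
coLhs, coRhs :: Co -> Ty
coLhs (Refl t)           = t
coLhs (TyConAppCo tc cs) = TyCon tc (map coLhs cs)
coLhs (TransCo c _)      = coLhs c
coLhs (Zapped l _)       = l
coRhs (Refl t)           = t
coRhs (TyConAppCo tc cs) = TyCon tc (map coRhs cs)
coRhs (TransCo _ c)      = coRhs c
coRhs (Zapped _ r)       = r

-- Constructor counts, used as a crude proxy for allocation.
tySize :: Ty -> Int
tySize (TyCon _ ts) = 1 + sum (map tySize ts)

coSize :: Co -> Int
coSize (Refl t)          = 1 + tySize t
coSize (TyConAppCo _ cs) = 1 + sum (map coSize cs)
coSize (TransCo c1 c2)   = 1 + coSize c1 + coSize c2
coSize (Zapped l r)      = 1 + tySize l + tySize r

-- Keeps both pieces, so structured zapped coercions pile up.
mkTransKeep :: Co -> Co -> Co
mkTransKeep = TransCo

-- Aggressive variant: forget everything except the endpoints.
mkTransZap :: Co -> Co -> Co
mkTransZap c1 c2 = Zapped (coLhs c1) (coRhs c2)

-- One rewriting step: T A B X_i ~ T A B X_(i+1), zapped only in the
-- argument that changes.
step :: Int -> Co
step i = TyConAppCo "T" [Refl (con "A"), Refl (con "B"), Zapped (var i) (var (i + 1))]
  where
    con n = TyCon n []
    var n = TyCon ("X" ++ show n) []

-- Compose n steps (n >= 1, since foldr1 is partial) with a given combinator.
chainWith :: (Co -> Co -> Co) -> Int -> Co
chainWith trans n = foldr1 trans (map step [1 .. n])
```

Comparing `coSize (chainWith mkTransKeep n)` with `coSize (chainWith mkTransZap n)` for growing `n` shows the effect: a single structured step is smaller than its fully zapped form, but only the zapped chain stays the same size as steps are composed.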

Edited Mar 24, 2022 by sheaf
