WIP: Coercion zapping when not building with -dcore-lint
Coercions for even small programs can grow to be quite large (e.g. Trac #8095), especially when type families are involved. For instance, the case of addition of inductive naturals can build coercions quadratic in size of the summands.
In this patch we avoid this quadratic cost by eliminating coercions from Core when they are not needed (e.g. when we aren't Core Linting). Through careful representation we can achieve an asymptotic improvement in typechecking performance in type-family-heavy programs.
See Note [Zapping Coercions]
for details.
Thanks to @goldfire for his inspiration and guidance. Performance
Below I've tabulated the CPU time (in seconds) of ghc -freduction-depth=0 -v0
compiling
the testcase generated by Types.hs
attached to Trac #8095 (specifically the a a
variant) with
and without coercion zapping (building with ghc -dcore-lint -ddrop-coercions
and
ghc -dcore-lint
, respectively):
n | no-zap | zap |
---|---|---|
100 | 0.35 | 0.34 |
200 | 0.56 | 0.41 |
400 | 1.01 | 0.47 |
800 | 2.28 | 0.69 |
1600 | 7.16 | 1.11 |
3200 | 27.27 | 2.09 |
6400 | 121.90 | 4.02 |
Looks pretty linear to me.
The results are even more dramatic when the test is run in the b a
variant:
n | no-zap | zap |
---|---|---|
100 | 0.83 | 0.35 |
200 | 3.03 | 0.35 |
400 | 17.98 | 0.51 |
600 | 53.57 | 0.61 |
800 | 148.76 | 0.70 |
Here we see that compilation times remain nearly zero with this patch long after they become completely untenable under 8.2.2.
The story is pretty similar under the c a
variant:
n | no-zap | zap |
---|---|---|
100 | 0.59 | 0.34 |
200 | 1.62 | 0.38 |
400 | 7.39 | 0.56 |
600 | 19.79 | 0.61 |
800 | 45.47 | 0.94 |
Even the allocation numbers of some testsuite cases benefit pretty significantly:
testcase | allocations | compile time |
---|---|---|
T5030 | -84.8% | -68.7% |
T12227 | -42.1% | -20.4% |
T12545 | -73.1% | -64.4% |