
Consider generating Core directly when deriving `Generic` instances

Deriving `Generic` instances is often slow to compile. There are many open issues related to this, for example #5642, #9557, #16577, and #19204.

One interesting approach to speeding this up is to generate Core directly. Doing so would bypass the renamer, typechecker, and desugarer. Is this an avenue worth exploring?
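To make the cost concrete, here is a hand-written sketch of roughly the kind of instance that `deriving Generic` stands for, using a hypothetical two-constructor type `T`. It is an illustration based on the public `GHC.Generics` API, not the exact code GHC emits; the package name `"main"` in the metadata is an assumption. On the traditional path, syntax of this shape has to be renamed, typechecked, and desugared before it becomes Core.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
module Main (main) where

import GHC.Generics

-- Hypothetical example type; Eq and Show are only for the demo in main.
data T = A Int | B
  deriving (Eq, Show)

-- Hand-written stand-in for what `deriving Generic` would provide.
instance Generic T where
  -- Type-level description: datatype metadata wrapping a sum of constructors.
  type Rep T =
    D1 ('MetaData "T" "Main" "main" 'False)
      ( C1 ('MetaCons "A" 'PrefixI 'False)
          (S1 ('MetaSel 'Nothing 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedLazy)
              (Rec0 Int))
        :+: C1 ('MetaCons "B" 'PrefixI 'False) U1
      )

  -- Convert a value into its generic representation.
  from x = M1 (case x of
    A n -> L1 (M1 (M1 (K1 n)))
    B   -> R1 (M1 U1))

  -- Convert a generic representation back into a value.
  to (M1 rep) = case rep of
    L1 (M1 (M1 (K1 n))) -> A n
    R1 (M1 U1)          -> B

main :: IO ()
main = print (map (to . from) [A 1, B] :: [T])
```

The representation type and both conversion functions grow with the number of fields and constructors, which is presumably why large types are expensive to push through the front end.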

I made !15818 as a proof of concept for this approach. The results are promising:

| N | Type | -O | Alloc (MB, old) | Alloc (MB, new) | Time (s, old) | Time (s, new) |
|---|---|---|---|---|---|---|
| 1 | fields | 0 | 36 | 34 | 0.121 | 0.122 |
| 10 | fields | 0 | 46 | 42 | 0.127 | 0.127 |
| 100 | fields | 0 | 168 | 151 | 0.180 | 0.169 |
| 1000 | fields | 0 | 3,547 | 3,387 | 1.482 | 1.389 |
| 1 | fields | 2 | 42 | 41 | 0.119 | 0.125 |
| 10 | fields | 2 | 65 | 63 | 0.132 | 0.124 |
| 100 | fields | 2 | 356 | 342 | 0.244 | 0.238 |
| 1000 | fields | 2 | 7,058 | 6,913 | 2.398 | 2.287 |
| 1 | ctors | 0 | 36 | 34 | 0.123 | 0.118 |
| 10 | ctors | 0 | 54 | 50 | 0.132 | 0.125 |
| 100 | ctors | 0 | 313 | 277 | 0.337 | 0.261 |
| 1000 | ctors | 0 | 8,212 | 8,071 | 12.012 | 7.928 |
| 1 | ctors | 2 | 41 | 40 | 0.123 | 0.116 |
| 10 | ctors | 2 | 86 | 82 | 0.133 | 0.139 |
| 100 | ctors | 2 | 477 | 444 | 0.435 | 0.360 |
| 1000 | ctors | 2 | 9,332 | 9,285 | 17.387 | 12.080 |

These benchmarks compare the traditional HsSyn derivation path ("old") against the direct Core generation path ("new", enabled with `-fdirect-core-generic-deriving`). Each benchmark compiles a single module containing one data type with the specified number of fields or constructors, deriving `Generic`.
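For a sense of the input shape, a module of the kind described might look like the following at N = 3. This is a hand-written sketch, not taken from the linked gist: the type names, constructor names, and the `Int` field type are assumptions, and both the "fields" and "ctors" shapes are shown in one module only for brevity.

```haskell
{-# LANGUAGE DeriveGeneric #-}
module Main (main) where

import GHC.Generics (Generic, from, to)

-- "3 fields" shape: one constructor with three fields.
-- Eq and Show are derived only so the demo below can compare and print;
-- the benchmark as described derives Generic alone.
data Fields = Fields Int Int Int
  deriving (Eq, Show, Generic)

-- "3 ctors" shape: three nullary constructors.
data Ctors = C1 | C2 | C3
  deriving (Eq, Show, Generic)

-- Round-trip a value through its derived generic representation.
roundTrip :: Generic a => a -> a
roundTrip = to . from

main :: IO ()
main = do
  print (roundTrip (Fields 1 2 3) == Fields 1 2 3)
  print (map roundTrip [C1, C2, C3] == [C1, C2, C3])
```

Scaling the number of fields or constructors up to 1000 gives the larger rows in the table; the round-trip check is just a sanity test that the derived instance behaves the same on either path.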

All times are elapsed (wall clock) and were measured on AArch64 Linux with a stage1 GHC. The exact benchmark is here: https://gist.github.com/tfausak/3a17bb415a836612ca8f070b146785bf.

Edited by Taylor Fausak