compiler: trace SysTools commands to emit start/stop eventlog markers
This patch was motivated by some performance characterization work done for #16822, where we suspected that GHC was spending a lot of time waiting on the linker to be done. (That turned out to be true.)
The tracing is taken care of by ErrUtils.withTiming, so this patch just defines and uses a little wrapper around that function in all the helpers for calling the various systools.
With this patch, assuming a GHC executable linked against an eventlog-capable RTS (RTS ways that contain the debug, profiling or eventlog way units), we can measure how much time is spent in each of the SysTools when building hello.hs by simply doing:
$ ghc hello.hs -ddump-timings +RTS -l
I can then consume the resulting ghc.eventlog
using e.g ghc-events-analyze, to get these totals:
GC 101370050ns 0.101s
USER EVENTS (user events are corrected for GC)
linker 409582941ns 0.410s
cc 120444730ns 0.120s
as 24958584ns 0.025s
Renamer/typechecker [Main] 9512702ns 0.010s
CoreTidy [Main] 2542244ns 0.003s
CodeGen [Main] 2467007ns 0.002s
Desugar [Main] 694036ns 0.001s
Simplifier [Main] 393834ns 0.000s
Chasing dependencies 345840ns 0.000s
Parser [Main] 180313ns 0.000s
CorePrep [Main] 25112ns 0.000s
TOTAL 571147343ns 0.571s
THREAD EVENTS
1 117016ns 0.000s
IOManager on cap 0:2 239521ns 0.000s
TimerManager:3 45753ns 0.000s
4 35739305ns 0.036s
weak finalizer thread:5 12554ns 0.000s
weak finalizer thread:6 1278ns 0.000s
weak finalizer thread:7 4405ns 0.000s
weak finalizer thread:8 543ns 0.000s
weak finalizer thread:9 652ns 0.000s
weak finalizer thread:10 5673ns 0.000s
weak finalizer thread:11 996ns 0.000s
weak finalizer thread:12 760ns 0.000s
weak finalizer thread:13 2521ns 0.000s
weak finalizer thread:14 397ns 0.000s
15 17588ns 0.000s
16 117942ns 0.000s
17 45125ns 0.000s
18 23969ns 0.000s
19 37799ns 0.000s
20 42626ns 0.000s
weak finalizer thread:21 2559ns 0.000s
22 19631ns 0.000s
23 56874ns 0.000s
weak finalizer thread:24 6272ns 0.000s
25 91591ns 0.000s
26 137428ns 0.000s
27 36401ns 0.000s
28 20374ns 0.000s
29 1735ns 0.000s
weak finalizer thread:30 81ns 0.000s
TOTAL 36829369ns 0.037s
See #16822 for more about using this.