Bring sanity to our performance testsuite
The GHC testsuite's performance tests tend to be a constant source of busy-work, as they require contributors to manually bump performance numbers and propagate these changes across platforms. Moreover, the testsuite is poor at catching performance regressions: it suffers both from false positives (failures caused by spurious environmental differences) and from false negatives (regressions that slip through the rather generous acceptance windows that many tests have).
Joachim and I briefly discussed this at Hac Phi and came up with the following proposal:
- We rip expected performance numbers out of the
- We replace the existing compiler_stats_num_field test modifier with an implementation that dumps performance metrics to a CSV file
- We introduce a tool to compare this CSV file to metrics associated with an ancestor commit via
- The tool would also be able to add notes
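
To make the comparison step concrete, here is a minimal sketch of what such a tool might do once it has the metrics for both commits in hand. Everything in it is an assumption made for illustration: the CSV layout (a header row with test, metric and value columns), the function names load_metrics and compare_metrics, the test and metric names in the sample data, and the 5% relative tolerance. How the ancestor commit's metrics are stored and retrieved is deliberately left out.

```python
import csv
import io


def load_metrics(csv_text):
    """Parse CSV text with a 'test,metric,value' header (an assumed layout)
    into a dict mapping (test, metric) to a float value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {(row["test"], row["metric"]): float(row["value"]) for row in reader}


def compare_metrics(baseline, current, tolerance=0.05):
    """Return (test, metric, old, new, relative_change) for every metric that
    is present in both runs and whose relative change exceeds the tolerance."""
    changes = []
    for key in sorted(baseline.keys() & current.keys()):
        old, new = baseline[key], current[key]
        if old == 0:
            continue  # relative change is undefined; this sketch skips the metric
        rel = (new - old) / old
        if abs(rel) > tolerance:
            changes.append((key[0], key[1], old, new, rel))
    return changes


# Hypothetical sample data: one metric moves by 10%, the other by 0.5%.
baseline = load_metrics(
    "test,metric,value\nExampleTest,bytes allocated,1000\nExampleTest,peak memory,200\n"
)
current = load_metrics(
    "test,metric,value\nExampleTest,bytes allocated,1100\nExampleTest,peak memory,201\n"
)
for test, metric, old, new, rel in compare_metrics(baseline, current):
    print(f"{test} {metric}: {old:g} -> {new:g} ({rel:+.1%})")
```

With this sample data only the "bytes allocated" change is reported, since the "peak memory" change falls inside the tolerance. Comparing against a measured ancestor rather than against numbers checked into the tree is what would remove the need for manual bumps; a single fixed tolerance is the simplest possible choice here, and a real tool would probably want per-test or per-metric windows.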