New plan for baseline selection of performance metrics
Currently, the plan for baseline selection in the testsuite driver is both complex and unpredictable. Concretely, to find a baseline value for a given test and metric, we do the following:
- we walk backwards from the current commit...
- if we find that a change in the given test was accepted (via the commit message) then we stop the walk and return the recorded measurement if any
- if we find a recorded measurement, we return it
- if we reach a depth limit of 75 commits then we give up
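The walk described above can be sketched roughly as follows. This is an illustrative sketch only, not the testsuite driver's actual code: the names `find_baseline`, `recorded`, and `change_accepted` are hypothetical, and it assumes the walk starts at the first commit yielded by `commits` and that the checks happen in the order listed.

```python
from typing import Callable, Iterable, Optional

DEPTH_LIMIT = 75  # depth limit quoted in the list above


def find_baseline(
    commits: Iterable[str],
    recorded: Callable[[str], Optional[float]],
    change_accepted: Callable[[str], bool],
    depth_limit: int = DEPTH_LIMIT,
) -> Optional[float]:
    """Walk `commits` (newest first) looking for a baseline value.

    `recorded` and `change_accepted` are hypothetical callbacks standing in
    for "look up the measurement stored for this commit" and "does the commit
    message accept a change in this test".
    """
    for depth, commit in enumerate(commits):
        if depth >= depth_limit:
            return None  # gave up: depth limit reached
        if change_accepted(commit):
            # Stop the walk here, whether or not a measurement exists.
            return recorded(commit)
        value = recorded(commit)
        if value is not None:
            return value
    return None  # ran out of history


# A commit whose measurement is missing is silently skipped, so the
# baseline comes from an older commit.
notes = {"c3": 100.0}
history = ["c1", "c2", "c3"]
print(find_baseline(history, notes.get, lambda c: False))
```

In this sketch, a commit with no stored measurement (for example because its notes were never pushed) does not stop the walk, so the returned baseline can come from an arbitrarily older commit within the depth limit.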
This leads to a number of surprising behaviors, particularly when pushing the
git notes at the end of the job fails.
A new plan
To fix this, we decided on the following, less complex plan:
- Every MR uses its base commit as its baseline
- Every Marge batch pushes its metrics before being merged into
This establishes the following invariant: the tip of master always has accurate perf numbers.
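A toy model of the new plan, under loud assumptions: metrics are held in an in-memory dictionary keyed by commit, `master` is a plain list whose last element is the tip, and a batch is merged into master. The names `baseline_for_mr` and `merge_batch` are hypothetical and do not come from the testsuite driver.

```python
from typing import Dict, List, Optional

Metrics = Dict[str, float]


def baseline_for_mr(notes: Dict[str, Metrics], base_commit: str) -> Optional[Metrics]:
    """Look up the baseline at the MR's base commit only; no history walk."""
    return notes.get(base_commit)


def merge_batch(
    notes: Dict[str, Metrics],
    master: List[str],
    batch_commit: str,
    metrics: Metrics,
) -> None:
    """Record the batch's metrics first, then advance the tip of master."""
    notes[batch_commit] = metrics  # push metrics before merging
    master.append(batch_commit)    # only now does the tip move


# Hypothetical test name and values, for illustration only.
notes: Dict[str, Metrics] = {"c0": {"T1/bytes allocated": 100.0}}
master = ["c0"]
merge_batch(notes, master, "c1", {"T1/bytes allocated": 110.0})

# The tip of master has recorded metrics, so an MR based on it finds a baseline.
print(baseline_for_mr(notes, master[-1]))
```

Because the metrics are stored before the tip advances, there is no point in this model at which the tip of master lacks measurements, which is the invariant stated above.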