T12150 is incredibly fragile
Over the past few years I have noticed that T12150 seems to consistently fail due to random fluctuations in performance metrics (the Bytes Allocated metric in particular). This has been particularly bad in recent months. For instance, !2905 languished in the merge queue for months due to several failures of this test.
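The test itself is quite small, allocating only a few dozen megabytes, so small absolute fluctuations show up as comparatively large relative changes. Moreover, it has a very tight (1%) acceptance threshold despite testing what appears to be asymptotic performance behavior, where a genuine regression would presumably move the metric by far more than 1%. We should significantly widen the acceptance threshold and perhaps increase the size of the test to ensure that it still fails if we regress.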
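To illustrate the argument, here is a minimal sketch of a percentage-window acceptance check. It is not the testsuite driver's actual code: the function names and all of the byte counts are hypothetical, chosen only to show how a 1% window reacts to a small fluctuation compared with an asymptotic-style regression.

```python
def deviation_pct(baseline: int, measured: int) -> float:
    """Relative change of `measured` against `baseline`, in percent."""
    return 100.0 * (measured - baseline) / baseline


def within_window(baseline: int, measured: int, window_pct: float) -> bool:
    """True if `measured` lies within +/- `window_pct` percent of `baseline`."""
    return abs(deviation_pct(baseline, measured)) <= window_pct


# Hypothetical numbers, not measurements from T12150.
baseline = 70_000_000    # assumed baseline: roughly 70 MB allocated
noisy = 71_000_000       # assumed noise: 1 MB of extra allocation
regressed = 140_000_000  # assumed regression: allocations double

for window in (1.0, 5.0):
    print(
        f"window {window}%:",
        "noisy run passes" if within_window(baseline, noisy, window) else "noisy run fails",
        "/",
        "regression passes" if within_window(baseline, regressed, window) else "regression fails",
    )
```

Under these assumed numbers, the 1 MB fluctuation is about 1.4% of the baseline, so it fails a 1% window but passes a 5% one, while the doubled allocation fails both. If the test really does guard asymptotic behavior, a wider window should lose little detection power.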