Stabilise nofib runtime measurements

Although D4989 (cf. #15357 (closed)) has landed on nofib master, many benchmarks are still unstable in one way or another. I identified three causes of instability in #5793##15999 (closed). With system overhead mostly out of the equation, two related tasks remain:

1. Identify benchmarks with GC wibbles. Plan: look at how the productivity rate changes as the gen 0 heap (nursery) size increases. A GC-sensitive benchmark should have a non-monotonic or discontinuous productivity-rate-over-nursery-size curve. Fix these by iterating main often enough for the curve to become smooth and monotone.
2. After that, all benchmarks should have a monotonically decreasing instruction count for increasing nursery sizes. If not, there may be another class of benchmarks that I haven't identified yet in #5793. Of these benchmarks, a few, like real/eff/CS, still have highly code-layout-sensitive runtimes. Fix these 'microbenchmarks' by hiding them behind a flag.
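
Both checks above reduce to testing whether a measured curve over nursery sizes is monotone. The following is a minimal sketch of such a check, not part of nofib. It assumes the measurements have already been collected as (nursery size, value) pairs, for example by running a benchmark with different `+RTS -A<size>` settings. The function name, the 1% tolerance, and the sample numbers are all hypothetical. It only flags steps in the wrong direction; it does not detect discontinuities in an otherwise monotone curve.

```python
from typing import List, Sequence, Tuple


def monotonicity_violations(
    samples: Sequence[Tuple[int, float]],
    increasing: bool = True,
    rel_tol: float = 0.01,
) -> List[Tuple[int, int]]:
    """Return adjacent nursery-size pairs where the curve moves the wrong way.

    samples    -- (nursery_size, measured_value) pairs, in any order
    increasing -- True if the value is expected to rise with nursery size
                  (an assumption for productivity), False if it is expected
                  to fall (e.g. instruction counts)
    rel_tol    -- hypothetical noise threshold: steps in the wrong direction
                  smaller than this fraction of the previous value are ignored
    """
    ordered = sorted(samples)  # sort by nursery size
    violations = []
    for (size_a, val_a), (size_b, val_b) in zip(ordered, ordered[1:]):
        # Positive delta means the curve moved in the expected direction.
        delta = (val_b - val_a) if increasing else (val_a - val_b)
        if delta < -rel_tol * abs(val_a):
            violations.append((size_a, size_b))
    return violations


# Made-up numbers for illustration only (nursery size in MB).
productivity = [(1, 0.80), (2, 0.85), (4, 0.78), (8, 0.90)]
instructions = [(1, 1000.0), (2, 990.0), (4, 985.0)]

gc_wibbles = monotonicity_violations(productivity, increasing=True)
instr_wibbles = monotonicity_violations(instructions, increasing=False)
print(gc_wibbles)     # the dip between the 2 MB and 4 MB runs is flagged
print(instr_wibbles)  # no violations: counts fall steadily
```

A benchmark with a non-empty result for the productivity curve would be a candidate for more iterations of main; one that still has a non-empty result for instruction counts afterwards would point to a cause not yet covered by the list above.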

Edited Mar 10, 2019 by Sebastian Graf
