Add section about benchmark categories to README.md

9c2966dc · Sebastian Graf · 0675e131 · 9c2966dc
Commit 9c2966dc authored 6 years ago by Sebastian Graf
--- a/README.md
+++ b/README.md
@@ -150,11 +150,51 @@ If you add a benchmark try to set the problem sizes for
 fast/normal/slow reasonably. [Modes](#modes) lists the recommended brackets for
 each mode.

+### Benchmark Categories
+
+So you have a benchmark to submit but don't know in which subfolder to put it? Here's some
+advice on the intended semantics of each category.
+
+#### Single threaded benchmarks
+
+These are run when you just type `make`. Their semantics are explained in
+[the Nofib paper](https://link.springer.com/chapter/10.1007%2F978-1-4471-3215-8_17)
+(you can find a .ps online, thanks to @bgamari; alternatively, grep for
+'Spectral' in docs/paper/paper.verb).
+
+- `imaginary`: Mostly toy benchmarks, solving puzzles like n-queens.
+- `spectral`: Algorithmic kernels, like FFT. If you want to add a benchmark of a
+  library, this is most certainly the place to put it.
+- `real`: Actual applications, with a command-line interface and all. Because of
+  the large dependency footprint of today's applications, these have become
+  rather aged.
+- `shootout`: Benchmarks from
+  [the benchmarks game](https://benchmarksgame-team.pages.debian.net/benchmarksgame/),
+  formerly known as "language shootout".
+
+Most of the benchmarks are quite old and aren't really written in the way one would
+write high-performance Haskell code today (e.g., use of `String`, lists,
+redefining their own list combinators that don't take part in list fusion, rare use of
+strictness annotations or unboxed data), so new benchmarks, for the `real` and
+`spectral` categories in particular, are always welcome!
+
+#### Other categories
+
+Other than the default single-threaded categories above, there are the
+following (SG: I'm guessing here, have never run them):
+
+- `gc`: Run by `make -C gc` (though you'll probably have to edit the Makefile to
+  your specific config). Select benchmarks from `spectral` and `real`, plus a
+  few more (Careful, these have not been touched by #15999/!5, see the next
+  subsection). Test-drives different GC configs, apparently.
+- `smp`: Microbenchmarks for the `-threaded` runtime, measuring scheduler
+  performance on concurrent and STM-heavy code.
+
 ### Stability wrt. GC paramerisations

 Additionally, pay attention that your benchmarks are stable wrt. different 
 GC paramerisations, so that small changes in allocation don't lead to big,
-unexplicable jumps in performance. See Trac #15999 for details. Also make sure
+inexplicable jumps in performance. See #15999 for details. Also make sure
 that you run the benchmark with the default GC settings, as enlarging Gen 0 or
 Gen 1 heaps just amplifies the problem.

@@ -164,6 +204,6 @@ working set grows and shrinks (e.g. is approximately constant) over the whole
 run of the benchmark. You can ensure this by iterating your main logic $n times
 (how often depends on your program, but in the ball park of 100-1000).
 You can test stability by plotting productivity curves for your `fast` settings
-with the `prod.py` script attached to Trac #15999.
+with the `prod.py` script attached to #15999.

 If in doubt, ask Sebastian Graf for help.
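
The advice in the last hunk about iterating the main logic `$n` times can be sketched roughly as follows. This is only an illustration and not code from the repository: `kernel`, the argument order, and the default values are hypothetical stand-ins for a real benchmark's logic and problem sizes. Each iteration depends on the loop index so that GHC cannot share one result across iterations, while the working set stays roughly the same size throughout the run.

```haskell
module Main (main) where

import Control.Monad (forM_)
import Data.List (foldl')
import System.Environment (getArgs)

-- Hypothetical stand-in for a benchmark's main logic. The seed makes every
-- iteration compute something slightly different without changing its size.
kernel :: Int -> Int -> Int
kernel seed size = foldl' step 0 [1 .. size]
  where
    step acc x = acc + (x * seed) `mod` 7

main :: IO ()
main = do
  args <- getArgs
  -- Assumed argument order: iteration count, then problem size.
  let (n, size) = case args of
        [a, b] -> (read a, read b)
        _      -> (100, 10000) -- illustrative defaults only
  -- Repeat the main logic n times so that the working set grows and shrinks
  -- in the same pattern over the whole run.
  forM_ [1 .. n] $ \i ->
    print (kernel i size)
```

The iteration count would come from the benchmark's `fast`/`normal`/`slow` settings, in the ball park of 100-1000 as the text above suggests.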