Commit 9c2966dc authored by Sebastian Graf's avatar Sebastian Graf

Add section about benchmark categories to README.md

parent 0675e131
Pipeline #4020 passed with stage
in 5 minutes and 56 seconds
......@@ -150,11 +150,51 @@ If you add a benchmark try to set the problem sizes for
fast/normal/slow reasonably. [Modes](#modes) lists the recommended brackets for
each mode.
### Benchmark Categories
So you have a benchmark to submit but don't know in which subfolder to put it? Here's some
advice on the intended semantics of each category.
#### Single threaded benchmarks
These are run when you just type `make`. Their semantics is explained in
[the Nofib paper](https://link.springer.com/chapter/10.1007%2F978-1-4471-3215-8_17)
(You can find a .ps online, thanks to @bgamari. Alternatively grep for
'Spectral' in docs/paper/paper.verb).
- `imaginary`: Mostly toy benchmarks, solving puzzles like n-queens.
- `spectral`: Algorithmic kernels, like FFT. If you want to add a benchmark of a
library, this most certainly the place to put it.
- `real`: Actual applications, with a command-line interface and all. Because of
the large dependency footprint of today's applications, these have become
rather aged.
- `shootout`: Benchmarks from
[the benchmarks game](https://benchmarksgame-team.pages.debian.net/benchmarksgame/),
formerly known as "language shootout".
Most of the benchmarks are quite old and aren't really written in way one would
write high-performance Haskell code today (e.g., use of `String`, lists,
redefining own list combinators that don't take part in list fusion, rare use of
strictness annotations or unboxed data), so new benchmarks for the `real` and
`spectral` in brackets in particular are always welcome!
#### Other categories
Other than the default single-threaded categories above, there are the
following (SG: I'm guessing here, have never run them):
- `gc`: Run by `make -C gc` (though you'll probably have to edit the Makefile to
your specific config). Select benchmarks from `spectral` and `real`, plus a
few more (Careful, these have not been touched by #15999/!5, see the next
subsection). Testdrives different GC configs, apparently.
- `smp`: Microbenchmarks for the `-threaded` runtime, measuring scheduler
performance on concurrent and STM-heavy code.
### Stability wrt. GC paramerisations
Additionally, pay attention that your benchmarks are stable wrt. different
GC paramerisations, so that small changes in allocation don't lead to big,
unexplicable jumps in performance. See Trac #15999 for details. Also make sure
unexplicable jumps in performance. See #15999 for details. Also make sure
that you run the benchmark with the default GC settings, as enlarging Gen 0 or
Gen 1 heaps just amplifies the problem.
......@@ -164,6 +204,6 @@ working set grows and shrinks (e.g. is approximately constant) over the whole
run of the benchmark. You can ensure this by iterating your main logic $n times
(how often depends on your program, but in the ball park of 100-1000).
You can test stability by plotting productivity curves for your `fast` settings
with the `prod.py` script attached to Trac #15999.
with the `prod.py` script attached to #15999.
If in doubt, ask Sebastian Graf for help.
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment