# NoFib: Haskell Benchmark Suite

This is the root directory of the "NoFib Haskell benchmark suite". It
should be part of a GHC source tree; that is, the `nofib` directory
should be at the same level in the tree as `compiler` and `libraries`.
This makes sure that NoFib picks up the stage 2 compiler from the
surrounding GHC source tree.

You can also clone this repository in isolation, in which case it will
pick `$(which ghc)` or whatever the `HC` environment variable is set to.

Additional information can also be found on
[NoFib's wiki page](https://ghc.haskell.org/trac/ghc/wiki/Building/RunningNoFib).

## Using

<details>
  <summary>Git symlink support for Windows machines</summary>
  
  NoFib uses a few symlinks here and there to share code between benchmarks.
  Git for Windows has had symlink support for some time now, but
  [it may not be enabled by default](https://stackoverflow.com/a/42137273/388010).
  You will notice strange `make boot` failures if it's not enabled for you.
  
  Make sure you follow the instructions in the link to enable symlink support,
  possibly as simple as through `git config core.symlinks true` or cloning with
  `git clone -c core.symlinks=true <URL>`.
</details>

Install [`cabal-install-2.4`](https://www.haskell.org/cabal/download.html) or later.

Then, to run the tests, execute:

```
$ make clean # or git clean -fxd, it's faster
$ # Generates input files for the benchmarks and builds compilation
$ # dependencies for make (ghc -M)
$ make boot
$ # Builds the benchmarks and runs them $NoFibRuns (default: 5) times
$ make
```

This will put the results in the file `nofib-log`. You can pass extra
options to a nofib run using the `EXTRA_HC_OPTS` variable like this:

```
$ make clean
$ make boot
$ make EXTRA_HC_OPTS="-fllvm"
```

To compare the results of multiple runs, save the output in a logfile
and use the program in `./nofib-analyse/nofib-analyse`, for example:

```
...
$ make 2>&1 | tee nofib-log-6.4.2
...
$ make 2>&1 | tee nofib-log-6.6
$ nofib-analyse nofib-log-6.4.2 nofib-log-6.6 | less
```

to generate a comparison of the runs captured in `nofib-log-6.4.2`
and `nofib-log-6.6`. When making comparisons, be careful to ensure
that the things that changed between the builds are only the things
that you _wanted_ to change. There are lots of variables: machine,
GHC version, GCC version, C libraries, static vs. dynamic GMP library,
build options, run options, and probably lots more. To be on the safe
side, make both runs on the same unloaded machine.
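
One way to keep track of these variables is to store a short description of the environment next to each log. The following is only a sketch: `record_env` is a hypothetical helper that is not part of NoFib, and falling back to `ghc` when `HC` is unset is an assumption made for illustration.

```shell
# Print a few of the variables listed above so they can be saved with a log.
# record_env is a hypothetical helper; it is not shipped with NoFib.
record_env() {
  hc="${HC:-ghc}"   # assumption: fall back to plain `ghc` if HC is unset
  echo "machine: $(uname -srm)"
  if command -v "$hc" >/dev/null 2>&1; then
    echo "ghc: $("$hc" --numeric-version)"
  else
    echo "ghc: $hc not found"
  fi
  if command -v gcc >/dev/null 2>&1; then
    echo "gcc: $(gcc -dumpversion)"
  else
    echo "gcc: not found"
  fi
}
```

For example, `record_env > nofib-log-6.6.env` (a file name chosen here for illustration) could be run right before `make 2>&1 | tee nofib-log-6.6`, so that two logs can later be checked for differing environments. It does not cover C libraries, GMP linkage, or build options.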

## Modes

Each benchmark is runnable in three different time `mode`s:

- `fast`: 0.1-0.2s
- `norm`: 1-2s
- `slow`: 5-10s

You can control which mode to run by setting an additional `mode` variable for
`make`. The default is `mode=norm`. Example for `mode=fast`:

```
$ make clean
$ make boot mode=fast
$ make mode=fast
```

Note that the `mode`s set in `make boot` and `make` need to agree. Otherwise you
will get output errors, because `make boot` will generate input files for a
different `mode`. A more DRY way to control the `mode` would be

```
$ make clean
$ export mode=fast
$ make boot
$ make
```
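
If you switch modes often, a small wrapper can make it harder to mix them up. This is a minimal sketch, assuming it is run from the NoFib root directory; `nofib_run` is a hypothetical name and not something NoFib provides.

```shell
# Run a full clean/boot/run cycle with a single mode, so that
# `make boot` and `make` always agree.
# nofib_run is a hypothetical helper; it is not part of NoFib.
nofib_run() {
  nofib_mode="${1:-norm}"   # norm is the documented default
  case "$nofib_mode" in
    fast|norm|slow) ;;
    *) echo "unknown mode: $nofib_mode" >&2; return 1 ;;
  esac
  make clean &&
    make boot mode="$nofib_mode" &&
    make mode="$nofib_mode"
}
```

Calling `nofib_run fast` then runs the same three commands as the `mode=fast` example above, and rejects anything other than `fast`, `norm` or `slow` before touching the build.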

As CPU architectures advance, the above running times may drift, and
occasionally all benchmarks will need adjusting.

Be aware that `nofib-analyse` will ignore a result if it falls below 0.2s.
This is the default of its `-i` option, which is of course incompatible with
`mode=fast`. In that case, you should set `-i` as appropriate, or even
deactivate it with `-i 0`.
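
As a sketch of what this might look like for two `mode=fast` logs: `compare_fast_logs` and the `NOFIB_ANALYSE` override are hypothetical names introduced here, and the default path is the `./nofib-analyse/nofib-analyse` location mentioned in [Using](#using).

```shell
# Compare two logs from mode=fast runs with the 0.2s cut-off disabled.
# compare_fast_logs and NOFIB_ANALYSE are hypothetical names for this sketch.
compare_fast_logs() {
  "${NOFIB_ANALYSE:-./nofib-analyse/nofib-analyse}" -i 0 "$1" "$2"
}
```

For instance, `compare_fast_logs nofib-log-before nofib-log-after | less`, where the two log names are placeholders for your own files.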

## Boot vs. benchmarked GHC

The `nofib-analyse` utility is compiled with the `BOOT_HC` compiler,
which may be different from the GHC being benchmarked.

You can control which GHC you benchmark with the `HC` variable:

```
$ make clean
$ make boot HC=ghc-head
$ make HC=ghc-head 2>&1 | tee nofib-log-ghc-head
```

## Configuration

There are some options you might want to tweak; search for nofib in
`../mk/config.mk`, and override settings in `../mk/build.mk` as usual.

## Extra Metrics: Valgrind

To get instruction counts, memory reads/writes, and "cache misses",
you'll need to get hold of Cachegrind, which is part of
[Valgrind](http://valgrind.org).

You can then pass `-cachegrind` as `EXTRA_RUNTEST_OPTS`. Counting
instructions slows down execution by a factor of ~30. But it's
a deterministic metric, so you can combine it with `NoFibRuns=1`:

```
$ (make EXTRA_RUNTEST_OPTS="-cachegrind" NoFibRuns=1) 2>&1 | tee nofib-log
```

Optionally combine this with `mode=fast`, see [Modes](#modes).

## Extra Packages

Some benchmarks aren't run by default and require extra packages to be
installed for the GHC compiler being tested. These packages include:
 * stm - for smp benchmarks

## Adding benchmarks

If you add a benchmark, try to set the problem sizes for
`fast`/`norm`/`slow` reasonably. [Modes](#modes) lists the recommended brackets for
each mode.

### Categories

When you add a benchmark, you'll have to make a conscious decision about which
(historical) category to put it in. Here's a description of each, taken from the
original [Nofib paper](https://link.springer.com/chapter/10.1007%2F978-1-4471-3215-8_17):

#### Single-threaded

This is the default group; most of the time, you will want to add your benchmark to one of these categories.

- `imaginary`: A toy benchmark. Solvers for puzzles like n queens.
- `spectral`: Algorithmic kernels. If you have a library out of which you have
   extracted a core algorithm, put it here.
- `real`: Actual applications from the "real world". Because most applications
   of today depend on quite a lot of packages, everything in here is rather dated.
- `shootout`: The (toy) benchmarks from the [benchmarks game](https://benchmarksgame-team.pages.debian.net/benchmarksgame/), formerly known as the language shootout.

#### Other modes

In addition to the above single-threaded mode, you can also add your benchmarks to one of the
following categories. I (SG) haven't really run these yet.

- `gc`: Measures the GC.

### Stability wrt. GC parameterisations

Additionally, make sure that your benchmarks are stable wrt. different
GC parameterisations, so that small changes in allocation don't lead to big,
inexplicable jumps in performance. See Trac #15999 for details. Also make sure
that you run the benchmark with the default GC settings, as enlarging Gen 0 or
Gen 1 heaps just amplifies the problem.

As a rule of thumb on how to ensure this: Make sure that your benchmark doesn't
just build up one big data structure and consume it in a final step, but rather
that the working set grows and shrinks (i.e. is approximately constant) over the
whole run of the benchmark. You can ensure this by iterating your main logic `n`
times (how often depends on your program, but in the ballpark of 100-1000).
You can test stability by plotting productivity curves for your `fast` settings
with the `prod.py` script attached to Trac #15999.

If in doubt, ask Sebastian Graf for help.