NDP benchmarks
==============

This directory contains several NDP benchmarks:

concomp    - connected components in undirected graphs
dotp       - dot product of two vectors
primes     - sieve of Eratosthenes
smvm       - sparse matrix/vector multiplication

Options
-------

The following options are common to all benchmarks:

  --runs=N                Repeat each benchmark N times
  -r N

  --threads=N             Use N threads
  -t N

  --seq=N                 Simulate N threads
  -s N

  --algo=ALGORITHM        Use the specified algorithm (if the benchmark
  -a ALGORITHM            implements multiple algorithms)

  --verbose=N             Set the verbosity level
  -v N

  --help                  Show a help screen

Running benchmarks
------------------

For parallel benchmarks, you usually want to use

  benchmark --threads=<N> --runs=<R> <INPUT> +RTS -N<T>

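Here, N is the number of threads to use, R is the number of times the
benchmark should be repeated (something between 3 and 10 is usually enough),
INPUT is the benchmark's input, and T is the number of threads given to the
runtime system via +RTS -N (typically the same as N).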
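
As a rough sketch, the command line above can be assembled from a script,
for example to sweep over several thread counts. The helper below is
hypothetical and not part of the benchmark suite; the binary name ./smvm and
the input file matrix.dat are placeholders, and passing the same value to
--threads and +RTS -N is an assumption.

```python
import shlex
from typing import List


def benchmark_command(binary: str, threads: int, runs: int, input_arg: str) -> List[str]:
    """Build the argument list for one parallel benchmark invocation.

    Assumption: the RTS thread count (+RTS -N) is set to the same value
    as --threads.
    """
    if threads < 1 or runs < 1:
        raise ValueError("threads and runs must be positive")
    return [
        binary,
        f"--threads={threads}",
        f"--runs={runs}",
        input_arg,
        "+RTS",
        f"-N{threads}",
    ]


if __name__ == "__main__":
    # Print one command line per thread count; "./smvm" and "matrix.dat"
    # are placeholder names, not files shipped with the benchmarks.
    for n in (1, 2, 4, 8):
        print(shlex.join(benchmark_command("./smvm", n, 5, "matrix.dat")))
```

Each printed line can be pasted into a shell, or the list can be handed to
subprocess.run to execute the benchmark directly.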

The output will look as follows:

  ....: wall_best/cpu_best wall_avg/cpu_avg wall_worst/cpu_worst

Here, wall_{best|avg|worst} are the best, average and worst wall-clock times,
respectively; cpu_{best|avg|worst} are the corresponding CPU times. Note that
for parallel benchmarks on a multiprocessor, the wall-clock time typically
decreases with more threads, whereas the CPU time increases slightly.
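
To compare runs across thread counts, a result line can be split into its
six numbers. The following is a minimal sketch, assuming the label ends at
the last colon and the three wall/cpu pairs are plain decimal numbers
separated by whitespace; the sample line and its values are made up, and
the names Timing, Result and parse_result_line are hypothetical.

```python
from typing import NamedTuple


class Timing(NamedTuple):
    wall: float  # wall-clock time
    cpu: float   # CPU time


class Result(NamedTuple):
    label: str
    best: Timing
    avg: Timing
    worst: Timing


def parse_result_line(line: str) -> Result:
    """Parse 'label: w/c w/c w/c' into best, average and worst timings."""
    label, sep, rest = line.rpartition(":")
    if not sep:
        raise ValueError("no ':' separating label from timings")
    fields = rest.split()
    if len(fields) != 3:
        raise ValueError("expected three wall/cpu pairs")
    timings = []
    for field in fields:
        # Each field is assumed to have the form wall/cpu.
        wall, cpu = field.split("/")
        timings.append(Timing(float(wall), float(cpu)))
    return Result(label.strip(), *timings)


if __name__ == "__main__":
    # Made-up sample line in the layout described above.
    result = parse_result_line("smvm: 120/450 135/460 150/480")
    print(result.label, result.best.wall, result.best.cpu)
```

The ratio of CPU time to wall-clock time in a pair gives a rough indication
of how many threads were kept busy during that run.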

For sequential benchmarks, the number of threads does not have to be
specified, i.e., --threads and +RTS -N can be omitted.

At higher verbosity levels, more information (in particular, the timings of
the individual runs) will be displayed.