LIKWID
likwid-bench

Information

likwid-bench is a suite of low-level (assembly) benchmarks that measure bandwidth and instruction throughput for specific instruction code on x86 systems. The included benchmarks cover common data access patterns such as load and store, as well as computational kernels such as vector triad and sum. likwid-bench includes architecture-specific benchmarks for x86, x86_64 and x86 for Intel Xeon Phi coprocessors. The performance values can either be calculated by likwid-bench itself or measured with hardware performance counters by using likwid-perfctr as a wrapper around likwid-bench. The latter requires building likwid-bench with instrumentation enabled in config.mk (INSTRUMENT_BENCH).
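As a rough illustration of what "calculated by likwid-bench" can mean for a bandwidth figure, the sketch below divides the total data volume by the runtime. The function name, its parameters and the MByte/s unit (10^6 bytes) are assumptions made for this example; the exact formula used by likwid-bench is not stated on this page.

```python
def bandwidth_mbytes_per_s(bytes_per_iteration: int,
                           iterations: int,
                           runtime_s: float) -> float:
    """Return the data volume moved per second in MByte/s (10^6 bytes).

    All three inputs are hypothetical names for this sketch:
    bytes_per_iteration is the data touched by one kernel iteration,
    iterations is how often the kernel ran, runtime_s is the wall time.
    """
    if runtime_s <= 0:
        raise ValueError("runtime_s must be positive")
    total_bytes = bytes_per_iteration * iterations
    return total_bytes / runtime_s / 1.0e6


if __name__ == "__main__":
    # Illustrative numbers only, not measured values.
    print(bandwidth_mbytes_per_s(16, 10**9, 2.0))
```

The same arithmetic applies whether the runtime comes from a timer or the data volume comes from hardware counters; only the source of the inputs changes.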

Options

Option Description
-h Print help message
-a List all available benchmarks
-p List all available thread affinity domains
-i <iters> Use <iters> iterations of the benchmark kernel
-d <delim> Use <delim> instead of ',' for the output of -p
-l <test> List characteristics of <test> like number of streams, data used per loop iteration, ...
-t <test> Perform assembly benchmark <test>
-s <min_time> Minimum time in seconds to run the benchmark.
The iteration count is determined automatically from this time to provide reliable results. Default is 1. If the determined iteration count is below 10, it is raised to 10.
-w <workgroup> Set a workgroup for the benchmark. A workgroup can have different formats:
Format Description
<affinity_domain>:<size> Allocate <size> in total in affinity domain <affinity_domain>.
likwid-bench starts as many threads as are available in affinity domain <affinity_domain>.
<affinity_domain>:<size>:<num_threads> Allocate <size> in total in affinity domain <affinity_domain>.
likwid-bench starts <num_threads> threads in affinity domain <affinity_domain>.
<affinity_domain>:<size>:<num_threads>:<chunk_size>:<stride> Allocate <size> in total in affinity domain <affinity_domain>.
likwid-bench starts <num_threads> threads in affinity domain <affinity_domain>, with <chunk_size> threads selected in a row and a distance of <stride>.
See CPU_expressions on the likwid-pin page for further information.
<above_formats>-<streamID>:<stream_domain> In combination with any of the formats above, the test streams (arrays, vectors) can be placed in different affinity domains than the threads.
This is done by appending a stream placement option -<streamID>:<stream_domain> for all streams of the test to the workgroup definition.
The stream with ID <streamID> is placed in affinity domain <stream_domain>.
The number of streams of a test can be determined with the -l <test> command-line option.
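The workgroup formats in the table above can be made concrete with a small parser. This is a minimal sketch, not code from likwid-bench: the Workgroup class and parse_workgroup function are hypothetical names, the sample domain names (S0, S1) and size (1GB) are illustrative, and the parser assumes that several stream placements are separated by commas after a single '-', which this page does not spell out.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Workgroup:
    """Fields of one -w argument, named after the placeholders in the table."""
    affinity_domain: str
    size: str                           # kept as text, e.g. "1GB"
    num_threads: Optional[int] = None   # None: all threads in the domain
    chunk_size: Optional[int] = None
    stride: Optional[int] = None
    stream_domains: Dict[int, str] = field(default_factory=dict)


def parse_workgroup(spec: str) -> Workgroup:
    """Split a workgroup string into its documented fields."""
    # Assumption: everything after the first '-' is stream placement.
    thread_part, _, stream_part = spec.partition("-")
    parts = thread_part.split(":")
    # The table lists formats with 2, 3 or 5 colon-separated fields.
    if len(parts) not in (2, 3, 5) or not all(parts):
        raise ValueError(f"unsupported workgroup format: {spec!r}")

    wg = Workgroup(affinity_domain=parts[0], size=parts[1])
    if len(parts) >= 3:
        wg.num_threads = int(parts[2])
    if len(parts) == 5:
        wg.chunk_size = int(parts[3])
        wg.stride = int(parts[4])

    if stream_part:
        # Assumption: placements look like "0:S1,1:S1".
        for item in stream_part.split(","):
            stream_id, domain = item.split(":")
            wg.stream_domains[int(stream_id)] = domain
    return wg


if __name__ == "__main__":
    print(parse_workgroup("S0:1GB"))
    print(parse_workgroup("S0:1GB:2:1:2-0:S1,1:S1"))
```

In the second call, two threads run in domain S0 with a chunk size of 1 and a stride of 2, while streams 0 and 1 are placed in domain S1. Keeping <size> as text avoids guessing which unit suffixes likwid-bench accepts.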

Examples
