CPROVER Benchmarking Toolkit

The benchmarking toolkit consists of three main components: patch set management, benchmark execution, and result evaluation. All steps are performed by subcommands of the main cpbm command. Type cpbm help to get a list of all subcommands.

Contact Information

Please report problems, bugs, questions, and suggestions to Michael Tautschnig.

Download

Download our benchmarking framework as a tarball or as a Debian/Ubuntu package. When using the .tar.gz archive, extract it and add the resulting cpbm-0.5 directory to your PATH environment variable. For the Debian/Ubuntu package, install it using dpkg -i cpbm_0.5-1_all.deb.
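For example (the tarball name is assumed to match the cpbm-0.5 directory it unpacks to; adjust paths as needed):

$ tar xzf cpbm-0.5.tar.gz
$ export PATH=$PATH:$PWD/cpbm-0.5

or, for the Debian/Ubuntu package:

$ sudo dpkg -i cpbm_0.5-1_all.deb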

Available Benchmarks

Benchmarks consist of two files: (1) an original source archive as obtained from the original authors (we provide local backup copies below), without modification, and (2) a corresponding patch set <NAME>.cprover-bm.tar.gz that contains all modifications, execution scripts, and other helpers.

Benchmark                             Original Source
Linux 2.6.19/DDVerify                 kernel.org (local copy)
Scratch                               Scratch (local copy)
Conflict Driven Learning              Leo's page (local copy)
Loopfrog: Lincoln Labs Corpora        Lincoln Labs Corpora (local copy)
Loopfrog: Verisec                     Verisec Suite (local copy)
NEC Labs: Atomicity Violations (1)    Chao Wang (local copy)
NEC Labs: Atomicity Violations (2)    Chao Wang (local copy)
LLBMC VSTTE'12 Benchmarks             llbmc.org (local copy)

For each benchmark, the corresponding patch set <NAME>.cprover-bm.tar.gz is available for download alongside the original source.

Benchmark Execution Example 1: DDVerify

Download the patch set.
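For example, assuming the patch set is published at an HTTP URL (the URL below is purely illustrative):

$ wget http://www.example.org/linux-2.6.19-ddverify.cprover-bm.tar.gz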

Download and unpack the kernel sources and apply the patch set

$ cpbm unpack \
  http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.19.tar.bz2 \
  linux-2.6.19-ddverify.cprover-bm.tar.gz

Enter the new directory

$ cd linux-2.6.19-ddverify

Run the default benchmark configuration and produce a LaTeX table

$ cprover/rules -j4 table

Produce an additional GNUplot script and display the graph

$ cprover/rules graph | gnuplot

Build a web page providing an overview table and all log files in cprover/results.satabs.dfl.web/:

$ cprover/rules web

Benchmark Execution Example 2

A screencast video walks through the complete workflow using examples from the SAMATE Reference Dataset.


Patch Set Management

As described above, benchmarks consist of (1) an original source archive, without modification and (2) a corresponding patch set <NAME>.cprover-bm.tar.gz that contains all modifications, execution scripts, and other helpers.
This design is inspired by Debian v3 source packages and may thus sound familiar to people who are used to working with Debian packages.

(1) The original source archive must be a .zip, .tar.gz, .tgz, .tar.bz2, or .tar archive. If the authors of the software to be benchmarked do not offer such an archive, it should be built manually from whatever the authors provide as source.
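For instance, a source tree obtained some other way (a checkout or a plain directory of files) can be packed manually; the directory name below is illustrative:

$ tar czf foo-1.0.tar.gz foo-1.0/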

(2) The patch set <NAME>.cprover-bm.tar.gz is created and managed by cpbm update as described below. The archive ships a cprover/ directory that may contain arbitrary files, but at least a Makefile cprover/rules must be provided. The patch set management scripts populate a cprover/patches directory that precisely describes all modifications to the original sources.
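A patch set archive therefore unpacks to a layout such as the following sketch (only cprover/rules and the cprover/patches directory are fixed; the patch file name is illustrative):

cprover/
  rules                              Makefile driving build, verification, and evaluation
  patches/
    add-verification-harness.patch   recorded by cpbm update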

The basic design of this archive solution is completely independent of the software to be benchmarked. Its main purpose is the precise description of changes to the original source that were made in order to benchmark some tool.

Patch set management commands

The patch set is managed with three cpbm subcommands: cpbm init creates the initial patch set for a benchmark directory, cpbm update records all modifications made to the sources, and cpbm unpack downloads and unpacks the original sources and applies a patch set.

Usage Example

Creating a new benchmark suite, e.g., for the Linux kernel:

Download the original sources from kernel.org
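e.g., using wget (any download tool will do):

$ wget http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.19.tar.bz2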

Manually unpack them:

$ tar xjf linux-2.6.19.tar.bz2

Rename the directory to obtain the benchmark name "linux-2.6.19-foo"

$ mv linux-2.6.19 linux-2.6.19-foo
$ cd linux-2.6.19-foo

Create the basic patch set linux-2.6.19-foo.cprover-bm.tar.gz

$ cpbm init ../linux-2.6.19.tar.bz2

Edit some source files ... and record patches

$ cpbm update ../linux-2.6.19.tar.bz2

To make the benchmark source and all patches available for others to use, publish linux-2.6.19-foo.cprover-bm.tar.gz together with the original source archive or its URL.

Using the benchmark suite:
$ cpbm unpack \
  http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.19.tar.bz2 \
  linux-2.6.19-foo.cprover-bm.tar.gz
$ cd linux-2.6.19-foo

Benchmark Execution

cpbm init, as described above, creates the cprover/rules file (a Makefile) as a template. For most basic use cases, filling in the names of the benchmarks to be run (e.g., all C files without the .c suffix) in the BENCHMARKS = XXX line and choosing a suitable default configuration in the CONFIG = XXX line (see the examples in the file) suffices to produce a working benchmark package; a sketch of these two lines is shown below.
The file may, however, be fully customized as deemed necessary. The only assumption made by cpbm is that cprover/rules clean performs a cleanup. Remember to run cpbm update before distributing the .cprover-bm.tar.gz file.
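For illustration, the two lines to edit might end up as follows (a sketch only; the benchmark selection is an assumption, and satabs.dfl is the configuration name from the DDVerify example above):

# all C files in the benchmark directory, without the .c suffix
BENCHMARKS = $(patsubst %.c,%,$(wildcard *.c))
# default configuration used when no CONFIG is given on the command line
CONFIG = satabs.dfl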

The actual benchmark execution is then triggered by

$ cprover/rules -j4 verify
to perform verification of all benchmark instances in 4 parallel threads of execution with the default configuration. If necessary, this step first builds each benchmark from source.

To choose a different configuration for the verification run, override the CONFIG variable. For instance, to perform verification using CPAchecker's explicit analysis, use

$ cprover/rules -j4 verify CONFIG=cpachecker.explicit

Benchmark Execution Helpers

Benchmark execution makes use of a number of helpers, most notably cpbm run, which produces the per-instance log files evaluated in the next section.

Result Evaluation

The names of all log files produced by cpbm run will be listed in cprover/verified.$CONFIG. Performing

$ cprover/rules csv

yields a CSV (comma-separated values) summary of all results in cprover/results.$CONFIG.csv. This file may be opened with most spreadsheet programs for further manual inspection and evaluation. cpbm can, however, also produce LaTeX tables, GNUplot graphs, and web pages:

$ cprover/rules table

yields a LaTeX summary of execution times and memory usage of all benchmark instances. With

$ cprover/rules graph

LaTeX/TikZ code is additionally produced, which may be included in publications or used verbatim to produce PDF files.

$ cprover/rules web

collects all log files in a new directory cprover/results.$CONFIG.web, together with an index.html file that contains an HTML-formatted version of the CSV table. Each benchmark entry links to the respective log file.
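The generated overview can be viewed in any web browser, e.g. (assuming xdg-open is available; substitute the actual configuration name for $CONFIG):

$ xdg-open cprover/results.$CONFIG.web/index.html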

These steps permit a number of customizations: