The distributed benchmarking system of Tactician.
- The repository and commit hash of a particular version of Tactician (or one of it's model plugins).
- A set of Opam packages to benchmark on that particular commit.
- Parameters governing which lemmas to benchmark and how long.
- A way to allocate resources either locally, remotely through ssh, or through a HPC cluster.
- Every requested lemma will be benchmarked, and the results will be uploaded to the 'benchmark-data' repository of the coq-tactician Github organization.
This software is designed to be run on a compute cluster (possibly consisting of one machine). We make some assumptions over this cluster:
- In order to install the software, the Opam package manager version 2.x needs to be available (you can check
the version by running
opam --version
). - During operation, a working version of Bubblewrap is required. You can check that Bubblewrap functions properly
by running
bwrap --dev-bind / / echo "bubblewrap is working!"
. - Additional required software is Git, Rsync and Bash.
- It is assumed that the cluster is configured uniformly. That is:
- This software should be installed on all nodes with the same version and in the same location. The easiest way to ensure this is using a shared filesystem. The alternative is to carefully install the software manually and keep it consistent.
- A storage location for temporary files is required that has the same path on every node. By default it is
assumed that this location is not on a shared filesystem. This assumption is governed by the
-shared-filesystem
flag. - Nodes should be connected to a shared network and be able to communicate with each other over TCP.
- The system needs the ability to spawn processes on other remote nodes, either through SSH or through commands provided by a HPC cluster orchestrator.
Make sure that you have an Opam switch available. Then clone this repository, and from the root of the repo run
opam install .
For details, please read the output of tactician-benchmark -help
. This ouput is also replicated below.
- Running a benchmark of coq-tactician-stdlib on a single computer with a time limit of 40 seconds.
A maximum of 16 cores are used for benchmarking (as ensured by
-maxrequests 16
and-max-running 16
). Additionally, the flag-delay-benchmark
ensures that benchmarking only starts after the initial compilation of Opam packages has finished. This ensures hat the CPU cycles taken by the initial compilation does not interfere with the benchmarking.tactician-benchmark \ -benchmark-data ./benchmark-data/ \ -compile-allocator ./benchmark-system/local/compile_allocator \ -bench-allocator ./benchmark-system/local/bench_allocator \ -max-requests 16 \ -max-running 16 \ -delay-benchmark \ -benchmark-target coq-tactician \ -benchmark-repo git+ssh://git@github.com/coq-tactician/coq-tactician \ -benchmark-commit f267afe31ab82b071d3f03d08bc4a6595c30b1d2 \ -benchmark-time 40 \ coq-tactician-stdlib
- Running a benchmark of coq-tactician-stdlib on a SLURM cluster with a time limit of 40 seconds. A maximum of
28 resource requests are allowed to be submitted to the SLURM queue at the same time. There is no limit on
the number of simultaneous allocated resources.
tactician-benchmark \ -benchmark-data ./benchmark-data/ \ -compile-allocator ./benchmark-system/slurm/compile_allocator \ -bench-allocator ./benchmark-system/slurm/bench_allocator \ -max-requests 28 \ -benchmark-target coq-tactician \ -benchmark-repo git+ssh://git@github.com/coq-tactician/coq-tactician \ -benchmark-commit f267afe31ab82b071d3f03d08bc4a6595c30b1d2 \ -benchmark-time 40 \ coq-tactician-stdlib
This has been copied from the output of tactician-benchmark -help
:
tactician-benchmark PACKAGE [PACKAGE ...]
-bench-allocator executable An executable file that allocates computational
resources for benchmarking. This file may be
executed many times, as governed by the flags
'-max-requests' and '-max-running'. When
executed, this file should request the needed
resources to run one or more Coq processes that
benchmark lemmas for a given amount of time.
Similar to '-compile-allocator', these
resources can be allocated local, remote
through ssh, or through a HPC cluster
scheduler. When the resources have been
allocated, the executable should first output a
line to stdout indicating the number of minutes
the resources can be used. The benchmarking
system may not be able to relinquish the
resources exactly within that time. A safety
margin of ~15 minutes is recommended. Then the
executable should output the prefixes of
commands that can be executed in order to
access the allocated resources (see
'-compile-allocator' for examples). A single
outputted prefix will be used to run a single
Coq benchmarking process. This series of
prefixes should be followed by the string
'done'. After the bencharking system is
finished with the resources it will print
'done' to the stdin of the executable. At that
point the executable should relinquish the
allocated resources and exit.
-benchmark-commit hash The hash of the commit that should be
benchmarked.
-benchmark-data dir Location of the benchmark-data storage
repository.
-benchmark-repo URI The repository where the benchmark target can
be found. Accepts any URI that is accepted by
Opam.
-benchmark-target string Name of the Opam package containing the model
that should be benchmarked. This can be any
package that depends on coq-tactician.
-benchmark-time int Time limit in seconds for synthesizing an
individual lemma.
-compile-allocator executable An executable file that allocates computational
resources for the initial compilation of Opam
packages. When executed, this file should
request the needed resources, which can come
either from the local machine, from a remote
machine through ssh, or from a HPC cluster
scheduler. When the resources have been
allocated, the executable should output the
prefix of a command that can be executed in
order to access the allocated resources on
stdout.
Example prefixes:
- For local resources, this prefix would be the
empty string
- For a remote node accessed through ssh, the
prefix would be 'ssh remote-node'
- For a SLURM cluster, where a job id has been
allocated, the prefix would be 'srun --ntasks=1
--jobid $SLURM_JOB_ID'
After benchmarking system is finished with the
resources, it will print 'done' to the stdin of
the executable. At that point, the executable
should relinquish the allocated resources and
exit.
-max-requests int The maximum number of benchmark resource
allocations that may be running at the same
time. Related to '-bench-allocator'. This
governs how much load may be applied to
queueing mechanism of the underlying HPC
cluster. The higher this number, the more
resources will be requested at the same time.
When this number is low, the benchmarking
system may be starved of resources on busy
clusters. When this number is high, this may
flood the underlying queue, which may annoy
other users of the cluster. NOTE: This flag
does not limit the total number of resources
allocated. For that, see '-max-running'.
[-build-dir dir] Location of the build. This directory will not
be cleaned up after the benchmark finishes.
Mutually exclusive with -tmp-dir.
[-debug] Show additional debug info on stderr.
[-delay-benchmark] Delay the benchmark until the initial build is
fully complete. Useful when the build process
may interfere with the benchmark timings.
[-exclude file] A file containing a line-separated list of
lemma names that should not benchmarked. Thos
lemmas are ignored.
[-exclude-regexp regexp] A regular expression matching lemmas. Only
lemmas not matching the expression are
benchmarked.
[-include file] A file containing a line-separated list of
lemma names that should be benchmarked. Other
lemmas
are ignored.
[-include-regexp regexp] A regular expression matching lemmas. Only
lemmas matching the expression are benchmarked.
[-inject vernacular] ... Inject Coq vernacular into the compilation and
benchmarking process. Typically used to specify
options. Can be repeated multiple times and
combined with -inject-file.
[-inject-file file] ... Inject a file containing Coq vernacular into
the compilation and benchmarking process.
Typically used to specify options. Can be
repeated multiple times and combined with
-inject-file.
[-max-running int] The maximum number of benchmark resources
requests that can be requested and running at
the same time. See also '-max-requests'.
WARNING: This number is infinite by default,
which is likely not appropriate for local
resources.
[-resume] Look up the previous (partial) benchmark with
the same parameters as specified in the current
invocation, and finish that benchmark appending
old the existing log files.
[-shared-filesystem] Assume that the scratch directory is hosted on
a shared filesystem. No copying of files
between hosts is performed.
[-tmp-dir dir] Location in which a temporary directory will be
created to store the build. If not supplied, it
is taken from $TMPDIR. After the benchmark is
finished, the directory is cleaned up. Mutually
exclusive with -build-dir.
[-build-info] print info about this build and exit
[-version] print the version of this build and exit
[-help] print this help text and exit
(alias: -?)