The experiments are mainly run on a CloudLab c6420 physical server (amd64, see detailed specs in CloudLab documentation), with an Ubuntu 22.04 image.
Other Debian-based distributions and virtual machines should also work but are less tested. Containers, other Linux distributions, other operating systems, and other architectures are not guaranteed to work.
Warning: sudo is used in setup and test scripts. It is recommended to use one-off machines like CloudLab or virtual machines.
wget 'https://github.com/xlab-uiuc/DebCovDiff/blob/main/diff/scripts/setup.sh?raw=true' -O- | bash
As prompted, log out of the current shell and log back in, to make sure you are in the sbuild group (check via id -nG | grep sbuild).
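The group check above can also be scripted. The following is a minimal sketch, not part of the repository: the helper names in_group and current_user_in_group are hypothetical. It splits the output of id -nG into whole names, so a group such as sbuild-extra is not mistaken for sbuild the way a plain substring grep would.

```python
import subprocess


def in_group(id_output, group="sbuild"):
    """Return True if `group` appears as a whole word in `id -nG` output."""
    return group in id_output.split()


def current_user_in_group(group="sbuild"):
    """Run `id -nG` for the current user and check membership."""
    out = subprocess.run(
        ["id", "-nG"], capture_output=True, text=True, check=True
    ).stdout
    return in_group(out, group)
```

Calling current_user_in_group() after logging back in should return True once the setup script has added the user to the group.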
This script notably does the following:
- Build GCC (c1fb78fb + changes) and LLVM (ce9a2c65 + changes).
- Run sbuild-createchroot for each toolchain, which (1) sets up an isolated chroot and (2) runs debootstrap in it. This step creates the environment for general Debian package builds. For reproducibility, all Debian binaries and source files are pinned to snapshots in December 2024 (Debian 12.8) via https://snapshot.debian.org/ instead of the regular http://deb.debian.org/debian.
- Copy the custom toolchains (built in the first step) into the chroots and rewire GCC invocations to the desired toolchain with appropriate flags, via hook scripts embedded in configure-all-chroot.sh.
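To illustrate what pinning to a snapshot means, here is a hedged sketch that builds an APT source line pointing at snapshot.debian.org. The function name snapshot_source_line and the timestamp used in the example are assumptions for illustration; the exact December 2024 timestamp used by the setup script is not stated here. The suite name bookworm corresponds to Debian 12.

```python
from datetime import datetime, timezone


def snapshot_source_line(when, suite="bookworm", components=("main",)):
    """Build a sources.list line for the Debian archive as of `when` (UTC)."""
    ts = when.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    url = f"https://snapshot.debian.org/archive/debian/{ts}/"
    # Snapshot Release files are expired by design, so validity checks are
    # usually disabled for such sources.
    return f"deb [check-valid-until=no] {url} {suite} {' '.join(components)}"


# Hypothetical timestamp, chosen only to show the URL shape.
print(snapshot_source_line(datetime(2024, 12, 1, tzinfo=timezone.utc)))
```

Because every build resolves packages against the same frozen archive state, repeated runs see identical binaries and sources.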
AUTO_TESTS=1 \
ALL_METRICS=1 \
LOG_LEVEL=warning \
SHOW_SOURCE=1 \
START_WITH="download_source" \
$REPO_DIR/diff/scripts/debian-diff.sh procps
Options:
- procps: Debian source package (not binary package) name
- LOG_LEVEL=<level>: one of error, warning, info, debug
- SHOW_SOURCE=1: show the inconsistent source code snippet
- ALL_METRICS=1: warn of inconsistencies for all metrics. Otherwise this is configurable in inconsistency.py.
- START_WITH=<mode>
  - "download_source": run everything (starting with pulling the source from Debian) in a new directory /var/lib/sbuild/build/<package>-<toolchain>-<new_id>
  - "test": if the package has been run before, skip downloading and building the source, and directly run tests, generate coverage reports and perform differential testing. Everything happens in the old directory /var/lib/sbuild/build/<package>-<toolchain>-<old_id>
  - "diff": if the package has been run before, skip everything but the final differential testing step, based on the existing coverage reports under /var/lib/sbuild/build/<package>-<toolchain>-<old_id>
- AUTO_TESTS=1: measure coverage of dh_auto_test if available. Otherwise invoke simple commands as specified in ./debian/scripts/chroot/.
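When driving many single-package runs from a script, it can help to validate the options before launching. The sketch below is an assumption-laden helper, not part of the repository: build_invocation is a hypothetical name, and it assumes that leaving ALL_METRICS or SHOW_SOURCE unset disables them, since only the =1 form is documented above.

```python
import os

LOG_LEVELS = {"error", "warning", "info", "debug"}
START_MODES = {"download_source", "test", "diff"}


def build_invocation(repo_dir, package, *, auto_tests=True, all_metrics=True,
                     show_source=True, log_level="warning",
                     start_with="download_source"):
    """Return (env, argv) for a single debian-diff.sh run."""
    if log_level not in LOG_LEVELS:
        raise ValueError(f"unknown LOG_LEVEL: {log_level!r}")
    if start_with not in START_MODES:
        raise ValueError(f"unknown START_WITH: {start_with!r}")
    env = {
        "AUTO_TESTS": "1" if auto_tests else "0",
        "LOG_LEVEL": log_level,
        "START_WITH": start_with,
    }
    # Assumption: an unset flag means "off"; only the "=1" form is documented.
    if all_metrics:
        env["ALL_METRICS"] = "1"
    if show_source:
        env["SHOW_SOURCE"] = "1"
    script = os.path.join(repo_dir, "diff", "scripts", "debian-diff.sh")
    return env, [script, package]


env, argv = build_invocation("/path/to/repo", "procps")
print(env, argv)
```

The result can be passed to subprocess.run(argv, env={**os.environ, **env}, check=True) to start the run with the current environment plus these settings.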
Back up and clean build directory before moving on to next sections
cp -r /var/lib/sbuild/build/ /var/lib/sbuild/build-individual-packages
rm -rf /var/lib/sbuild/build/*
export AUTO_TESTS=1
$REPO_DIR/diff/scripts/debian-batch.sh
Back up and clean build directory
mv $(ls -dt /var/lib/sbuild/build-* | head -2 | tail -1) /var/lib/sbuild/build-ET
rm -rf /var/lib/sbuild/build/*
export AUTO_TESTS=0
$REPO_DIR/diff/scripts/debian-batch.sh
Back up and clean build directory
mv $(ls -dt /var/lib/sbuild/build-* | head -2 | tail -1) /var/lib/sbuild/build-SC
rm -rf /var/lib/sbuild/build/*
The results (with one package or all packages) are generated under /var/lib/sbuild/build* directories:
- build: the most recent run, where all actual builds happen
- build-<date>-<random>: a copy of build after a batch run
- build-{individual-packages,ET,SC}: copies renamed to more recognizable names in previous sections.
These directories have the following structure:
For each package there are two or four directories
<package>-gcc-1 // build and measure coverage using GCC
<package>-clang-1 // build and measure coverage using Clang/LLVM
// (The below two only exist for batch runs)
<package>-log // various logs
<package> // historic inconsistencies
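The naming scheme above can be used to pair up the GCC and Clang build directories of each package. This is a minimal sketch under the assumption that directory names follow <package>-<toolchain>-<id> exactly as listed; group_build_dirs is a hypothetical helper, not part of the tool.

```python
import re

# <package>-gcc-<id> or <package>-clang-<id>; package names may contain hyphens.
_DIR_RE = re.compile(r"^(?P<package>.+)-(?P<toolchain>gcc|clang)-(?P<run>\d+)$")


def group_build_dirs(names):
    """Group build directory names by package, then by toolchain."""
    grouped = {}
    for name in sorted(names):
        match = _DIR_RE.match(name)
        if match is None:
            continue  # e.g. "<package>-log" or the bare "<package>" directory
        per_package = grouped.setdefault(match["package"], {})
        per_package.setdefault(match["toolchain"], []).append(name)
    return grouped


print(group_build_dirs(["grep-gcc-1", "grep-clang-1", "grep-log", "grep"]))
```

In practice the input would be os.listdir("/var/lib/sbuild/build-SC") or a similar results directory.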
Take package grep for example (+ means files created by our tool in addition to a regular Debian package build):
/var/lib/sbuild/build-SC/grep-clang-1/
├── grep-3.8/
│ ├── Makefile, configure, ... // The upstream source code, e.g. by GNU developers
│ ├── debian/ // Configuration, patches etc by downstream Debian developers
│ ├── dh_auto_test.log // + Log of running dh_auto_test
│ └── llvm-cov-profraw/ // + *.profraw files generated during test
│
├── llvm-cov-executables.txt // + List of instrumented executables and libraries,
│ // which is passed as argument to llvm-cov
├── default.profdata // + Merged and indexed result of all *.profraw files
├── default.json // + JSON coverage report
├── default.lcov.txt // + LCOV coverage report
├── text-coverage-report/ // + Text coverage report organized by source structure
├── text-coverage-report.txt // + Text coverage report concatenated in one file
│
├── // Various Debian build artifacts
├── grep_3.8-5.debian.tar.xz
├── grep_3.8-5.dsc
├── grep_3.8-5_amd64.buildinfo
├── grep_3.8-5_amd64.changes
├── grep_3.8-5_amd64.deb
└── ...
Because gcov and llvm-cov work differently, the coverage-related files are found in different places for GCC.
/var/lib/sbuild/build-SC/grep-gcc-1/
├── grep-3.8/
│ ├── Makefile, configure, ... // The upstream source code, e.g. by GNU developers
│ ├── src/ // The upstream source code, e.g. by GNU developers
│ │ ├── grep.c // The upstream source code, e.g. by GNU developers
│ │ ├── grep.o // Build files
│ │ ├── grep.{gcda,gcno} // + gcov note and data files that spread across the
│ │ │ // whole build directory, usually found next to
│ │ │ // the corresponding *.o file
│ │ ├── lib/ // The upstream source code, e.g. by GNU developers
│ │ │ ├── fcntl.c // The upstream source code, e.g. by GNU developers
│ │ │ ├── libgreputils_a-fcntl.o // Build files
│ │ │ └── libgreputils_a-fcntl.{gcda,gcno} // + Another example of gcov note and data files
│ │ └── ...
│ ├── debian/ // Configuration, patches etc by downstream Debian developers
│ ├── dh_auto_test.log // + Log of running dh_auto_test
│ │
│ ├── grep.c.gcov // + Text coverage report
│ ├── grep.c.gcov.json // + JSON coverage report
│ │
│ ├── fcntl.c.gcov // + Another example of coverage reports.
│ ├── libgreputils_a-fcntl.gcov.json // Note they are no longer reflecting
│ │ // the source structure
│ └── ...
│
├── // Various Debian build artifacts
├── grep_3.8-5.debian.tar.xz
├── grep_3.8-5.dsc
├── grep_3.8-5_amd64.buildinfo
├── grep_3.8-5_amd64.changes
├── grep_3.8-5_amd64.deb
└── ...
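Since the gcov note and data files are scattered across the GCC build tree, a recursive search is the simplest way to locate them. The sketch below uses a hypothetical helper, find_gcov_data, and relies on the general gcov convention that a .gcno file is written at compile time and the matching .gcda file only appears once the instrumented code has run.

```python
from pathlib import Path


def find_gcov_data(build_dir):
    """Map each *.gcno file to its sibling *.gcda file, or None if absent."""
    result = {}
    for gcno in sorted(Path(build_dir).rglob("*.gcno")):
        gcda = gcno.with_suffix(".gcda")
        # A missing .gcda means the object was compiled but never executed.
        result[gcno] = gcda if gcda.exists() else None
    return result
```

For example, find_gcov_data("/var/lib/sbuild/build-SC/grep-gcc-1/grep-3.8") would list the pairs shown in the tree above.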
/var/lib/sbuild/build-SC/grep-log/
├── 1.clang_build_log.txt
├── 1.gcc_build_log.txt
├── 1.compared.csv // Total compared lines, branches and MC/DC decisions
├── 1.inconsistent.csv // Inconsistent lines, branches and MC/DC decisions
├── 1.diff_log.txt // Verbose log of inconsistencies
├── ...
├── T.clang_build_log.txt // T means repeated runs
├── T.gcc_build_log.txt
├── T.compared.csv
├── T.inconsistent.csv
└── T.diff_log.txt
config.json
.
Bug report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120321
Example occurrence in Debian packages:
-
Code location:
/var/lib/sbuild/build-SC/gzip-gcc-1/gzip-1.12/gzip.c:467
-
Coverage report location:
/var/lib/sbuild/build-SC/gzip-gcc-1/gzip-1.12/builddir/gzip.c.gcov:507
       10:  464:    z_suffix = Z_SUFFIX;
       10:  465:    z_len = strlen(z_suffix);
        -:  466:
        6:  467:    while (true) {
        -:  468:        int optc;
       16:  469:        int longind = -1;
        -:  470:
       16:  471:        if (env_argv)
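To inspect such a report programmatically, each line of gcov's text output can be split into its execution count, line number and source text. This is a sketch assuming the default gcov text format (without human-readable counts); parse_gcov_line is a hypothetical helper and not the tool's own parser.

```python
def parse_gcov_line(line):
    """Split one gcov text line into (line_number, count, source).

    count is None for non-executable lines ("-") and 0 for lines gcov
    marks as never executed ("#####" or "=====").
    """
    count_field, lineno_field, source = line.split(":", 2)
    count_field = count_field.strip()
    if count_field == "-":
        count = None
    elif count_field in ("#####", "====="):
        count = 0
    else:
        # A trailing "*" marks a line with unexecuted basic blocks.
        count = int(count_field.rstrip("*"))
    return int(lineno_field), count, source


print(parse_gcov_line("        6:  467:    while (true) {"))
```

In the excerpt above, line 467 is reported with a count of 6 while its neighbours report 10 and 16, which is the kind of mismatch the bug report describes.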
Bug report: llvm/llvm-project#131505
Example occurrence in Debian packages:
-
Code location:
/var/lib/sbuild/build-SC/hostname-clang-1/hostname-3.23+nmu1/hostname.c:312
-
Coverage report location:
/var/lib/sbuild/build-SC/hostname-clang-1/text-coverage-report/coverage/build/hostname-clang-1/hostname-3.23+nmu1/hostname.c.txt:611
  311|      0|            sin6 = (struct sockaddr_in6 *)ifap->ifa_addr;
  312|      0|            if (IN6_IS_ADDR_LINKLOCAL(&sin6->sin6_addr) ||
  -------------------------------
  |  Branch (312:10): [True: 0, False: 0]
  -------------------------------
  313|      0|                IN6_IS_ADDR_MC_LINKLOCAL(&sin6->sin6_addr))
  -------------------------------
  |  Branch (313:8): [True: 18446744073709551615, False: 1]
  -------------------------------
  |---> MC/DC Decision Region (312:10) to (313:32)
  |
  |  Number of Conditions: 2
  |     Condition C1 --> (312:10)
  |     Condition C2 --> (313:8)
  |
  |  Executed MC/DC Test Vectors:
  |
  |     None.
  |
  |  C1-Pair: not covered
  |  C2-Pair: not covered
  |  MC/DC Coverage for Decision: 0.00%
  |
  -------------------------------
  314|      0|                continue;
  315|      0|            }
(We patched LLVM in our experiments for debugging purposes, so that it expands "18.4E" into full digits)
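The reported True count is exactly the largest unsigned 64-bit value, roughly 18.4 × 10^18, which is why the unpatched output abbreviates it as "18.4E". A short check of that arithmetic follows; reading the value as a wrapped -1 is our interpretation for illustration, and the helper as_signed64 is hypothetical.

```python
REPORTED = 18446744073709551615


def as_signed64(value):
    """Reinterpret an unsigned 64-bit value as a two's-complement signed one."""
    return value - 2**64 if value >= 2**63 else value


print(REPORTED == 2**64 - 1)   # the count equals UINT64_MAX
print(as_signed64(REPORTED))   # the same bit pattern read as a signed integer
print((0 - 1) % 2**64)         # 0 - 1 with 64-bit wraparound gives the same value
```

A count this large on a line whose execution count is 0 cannot be a genuine number of branch executions.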
Build Csmith
pushd /tmp/
git clone https://github.com/csmith-project/csmith.git
cd csmith
git checkout 0ec6f1bad2df865beadf13c6e97ec6505887b0a5
cmake -D CMAKE_C_COMPILER=/usr/bin/gcc -D CMAKE_CXX_COMPILER=/usr/bin/g++ .
make -j10
sudo make -j10 install
popd
cd $REPO_DIR/csmith
For each Csmith configuration (default, --inline-function, and --lang-cpp),
- Generate (~21min) and check (~3min) 1,000 programs
python gen.py --first-1k --nproc=40
bash check-1k.sh
Expected output:
gcc117412 0/1000 0/1000 0/1000
gcc117415 0/1000 0/1000 0/1000
gcc120319 0/1000 0/1000 0/1000
gcc120321 0/1000 0/1000 0/1000
gcc120332 769/1000 765/1000 772/1000
gcc120348 0/1000 0/1000 0/1000
gcc120478 0/1000 0/1000 0/1000
gcc120482 0/1000 0/1000 0/1000
gcc120484 841/1000 841/1000 855/1000
gcc120486 0/1000 0/1000 0/1000
gcc120489 798/1000 803/1000 818/1000
gcc120490 0/1000 0/1000 0/1000
gcc120491 0/1000 0/1000 0/1000
gcc120492 0/1000 0/1000 0/1000
llvm105341 0/1000 0/1000 0/1000
llvm114622 1000/1000 1000/1000 1000/1000
llvm116884 0/1000 0/1000 0/1000
llvm140427 0/1000 0/1000 0/1000
- Generate (~9.5h) and check (~5h) 100,000 programs
python gen.py --nproc=40
bash check-100k.sh
Expected output:
gcc117412 0/100000 0/100000 0/100000
gcc117415 0/100000 0/100000 0/100000
gcc120319 0/100000 0/100000 0/100000
gcc120321 0/100000 0/100000 0/100000
gcc120332 75674/100000 75534/100000 76193/100000
gcc120348 0/100000 0/100000 0/100000
gcc120478 0/100000 0/100000 0/100000
gcc120482 0/100000 0/100000 0/100000
gcc120484 82831/100000 82838/100000 83121/100000
gcc120486 0/100000 0/100000 0/100000
gcc120489 79105/100000 79001/100000 79310/100000
gcc120490 0/100000 0/100000 0/100000
gcc120491 0/100000 0/100000 0/100000
gcc120492 0/100000 0/100000 0/100000
llvm105341 0/100000 0/100000 0/100000
llvm114622 100000/100000 100000/100000 100000/100000
llvm116884 0/100000 0/100000 0/100000
llvm140427 0/100000 0/100000 0/100000
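To compare a local run against the expected output, each row can be parsed into a bug identifier and its hit counts. The sketch below assumes one row per line with whitespace-separated hits/total cells; parse_row and triggered_bugs are hypothetical helper names, and the meaning of the three columns is not spelled out here, so the code treats them uniformly.

```python
def parse_row(row):
    """Parse 'gcc120332 769/1000 765/1000 772/1000' into (name, [(hits, total), ...])."""
    name, *cells = row.split()
    counts = []
    for cell in cells:
        hits, total = cell.split("/")
        counts.append((int(hits), int(total)))
    return name, counts


def triggered_bugs(rows):
    """Return the names of rows where at least one column has a nonzero hit count."""
    names = []
    for row in rows:
        name, counts = parse_row(row)
        if any(hits > 0 for hits, _ in counts):
            names.append(name)
    return names


sample = [
    "gcc120321 0/1000 0/1000 0/1000",
    "gcc120332 769/1000 765/1000 772/1000",
    "llvm114622 1000/1000 1000/1000 1000/1000",
]
print(triggered_bugs(sample))
```

Small deviations in the nonzero counts between runs may be worth inspecting by hand rather than treating as failures.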
cd $REPO_DIR/bug-ages
docker build -t old-compilers-env .
docker run -it --rm -v $PWD:/usr/src/app old-compilers-env
Inside the container
bash bugs/setup-links.sh
gcc --version
clang --version
bash bugs/run.sh |& tee log.txt
grep 'NOT REPRODUCING' log.txt
grep ' OK' log.txt
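The two grep commands above can be combined into a single summary. This is a minimal sketch that mirrors the same substring matches; summarize_log is a hypothetical helper, and the sample lines used below are invented for illustration because the actual format of log.txt is not shown here.

```python
def summarize_log(text):
    """Collect the lines that the two grep commands would print."""
    lines = text.splitlines()
    return {
        "not_reproducing": [ln for ln in lines if "NOT REPRODUCING" in ln],
        "ok": [ln for ln in lines if " OK" in ln],
    }


# Invented sample lines; real log lines may look different.
sample_log = "bug-a OK\nbug-b NOT REPRODUCING\nsome other output\n"
summary = summarize_log(sample_log)
print(len(summary["ok"]), "OK,", len(summary["not_reproducing"]), "not reproducing")
```

With a real run, the input would be the contents of log.txt written by the tee command.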
cd $REPO_DIR/tables-and-figures/scripts
python run.py