Skip to content

Supplemental material for reproducing the results reported our paper about engineering Yannakakis in column stores.

Notifications You must be signed in to change notification settings

UHasselt-DSI-Data-Systems-Lab/code-reproducability-yannakakis

Repository files navigation

Reproducability Package - Yannakakis in Column Stores

This repository contains supplemental material for reproducing the results reported our paper Instance-Optimal Acyclic Join Processing Without Regret: Engineering the Yannakakis Algorithm in Column Stores. Full version of the paper can be found here.

The repository is organized as follows:

Do not forget to clone submodules well when cloning this repository. You can do this by adding the --recursive flag to the git clone command:

git clone --recursive <URL>

Analyze Results

If you only want to analyze our results reported in the paper, you can inspect the results in the experiments folder. For the CE benchmark, the output folder was too large for GitHub, so we only uploaded a compressed version. You should extract it (in the same folder) before running the analysis notebook.

Run Experiments

Be aware that results might differ from machine to machine. We conducted all experiments on a Ubuntu 22.04.4 LTS machine with an Intel Core i7-11800 CPU and 32GB of RAM. Prerequisites for running the experiments are (between brackets are the versions we used):

  • Rust (1.74.1) with nightly toolchain
  • Python (3.10.12) & pip (22.0.2)
  • Make (4.3) & Ninja (1.10.1)

We provided some bash script to make it easier to run the experiments: setup.sh, clean.sh, and run_queries.sh. They should be executed from the root directory and also in the following order:

# cwd = root directory of this repository

# Install required pip packages
# (Create a new virtual environment if desired)
pip install -r requirements.txt

# Download datasets, build DuckDB, build Yannakakis implementation,...
chmod +x setup.sh
./setup.sh

# Remove all experiment results that are currently in the repo (query plans, csv files, ...)
chmod +x clean.sh
./clean.sh

# Run all benchmark queries (DuckDB-bin, Datafusion-bin, Datafusion-2NSA)
chmod +x run_queries.sh
./run_queries.sh

You can now run, for each benchmark, the notebook experiments/<benchmark>/analysis.ipynb to analyze the results.

About

Supplemental material for reproducing the results reported our paper about engineering Yannakakis in column stores.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages