A Deep Dive into Cost Estimation Algorithms for Optimised Query Execution in PostgreSQL
Please note that this project currently only focuses and is adapted on the TPC-H dataset.
- User-friendly interactive interface for ease of interaction
- Easy to understand and interpret visualisations for Query Execution Plans for your SQL queries
- See query outputs in real time with almost zero latency, directly brought to you by PostgreSQL
- Cross platform - compatible with all Windows, Mac and Linux users.
We express our gratitude to Associate Professor Sourav Saha Bhowmick for his invaluable guidance, support and encouragement throughout the duration of the course CZ4031: Database System Principles, offered by the School of Computer Science and Engieering at NTU.
We would also like to mention some resources we found really interesting in clearing our concepts about Query Execution Plans and cost estimation based mathematics in PostgreSQL
- PostgreSQL Documentation
- PostgreSQL Github Repository (Mirror)
- ARENA: Alternative Relational Query Plan Exploration for Database Education
Streamlining with the project's guidelines, we have decided to use a full Python-based framework. Below is a summarised list of main Python libraries used in developing it.
Client: Gradio, Plotly
Server: Psycopg, NumPy, Pydantic
Go to the project directory
cd src
Install dependencies (preferably in a virtual environment, using Anaconda or Venv)
conda create --name postgres-qep-venv python=3.10
conda activate postgres-qep-venv
pip install -r requirements.txt
If running via CMD(Administrator)
python3 -m venv ./venv
.\venv\Scripts\activate.bat
pip install -r requirements.txt
Start the server and launch the web application on a new browser tab
python project.py