This project implements the core ideas of Omni-RAG (Dong et al. 2025): a Retrieval-Augmented Generation (RAG) system for real-world, noisy, and multi-intent queries.
- LLM-based Query Understanding: Rewriting and decomposing user queries into sub-queries (see the sketch after this list)
- Intent-Aware Retrieval: Vector search for each sub-query over external knowledge (e.g., FineWeb)
- Reranking & Generation: A `BAAI/bge-base-en-v1.5` reranker and GPT-4.1-nano for the final answer
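A minimal sketch of the query-understanding step, assuming the openai Python SDK's structured-output helper and pydantic; the prompt, schema, and function names here are illustrative, not the repo's actual ones:

```python
from openai import OpenAI
from pydantic import BaseModel

class QueryPlan(BaseModel):
    """Structured decomposition of a noisy, multi-intent user query."""
    rewritten_query: str
    sub_queries: list[str]

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def understand_query(query: str) -> QueryPlan:
    # Ask the LLM to clean up the query and emit one focused
    # sub-query per intent. The system prompt is illustrative.
    completion = client.beta.chat.completions.parse(
        model="gpt-4.1-nano",
        messages=[
            {
                "role": "system",
                "content": "Rewrite the user query and decompose it "
                           "into independent sub-queries, one per intent.",
            },
            {"role": "user", "content": query},
        ],
        response_format=QueryPlan,
    )
    return completion.choices[0].message.parsed

plan = understand_query("capital of france?? also how do i install pgvector")
print(plan.sub_queries)
```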
```bash
# Clone repo and install dependencies
pip install -r requirements.txt

# Set up database and environment variables
cd docker
docker compose up -d
```
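The compose file lives in the repo's `docker/` directory; below is a minimal sketch of what such a file contains, assuming the `timescale/timescaledb-ha` image (which bundles TimescaleDB with pgvector and pgvectorscale). The service name, password, and volume are placeholders:

```yaml
services:
  timescaledb:
    image: timescale/timescaledb-ha:pg16
    environment:
      POSTGRES_PASSWORD: password  # placeholder; match TIMESCALE_SERVICE_URL
    ports:
      - "5432:5432"
    volumes:
      - timescale_data:/home/postgres/pgdata/data

volumes:
  timescale_data:
```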
- Create a `.env` file with `TIMESCALE_SERVICE_URL` and `OPENAI_API_KEY` (example below)
- Set up the DB and upload data: `python src/upload_vectors.py` (see the upload sketch below)
- Run the main pipeline: `python src/main.py` (a reranking sketch follows below)
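For the local Docker setup above, the `.env` file looks roughly like this (placeholder values only):

```
TIMESCALE_SERVICE_URL=postgres://postgres:password@localhost:5432/postgres
OPENAI_API_KEY=sk-...
```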
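A minimal sketch of what the upload step does, assuming the `timescale-vector` Sync client and OpenAI embeddings; the table name `fineweb_chunks`, the metadata, and the sample passages are illustrative:

```python
import os
import uuid

from openai import OpenAI
from timescale_vector import client as tsv

openai_client = OpenAI()

def embed(texts: list[str]) -> list[list[float]]:
    # text-embedding-3-small returns 1536-dimensional vectors by default.
    response = openai_client.embeddings.create(
        model="text-embedding-3-small", input=texts
    )
    return [item.embedding for item in response.data]

# Sync client takes (service URL, table name, number of dimensions).
# "fineweb_chunks" is an illustrative table name, not the repo's.
store = tsv.Sync(os.environ["TIMESCALE_SERVICE_URL"], "fineweb_chunks", 1536)
store.create_tables()

chunks = ["First FineWeb passage ...", "Second FineWeb passage ..."]
store.upsert([
    # Records are (uuid, metadata, contents, embedding); time-based
    # UUIDs let timescale-vector partition the table by time.
    (uuid.uuid1(), {"source": "fineweb"}, text, vector)
    for text, vector in zip(chunks, embed(chunks))
])
```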
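And a minimal sketch of the reranking stage, using `BAAI/bge-base-en-v1.5` as a bi-encoder via sentence-transformers, i.e. cosine-similarity reranking (the model card also suggests a retrieval instruction prefix for queries, omitted here for brevity); the function name and `top_k` default are illustrative:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-base-en-v1.5")

def rerank(query: str, passages: list[str], top_k: int = 3) -> list[str]:
    # With normalized embeddings, the dot product is cosine similarity.
    query_vec = model.encode([query], normalize_embeddings=True)
    passage_vecs = model.encode(passages, normalize_embeddings=True)
    scores = (query_vec @ passage_vecs.T).ravel()
    best = np.argsort(-scores)[:top_k]
    return [passages[i] for i in best]

top = rerank(
    "What is the capital of France?",
    ["Paris is the capital of France.", "Berlin is the capital of Germany."],
)
```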
- Python 3.10+
- TimescaleDB (PostgreSQL)
- Docker
- FineWeb Dataset
- GPT-4.1-nano
- text-embedding-3-small
- BAAI/bge-base-en-v1.5
Python dependencies (installed via `requirements.txt`):

```
openai
numpy
timescale-vector
pandas
dotenv
sentence-transformers
transformers
pydantic
datasets
```
- Dong, G., et al. (2025). Omni-RAG: Leveraging LLM-Assisted Query Understanding for Live Retrieval-Augmented Generation.
- Dave Ebbelaar: pgvectorscale-rag-solution.
- Ran, K., et al. (2025). RMIT-ADM+S at the SIGIR 2025 LiveRAG Challenge.
Research prototype inspired by the official Omni-RAG pipeline.