PDF Document Q&A System with RAG and Gemini

Intelligent document analysis powered by RAG and Google's Gemini AI

Features

📄 PDF document processing with automatic chunking
💡 Natural language question answering
🎯 Source citations with page numbers
🔍 AI-powered analysis:
- Sentiment analysis
- Topic modeling
- Key insights extraction
- Contextual relevance scoring

Quick Start

Prerequisites

Python 3.11+
Google API key with access to the Gemini API

Document Placement

You can add documents in two ways:

Through the Web Interface
- Use the drag & drop interface
- Maximum 16MB per file

Manual Placement

# Place PDF files directly in the docs folder
project_root/
└── docs/
    ├── document1.pdf
    ├── document2.pdf
    └── ...

The system will automatically process any PDF files in the docs folder on startup.
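The startup scan described above can be pictured with a minimal sketch. It assumes the scan looks only at files directly inside docs/ and matches on the .pdf extension; the function name find_pdfs is hypothetical and is not taken from the project code.

```python
from pathlib import Path


def find_pdfs(docs_dir: str = "docs") -> list[Path]:
    """Return the PDF files directly inside docs_dir, sorted by name.

    Hypothetical helper: the real project may scan recursively or
    filter differently.
    """
    folder = Path(docs_dir)
    if not folder.is_dir():
        return []  # nothing to process if the folder does not exist
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
```

Calling find_pdfs("docs") from the project root would list the files the system is expected to pick up on startup.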

Installation

Setup Environment

# Clone and enter directory
git clone <repository-url> && cd <project-folder>

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Configure API Key

# Create .env file in backend directory
echo "GOOGLE_API_KEY=your_key_here" > backend/.env
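To confirm the key file was written as expected, a small standard-library check can help. This is only a sketch: it assumes the .env file holds plain KEY=VALUE lines, and the helper names read_env_file and has_api_key are hypothetical. The application itself may load the file differently.

```python
from pathlib import Path


def read_env_file(path: str = "backend/.env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_api_key(path: str = "backend/.env") -> bool:
    """True if GOOGLE_API_KEY is set and is not the placeholder value."""
    key = read_env_file(path).get("GOOGLE_API_KEY", "")
    return bool(key) and key != "your_key_here"
```

If has_api_key() returns False, the file is missing, the variable is absent, or the placeholder was left in place.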

Run Application

# Start server
chmod +x start.sh && ./start.sh

# Open in browser
http://localhost:5001

Usage

Upload Documents
- Drag & drop PDFs or click to browse
- Max file size: 16MB per PDF
- Supports text-based PDFs

Ask Questions
- Type your question
- Get answers with source citations
- View AI-powered insights
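The upload constraints above can be checked locally before sending a file. The following is a sketch under stated assumptions: it treats "16MB" as 16 * 1024 * 1024 bytes, which the README does not specify, and check_upload is a hypothetical helper rather than part of the project.

```python
from pathlib import Path

# Assumption: "16MB" is interpreted as 16 MiB; the server may count differently.
MAX_UPLOAD_BYTES = 16 * 1024 * 1024


def check_upload(path: str) -> tuple[bool, str]:
    """Return (ok, reason) for a candidate upload file."""
    file = Path(path)
    if file.suffix.lower() != ".pdf":
        return False, "not a PDF file"
    if not file.is_file():
        return False, "file not found"
    size = file.stat().st_size
    if size > MAX_UPLOAD_BYTES:
        return False, f"file is {size} bytes, over the {MAX_UPLOAD_BYTES}-byte limit"
    return True, "ok"
```

This only checks the extension and size. It cannot tell whether a PDF is text-based or scanned.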

Configuration

Key parameters can be adjusted in the following files:

# document_processor.py
chunk_size = 800        # Text chunk size
chunk_overlap = 50      # Overlap between chunks

# rag_pipeline.py
k = 8                  # Number of chunks to retrieve
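To illustrate how chunk_size and chunk_overlap interact, here is a minimal sliding-window sketch. It assumes both values are measured in characters, which the README does not state, and chunk_text is a hypothetical function. The project's document_processor.py may use a different splitter or different units.

```python
def chunk_text(text: str, chunk_size: int = 800, chunk_overlap: int = 50) -> list[str]:
    """Split text into windows of chunk_size characters.

    Each window starts (chunk_size - chunk_overlap) characters after the
    previous one, so neighbouring chunks share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks
```

With the defaults shown, a 2000-character text yields three chunks starting at offsets 0, 750, and 1500. A larger overlap preserves more context across chunk boundaries at the cost of more chunks to embed and search. The k setting then controls how many of those chunks are retrieved per question.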

Troubleshooting

Common issues and solutions:

- Port in Use: Change the port in start.sh or stop the process that is already using it
- PDF Errors: Ensure PDFs are text-based (not scanned)
- API Issues: Verify the API key and quota limits
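For the port conflict case, a quick check from Python can confirm whether something is already listening. This is a sketch that assumes the server binds to localhost. The port number 5001 comes from the Run Application step, and port_in_use is a hypothetical helper.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0
```

If port_in_use(5001) returns True before the server is started, another process holds the port.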

API Endpoints

POST /api/upload    # Upload PDF files
POST /api/ask      # Query documents
GET  /api/stats    # Get document statistics
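The endpoints can also be called from a script. The sketch below uses only the standard library and rests on assumptions the README does not confirm: that /api/ask accepts a JSON body with a "question" field, and that responses are UTF-8 text. The helper names are hypothetical, and the upload endpoint is omitted because its form field name is not documented here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5001"  # from the Run Application step


def build_ask_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST /api/ask request. The "question" field name is an assumption."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_stats_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a GET /api/stats request."""
    return urllib.request.Request(f"{base_url}/api/stats", method="GET")


def send(request: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send a request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the server running, send(build_stats_request()) would return the raw statistics response. Check the backend code for the actual request and response schema before relying on the field names used here.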
