
This Graph RAG application is a web-based tool that allows users to ask questions about the Mission: Impossible film franchise and receive detailed, contextually relevant answers. By combining retrieval-based methods with generative AI, the application ensures that responses are both accurate and engaging.


Mission Cipher 🕵️‍♂️

Welcome! This project is a labor of love, inspired by my deep admiration for the Mission: Impossible film franchise. Whether you're a die-hard fan or just curious about the intricate world of espionage, this application is designed to provide you with insightful and engaging information about the characters and plots from the Mission: Impossible universe.

Live Demo

Check out the live demo of Mission Cipher at cipher.neuralnets.dev to experience it in action!

Note 📝

I have only used Mission: Impossible film data for embedding and retrieving content, so this terminal only contains data specifically from Mission: Impossible films. Inquiries outside this scope cannot be answered.

Overview

Mission Cipher is a Graph Retrieval-Augmented Generation (GraphRAG) application designed to navigate complex information within the Mission: Impossible universe. Unlike conventional RAG systems that rely solely on text similarity, GraphRAG builds a structured knowledge graph to enable multi-step reasoning and relationship-aware retrieval.

Core Technologies

  • Graph Processing (NetworkX): Entities and events are modeled as nodes, relationships as typed edges, allowing traversal and multi-hop lookups.
  • Embeddings & Vector Matching (NumPy, Scikit‑learn): Semantic vectors are generated for entities and enriched with local context to enable graph-aware retrieval.
  • Large Language Model (Google Gemini API): Used for entity recognition, relationship extraction, and natural language response generation.
  • Frontend Interface (React, Tailwind CSS): Offers a styled terminal-like chat interface with live streaming responses.
  • Backend Framework (Flask): Handles query intake, subgraph assembly, and API endpoints for both interactive and programmatic access.

Why GraphRAG?

GraphRAG excels at:

  • Disambiguating entities by leveraging their relational context.
  • Performing multi-hop reasoning (e.g., character ➝ event ➝ organization).
  • Supporting thematic queries that go beyond keyword matching.
  • Assembling structured context for more complete, accurate responses.
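The multi-hop idea above (character ➝ event ➝ organization) can be sketched with NetworkX. The entities and edge types below are illustrative stand-ins, not the project's actual graph data:

```python
import networkx as nx

# Toy knowledge graph with hypothetical entities and relations;
# names and edge types are invented for illustration.
G = nx.MultiDiGraph()
G.add_edge("Ethan Hunt", "Ghost Protocol", key="participated_in")
G.add_edge("Ghost Protocol", "IMF", key="initiated_by")
G.add_edge("Jim Phelps", "IMF", key="member_of")

def two_hop(graph, start):
    """Enumerate (start, rel1, mid, rel2, end) paths two hops out from `start`."""
    paths = []
    for _, mid, r1 in graph.out_edges(start, keys=True):
        for _, end, r2 in graph.out_edges(mid, keys=True):
            paths.append((start, r1, mid, r2, end))
    return paths

print(two_hop(G, "Ethan Hunt"))
# character -> event -> organization in a single traversal
```

A plain text-similarity retriever would have to surface both hops in one passage to answer the same question; the graph makes the chain explicit.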

System Comparison

| Feature | Standard RAG | GraphRAG |
| --- | --- | --- |
| Entity/Relationship Awareness | Relies purely on text similarity | Embeds relationships in graph structure |
| Multi-Hop Reasoning | Not supported | Designed for recursive traversal and multi-step retrieval |
| Contextual Disambiguation | Often ambiguous with similarly named entities | Considers connected entities for clarity |
| Thematic Exploration | Limited to surface content retrieval | Traverses narrative relationships for deeper insight |

Architecture Overview

Frontend Layer

Built with React and styled via Tailwind CSS:

  • Simulates a command-line chat terminal
  • Streams responses as they are generated
  • Sends user queries to backend and renders JSON results

Backend Layer

Implemented using Flask:

  • Exposes the /query, /graph-stats, and /health endpoints
  • Converts queries into embeddings
  • Matches top entities via cosine similarity
  • Expands to context-aware subgraph via NetworkX
  • Compiles context and invokes Google Gemini for response generation
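The entity-matching step can be sketched with scikit-learn's `cosine_similarity`. The entity names and 4-dimensional vectors below are made up for illustration; the real application uses model-generated embeddings loaded at startup:

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical pre-computed entity embeddings (tiny vectors for clarity).
entity_names = ["Ethan Hunt", "Jim Phelps", "IMF"]
entity_vecs = np.array([
    [0.9, 0.1, 0.0, 0.2],
    [0.1, 0.8, 0.3, 0.0],
    [0.0, 0.2, 0.9, 0.1],
])

def top_k_entities(query_vec, k=2):
    """Rank entities by cosine similarity to the query embedding."""
    sims = cosine_similarity(query_vec.reshape(1, -1), entity_vecs)[0]
    order = np.argsort(sims)[::-1][:k]
    return [(entity_names[i], float(sims[i])) for i in order]

query = np.array([0.85, 0.15, 0.05, 0.2])  # stand-in for the embedded user query
print(top_k_entities(query))
```

The top-ranked entities then seed the subgraph expansion described above.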

Data Layer

  • Knowledge graph built offline via build_graph.py
  • Embeddings stored and loaded during runtime
  • Graph entities and edges serialized for rapid access

Processing Pipeline

Offline Graph Construction

Executed via build_graph.py, step-by-step:

  1. Load documents from structured JSON files
  2. Extract entities using Gemini LLM
  3. Identify entity relationships (e.g., "betrayed", "member of", "led by")
  4. Create a MultiDiGraph of entities and their typed connections
  5. Generate entity embeddings augmented with local subgraph context
  6. Serialize graph structure and embeddings for runtime usage
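A minimal sketch of steps 3–6, assuming the LLM's output has already been reduced to (subject, relation, object) triples; the triples and the `graph.pkl` filename below are invented examples:

```python
import pickle
import networkx as nx

# Stand-ins for Gemini's extracted entity/relationship triples.
triples = [
    ("Jim Phelps", "betrayed", "IMF"),
    ("Jim Phelps", "member_of", "IMF"),
    ("Ethan Hunt", "led_by", "Jim Phelps"),
]

G = nx.MultiDiGraph()
for subj, relation, obj in triples:
    # A MultiDiGraph allows parallel edges, so the same pair of
    # entities can carry several typed relationships.
    G.add_edge(subj, obj, relation=relation)

# Serialize the graph for fast loading at runtime.
with open("graph.pkl", "wb") as f:
    pickle.dump(G, f)
```

Embeddings would be serialized alongside the graph (e.g., as a NumPy array plus an entity index) so the runtime never repeats the expensive extraction step.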

Runtime Query Flow (app.py)

  1. Convert incoming user query into semantic embedding
  2. Retrieve top-k matching entities
  3. Expand to immediate relational neighborhood in the graph
  4. Build context from entity nodes and relationship edges
  5. Assemble and prime prompt for LLM response
  6. Return real-time, streamed responses to the frontend
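Steps 3–4, the neighborhood expansion and context assembly, can be sketched as follows; the graph contents are hypothetical:

```python
import networkx as nx

# Tiny stand-in graph; entities and relations are illustrative only.
G = nx.MultiDiGraph()
G.add_edge("Jim Phelps", "IMF", relation="betrayed")
G.add_edge("Ethan Hunt", "Jim Phelps", relation="led_by")

def build_context(graph, entity):
    """Collect one-hop relationship facts around `entity` for the LLM prompt."""
    # undirected=True so incoming edges count as neighbors too.
    sub = nx.ego_graph(graph, entity, radius=1, undirected=True)
    facts = [f"{u} --{d['relation']}--> {v}" for u, v, d in sub.edges(data=True)]
    return "\n".join(sorted(facts))

print(build_context(G, "Jim Phelps"))
```

The resulting fact lines are what get compiled into the prompt before Gemini generates the streamed answer.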

API Specification

| Method | Endpoint | Description | Response Format |
| --- | --- | --- | --- |
| GET | / | Returns the chat application interface | HTML |
| POST | /query | Accepts user prompt and returns generated answer | JSON |
| GET | /graph-stats | Provides metadata such as node and edge counts | JSON |
| GET | /health | Returns simple health check (status OK) | JSON |

Example usage:

curl -X POST https://cipher.neuralnets.dev/query \
  -H "Content-Type: application/json" \
  -d '{"query": "Explain the betrayal of Jim Phelps"}'

Installation & Local Setup

Prerequisites

  • Python ≥ 3.8
  • Google Gemini API key
  • Node.js (for frontend)

Setup Steps

git clone https://github.com/iamrahulreddy/cipher.git
cd cipher
python3 -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt
cp .env.example .env

Add your Gemini key to .env:

GEMINI_API_KEY=YOUR_API_KEY_HERE

Then run:

python build_graph.py    # Build graph and embeddings
python app.py            # Start API server

Visit: http://localhost:5000

Production Deployment (High-Level Guidance)

⚠️ These steps are generalized—adapt to your specific infrastructure and security requirements.

Requirements

  • Python ≥ 3.8
  • A production WSGI server such as Gunicorn
  • A reverse proxy server (e.g., Nginx, Apache)
  • SSL certificate (Let's Encrypt recommended)
  • At least 2 GB RAM dedicated for graph loading

Example Deployment Workflow

  1. Install dependencies and Gunicorn:
    pip install -r requirements.txt
  2. Configure your Nginx to proxy incoming traffic to Gunicorn:
    server {
      listen 80;
      server_name your_domain.com;
      location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
      }
    }
  3. (Optional) Set up HTTPS via Certbot.
  4. Create a systemd unit file to run Gunicorn as a service and enable it on startup.
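A minimal systemd unit for step 4 might look like the sketch below; the install path, user, and the `app:app` module path are placeholders to adapt to your setup:

```ini
# /etc/systemd/system/cipher.service
[Unit]
Description=Mission Cipher Gunicorn service
After=network.target

[Service]
User=www-data
WorkingDirectory=/opt/cipher
Environment="PATH=/opt/cipher/venv/bin"
ExecStart=/opt/cipher/venv/bin/gunicorn --workers 2 --bind 127.0.0.1:8000 app:app
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Enable it with `systemctl enable --now cipher.service`; the bind address matches the `proxy_pass` target in the Nginx example above.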

Performance Evaluation

I compared GraphRAG responses against Standard RAG responses using real-world franchise queries. In 4 out of 5 cases, GraphRAG produced superior answers, especially on multi-hop, thematic queries. See the evaluation files in the repository for more details.

Technology Stack

| Component | Technology |
| --- | --- |
| Backend Framework | Flask |
| Frontend | React, Tailwind CSS |
| Graph Engine | NetworkX |
| Semantic Embeddings | NumPy, Scikit-learn |
| Language Model | Google Gemini API |
| Data / Entity Tasks | Pandas |

Contributing

Contributions are welcome! If you have any ideas, suggestions, or bug reports, please open an issue or submit a pull request.

License

This project is licensed under the MIT License. See LICENSE.

Thank you for checking out Mission Cipher! I hope you enjoy exploring the world of Ethan Hunt and his team as much as I enjoyed creating this project. 🕵️‍♂️🎬
