This project implements a Retrieval-Augmented Generation (RAG) system that uses a large language model (LLM) to answer queries over unstructured data (e.g., PDFs). The system provides context-aware responses with privacy-preserving features such as PII redaction.
The architecture integrates several key components:
- Document database (e.g., PDFs)
- PII Redaction using Faker
- Context Construction using RAG
- Large Language Model (LLM): Llama 3.1 (via Ollama)
Each section below provides more details on these components and how they interact.
The system relies on a database containing unstructured documents, such as PDF files. These documents contain the information that the user queries against.
- Data type: PDFs or other unstructured data formats.
- Role: The source of information that feeds into the system for generating responses (a loading sketch follows this list).
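As an illustration of this stage, the sketch below loads a PDF and splits it into chunks for retrieval. It assumes the `pypdf` library and a hypothetical filename; the project's actual loader may differ.

```python
from pypdf import PdfReader  # assumption: pypdf is one common PDF text extractor

def load_pdf_text(path: str) -> str:
    """Extract plain text from every page of a PDF."""
    reader = PdfReader(path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)

def chunk_text(text: str, size: int = 500) -> list[str]:
    """Split the document into fixed-size chunks for retrieval."""
    return [text[i:i + size] for i in range(0, len(text), size)]

# "questionnaire.pdf" is a hypothetical filename used for illustration.
chunks = chunk_text(load_pdf_text("questionnaire.pdf"))
```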
Before any retrieved data is passed on for response generation, the system applies a PII (Personally Identifiable Information) redaction step to ensure that sensitive information is not exposed in the final responses.
- Tool used: Faker library for redacting PII.
- Function: Ensures that the context generated from the database is sanitized by replacing sensitive information with fake or anonymized data.
- Input: Context extracted from the database relevant to the query.
- Output: PII-redacted context passed on for further processing (see the sketch below).
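A minimal sketch of this step, assuming regex-based detection of emails and phone numbers (the actual detector may be more sophisticated), with Faker supplying the replacement values:

```python
import re
from faker import Faker

fake = Faker()

# Illustrative patterns only; detecting names would additionally
# require an NER step rather than a regex.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\+?\d[\d\s()-]{7,}\d"),
}

def redact(text: str) -> str:
    """Replace detected PII with realistic fake values from Faker."""
    text = PATTERNS["email"].sub(lambda _: fake.email(), text)
    text = PATTERNS["phone"].sub(lambda _: fake.phone_number(), text)
    return text

print(redact("Reach the patient at jane.doe@example.com or +61 2 9999 9999."))
```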
The heart of the system is the RAG (Retrieval-Augmented Generation) model, which constructs a context-aware query by retrieving relevant information from the database.
- Process:
- The query is sent to the RAG system, which retrieves the relevant context from the database; the retrieved context is PII-redacted as described above.
- The context relevant to the query is passed on to the LLM for generating a final response.
- Input: Query from the user.
- Output: A context-aware query sent to the LLM (a retrieval sketch follows this list).
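A minimal sketch of the retrieval step, assuming TF-IDF cosine similarity over the document chunks (an embedding-based search would have the same structure):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def retrieve(query: str, chunks: list[str], k: int = 3) -> list[str]:
    """Return the k document chunks most similar to the query."""
    vectorizer = TfidfVectorizer()
    matrix = vectorizer.fit_transform(chunks + [query])  # last row is the query
    scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    top = scores.argsort()[::-1][:k]
    return [chunks[i] for i in top]
```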
The final step in the system involves sending the context-aware query to the LLM (Llama 3.1, served via Ollama). The LLM generates a response based on the provided context.
- Model used: Llama 3.1 (Ollama).
- Function: Processes the context-aware query and generates a natural language response.
- Prompt template: Interpolates the retrieved context into the prompt and adds instructions that refine the LLM's response (see the sketch after this list).
- Input: Context-aware query from the RAG.
- Output: Final response delivered to the user.
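A minimal sketch of this step, calling Ollama's local REST API (`/api/generate` on its default port 11434). The prompt template here is an illustrative placeholder, not the project's actual template:

```python
import requests

# Hypothetical template; the real one may carry different instructions.
PROMPT_TEMPLATE = (
    "Answer the question using only the context below.\n\n"
    "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)

def generate(question: str, context: str) -> str:
    """Send the context-aware prompt to the local Llama 3.1 model via Ollama."""
    response = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3.1",
            "prompt": PROMPT_TEMPLATE.format(context=context, question=question),
            "stream": False,  # return one complete JSON object instead of a stream
        },
        timeout=120,
    )
    response.raise_for_status()
    return response.json()["response"]
```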
- User Query: A query is made by the user, which initiates the process.
- Context Retrieval: The system retrieves relevant data from the database.
- PII Redaction: Any sensitive information is redacted from the retrieved data.
- RAG Model: The RAG model constructs a context-aware query from the redacted data.
- LLM Response: The LLM processes the query and provides the final response to the user. The sketch below combines these steps.
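Taken together, the flow reduces to one function; `retrieve`, `redact`, and `generate` refer to the illustrative sketches above:

```python
def answer(query: str, chunks: list[str]) -> str:
    """End-to-end flow: retrieve context, redact PII, then ask the LLM."""
    context = "\n\n".join(retrieve(query, chunks))  # context retrieval
    safe_context = redact(context)                  # PII redaction
    return generate(query, safe_context)            # LLM response
```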
- User asks: "What are the questions involved in general health screening in preemployment health assessment?"
- The system retrieves the relevant section from the patient's PDF file.
- PII redaction replaces the patient's name and other sensitive details.
- The RAG model constructs a query with the anonymized context.
- The LLM generates a response.
- Download Ollama and follow its instructions to run `llama3.1`.
- Install the required packages: `pip install -r requirements.txt`
- Run the Flask API: `python app.py`
- Open `/ui/index.html` in your web browser to interact with the chatbot.
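For quick testing without the UI, a request can also be sent to the API directly. The `/chat` route and `query` field below are hypothetical; check `app.py` for the actual endpoint and payload:

```python
import requests

# Hypothetical route and payload shape, shown for illustration only.
reply = requests.post(
    "http://localhost:5000/chat",  # Flask's default port
    json={"query": "What questions are involved in general health screening?"},
)
print(reply.json())
```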
The PDF data is auto-generated by ChatGPT-4o, based on a sample blank NSW pre-employment questionnaire.