
Retrieval Augmented Generation with VectorDb, Hugging Face Embedders and Re-rankers

Repository Overview

This repository demonstrates the integration of Chroma DB, a vector database, with embedding and re-ranker models to build a robust Retrieval Augmented Generation (RAG) system.

Embedding Model Options

  1. Ollama embedding model
  2. Hugging Face text embedder (TEI)
  3. OpenAI embedding model

Re-ranker Integration (HTTP, gRPC)

To enhance the accuracy of RAG, we can incorporate Hugging Face re-ranker models. These models score the similarity between a query and the results retrieved from the vector database and rank those results by index, ensuring that the retrieved information is relevant and contextually accurate.

Example:
query := "What is Deep Learning?"
retrievedResults := []string{"Tomatoes are fruits...", "Deep Learning is not...", "Deep learning is..."}
Response: [{"index":2,"score":0.9987814},{"index":1,"score":0.022949383},{"index":0,"score":0.000076250595}]
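
Below is a minimal Go sketch of such a re-ranker call over HTTP. The request and response shapes follow Hugging Face Text Embeddings Inference (TEI); the URL, port, and /rerank path are assumptions that depend on how the re-ranker is deployed.

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// rerankRequest is the JSON payload sent to the re-ranker endpoint.
type rerankRequest struct {
	Query string   `json:"query"`
	Texts []string `json:"texts"`
}

// rerankResult mirrors the {"index": ..., "score": ...} objects shown above.
type rerankResult struct {
	Index int     `json:"index"`
	Score float64 `json:"score"`
}

// rerank posts the query and candidate texts to the re-ranker and decodes the ranked results.
func rerank(url, query string, texts []string) ([]rerankResult, error) {
	body, err := json.Marshal(rerankRequest{Query: query, Texts: texts})
	if err != nil {
		return nil, err
	}
	resp, err := http.Post(url, "application/json", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	var results []rerankResult
	err = json.NewDecoder(resp.Body).Decode(&results)
	return results, err
}

func main() {
	results, err := rerank("http://localhost:8080/rerank", // assumed re-ranker address
		"What is Deep Learning?",
		[]string{"Tomatoes are fruits...", "Deep Learning is not...", "Deep learning is..."})
	if err != nil {
		panic(err)
	}
	fmt.Println(results) // highest-scoring result first, identified by index
}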

This repository demonstrates how to combine embedding and reranking to develop a RAG system.

Steps followed to Implement this RAG System

  1. Set Up Vector Database:

    • Use Chroma DB to store your document embeddings.
    • Supports Ollama embedding models and Hugging Face TEI (Text Embeddings Inference).
  2. Preprocess Documents:

    • Split your documents into manageable chunks.
    • Generate embeddings for each chunk using an embedding model such as "nomic-embed-text" from Ollama.
  3. Store Embeddings:

    • Store the chunks and their corresponding embeddings in the Chroma DB vector database.
  4. Query Processing:

    • When you have a query:
      • Generate an embedding for the query.
      • Perform a similarity search within the vector database to identify the most relevant chunks based on their embeddings.
      • Retrieve these chunks as context for your query.
      • Re-rank the results using the Hugging Face re-ranker.
  5. Integrate with LLM Provider:

    • Supported LLM Providers
      • Ollama
      • OpenAI
  6. Create Prompt Template:

    • Design a prompt template that incorporates both the original query and the context retrieved from the vector database.
  7. Process with LLM:

    • Send the augmented prompt, including the query and reranked context, to the Large Language Model (LLM) for processing and generation of responses.

This approach enhances language processing tasks by leveraging the power of vector databases and advanced embedding models.
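
The query-time flow in steps 4 through 7 can be sketched in Go as a small pipeline. The Pipeline struct and its function fields (Embed, Query, Rerank, Prompt, Chat) are hypothetical names used for illustration only and are not this repository's actual API.

package rag

import "context"

// Pipeline wires together the query-time steps. The field signatures are
// illustrative assumptions; swap in the concrete Chroma DB, re-ranker, and
// LLM-provider calls used by the project.
type Pipeline struct {
	Embed  func(ctx context.Context, text string) ([]float32, error)                  // embedding model (e.g. Ollama or TEI)
	Query  func(ctx context.Context, embedding []float32, topK int) ([]string, error) // Chroma DB similarity search
	Rerank func(ctx context.Context, query string, docs []string) ([]string, error)   // Hugging Face re-ranker
	Prompt func(query string, docs []string) string                                   // prompt template
	Chat   func(ctx context.Context, prompt string) (string, error)                   // LLM provider (Ollama or OpenAI)
}

// Answer runs: embed query -> retrieve chunks -> re-rank -> build prompt -> call LLM.
func (p Pipeline) Answer(ctx context.Context, query string) (string, error) {
	emb, err := p.Embed(ctx, query)
	if err != nil {
		return "", err
	}
	chunks, err := p.Query(ctx, emb, 5) // retrieve the top-5 chunks as context
	if err != nil {
		return "", err
	}
	ranked, err := p.Rerank(ctx, query, chunks) // reorder chunks by relevance
	if err != nil {
		return "", err
	}
	return p.Chat(ctx, p.Prompt(query, ranked)) // send the augmented prompt to the LLM
}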

Sample Results

<|user|> what is mirostat_tau?</s>:-
Based on the provided content, I can answer your query.

**Query Result:** Mirostat_tau Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0)

**Document Content:**

mirostat_tau Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0)
float
mirostat_tau 5.0

**Additional Information on this Topic:**

Here are three main points related to Mirostat_tau:

1. **Coherence vs Diversity:** Mirostat_tau controls the balance between coherence and diversity of the output, which means it determines how focused or creative the generated text will be.
2. **Lower Values Mean More Focus:** A lower value for mirostat_tau results in more focused and coherent text, while a higher value allows for more diverse and potentially less coherent output.
3. **Default Value:** The default value for Mirostat_tau is 5.0, which means that if no specific value is provided, the model will generate text with a balance between coherence and diversity.

Please note that these points are based solely on the provided content and do not go beyond it.

Getting Started

Prerequisites

  • Go (>=1.22.0)
  • Docker
  • Docker Compose

Installation

  1. Clone the Repository
git clone https://github.com/rupeshtr78/chroma-db-rag.git
cd chroma-db-rag
  2. Install Go Packages
go mod tidy
  3. Build the Go Project
go build -o chroma-db cmd/main.go
  4. Set Up Docker Containers

Ensure Docker and Docker Compose are installed. Use the docker-compose.yaml to set up the Chroma DB service.

docker-compose up -d

Running the Project

./chroma-db
Usage 
  -load
        Load and embed the data in vectordb
        Provide the path to file Eg: "test/model_params.txt"
  -query
        Query the embedded data and rerank the results
        Provide the query Eg:"what is the difference between mirostat_tau and mirostat_eta?"
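
For example, assuming each flag takes its value as an argument, a typical session might look like:

./chroma-db -load "test/model_params.txt"
./chroma-db -query "what is the difference between mirostat_tau and mirostat_eta?"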

Project Structure

  • cmd/:

    • main.go: Entry point for the application.
    • chat/:
      • ollama_chat.go: Contains the logic for interacting with the Ollama chat model.
  • internal/constants/:

    • constants.go: Houses all the necessary constants used across the project.
  • docker-compose.yaml: Docker Compose configuration file for setting up the Chroma DB service.

Configuration

Adjust configuration values in internal/constants/constants.go to fit your needs. This includes settings like:

  • Chroma DB URL, tenant name, database, and namespace
  • Ollama model type and URL

Prompt Go Template

  <|system|> {{ .SystemPrompt }}</s>
  <|content|> {{ .Content }}</s>
  <|user|> {{ .Prompt }}</s>
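
A minimal sketch of rendering this template with Go's text/template package; the PromptData struct name and the sample values are illustrative assumptions, not the project's actual types.

package main

import (
	"os"
	"text/template"
)

// PromptData holds the fields referenced by the prompt template above.
type PromptData struct {
	SystemPrompt string
	Content      string
	Prompt       string
}

const promptTemplate = `<|system|> {{ .SystemPrompt }}</s>
<|content|> {{ .Content }}</s>
<|user|> {{ .Prompt }}</s>`

func main() {
	tmpl := template.Must(template.New("prompt").Parse(promptTemplate))
	data := PromptData{
		SystemPrompt: "Answer using only the provided content.",
		Content:      "mirostat_tau controls the balance between coherence and diversity...",
		Prompt:       "what is mirostat_tau?",
	}
	// Write the rendered prompt to stdout; in the RAG flow it would be sent to the LLM.
	if err := tmpl.Execute(os.Stdout, data); err != nil {
		panic(err)
	}
}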

Running VectorDB

Start the VectorDb with the following command:

docker compose up

Chat with Ollama

Execute chat-related operations:

go run ./cmd/main.go

Configuration

Default configuration values are provided in internal/constants/constants.go and can be adjusted as per your needs. Some of these include:

  • ChromaUrl, TenantName, Database, Namespace
  • OllamaModel and OllamaUrl
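
As a rough illustration, internal/constants/constants.go might look like the sketch below; the constant names follow the list above, while every value shown is an assumed default that should be replaced to match your deployment.

package constants

// Assumed default endpoints and names; adjust to your setup.
const (
	ChromaUrl   = "http://localhost:8000"  // Chroma DB server
	TenantName  = "default_tenant"
	Database    = "default_database"
	Namespace   = "rag_collection"         // collection/namespace for embeddings
	OllamaModel = "nomic-embed-text"       // embedding model mentioned above
	OllamaUrl   = "http://localhost:11434" // Ollama server
)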

License

This project is licensed under the BSD 3-Clause License - see the LICENSE file for details.

Acknowledgments

For any issues or contributions, please open an issue or submit a pull request on GitHub.