Repository Overview
This repository demonstrates the integration of Chroma DB, a vector database, with embedding models to develop a robust Retrieval Augmented Generation (RAG) system.
Embedding Model Options
- Ollama Embedding Model
- Hugging Face Text Embedder
- OpenAI Embedding Model
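Below is a minimal sketch of requesting an embedding from Ollama's HTTP API, assuming Ollama is running at its default address localhost:11434 and the "nomic-embed-text" model has been pulled; the struct names are illustrative, not the repository's actual code.

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// embedRequest mirrors the body of Ollama's /api/embeddings endpoint.
type embedRequest struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`
}

// embedResponse holds the embedding vector returned by Ollama.
type embedResponse struct {
	Embedding []float64 `json:"embedding"`
}

func main() {
	body, _ := json.Marshal(embedRequest{Model: "nomic-embed-text", Prompt: "Deep learning is..."})
	resp, err := http.Post("http://localhost:11434/api/embeddings", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var out embedResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	fmt.Printf("embedding has %d dimensions\n", len(out.Embedding))
}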
Re-ranker Integration (HTTP, gRPC)
To enhance the accuracy of RAG, we can incorporate Hugging Face re-ranker models. These models score the similarity between a query and the results retrieved from the vector DB, then rank the results by index, ensuring that the retrieved information is relevant and contextually accurate.
Example:
query := "What is Deep Learning?"
retrievedResults := []string{"Tomatoes are fruits...", "Deep Learning is not...", "Deep learning is..."}
Response: [{"index":2,"score":0.9987814},{"index":1,"score":0.022949383},{"index":0,"score":0.000076250595}]
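The response above matches the shape returned by Hugging Face's Text Embeddings Inference (TEI) /rerank endpoint. Below is a minimal sketch of calling it over HTTP, assuming a TEI re-ranker is serving at localhost:8080 (the address is an assumption; adjust it to your deployment).

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// rerankRequest mirrors TEI's /rerank request body.
type rerankRequest struct {
	Query string   `json:"query"`
	Texts []string `json:"texts"`
}

// rerankResult is one scored entry from TEI's /rerank response.
type rerankResult struct {
	Index int     `json:"index"`
	Score float64 `json:"score"`
}

func main() {
	body, _ := json.Marshal(rerankRequest{
		Query: "What is Deep Learning?",
		Texts: []string{"Tomatoes are fruits...", "Deep Learning is not...", "Deep learning is..."},
	})
	resp, err := http.Post("http://localhost:8080/rerank", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var results []rerankResult
	if err := json.NewDecoder(resp.Body).Decode(&results); err != nil {
		panic(err)
	}
	// Results arrive sorted by score; each index points back into Texts.
	for _, r := range results {
		fmt.Printf("index=%d score=%f\n", r.Index, r.Score)
	}
}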
This repository demonstrates how to combine embedding and reranking to develop a RAG system.
- Set Up Vector Database:
  - Use Chroma DB to store your document embeddings.
  - Support for Ollama embedding models and Hugging Face TEI.
- Preprocess Documents:
  - Split your documents into manageable chunks (a chunking sketch follows this list).
  - Generate embeddings for each chunk using an embedding model such as "nomic-embed-text" from Ollama.
- Store Embeddings:
  - Store the chunks and their corresponding embeddings in the Chroma DB vector database.
- Query Processing:
  - When you have a query:
    - Generate an embedding for the query.
    - Perform a similarity search within the vector database to identify the most relevant chunks based on their embeddings.
    - Retrieve these chunks as context for your query.
    - Rerank the results using the Hugging Face re-ranker.
- Integrate with LLM Provider:
  - Supported LLM providers:
    - Ollama
    - OpenAI
- Create Prompt Template:
  - Design a prompt template that incorporates both the original query and the context retrieved from the vector database.
- Process with LLM:
  - Send the augmented prompt, including the query and reranked context, to the Large Language Model (LLM) for processing and generation of responses.
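As referenced in the preprocessing step above, documents are split into chunks before embedding. Below is a minimal sketch of one way to do fixed-size chunking with overlap; the chunkText helper and the size values are illustrative assumptions, not the repository's actual implementation.

package main

import "fmt"

// chunkText splits text into chunks of at most size runes, with each chunk
// overlapping the previous one by overlap runes to preserve context across
// boundaries. size must be greater than overlap.
func chunkText(text string, size, overlap int) []string {
	runes := []rune(text)
	var chunks []string
	step := size - overlap
	for start := 0; start < len(runes); start += step {
		end := start + size
		if end > len(runes) {
			end = len(runes)
		}
		chunks = append(chunks, string(runes[start:end]))
		if end == len(runes) {
			break
		}
	}
	return chunks
}

func main() {
	doc := "mirostat_tau controls the balance between coherence and diversity of the output..."
	for i, c := range chunkText(doc, 40, 10) {
		fmt.Printf("chunk %d: %q\n", i, c)
	}
}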
This approach enhances language-processing tasks by leveraging the power of vector databases and advanced embedding models.
Example query and response:
<|user|> what is mirostat_tau?</s>
Based on the provided content, I can answer your query.
**Query Result:** Mirostat_tau Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0)
**Document Content:**
mirostat_tau Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0)
float
mirostat_tau 5.0
**Additional Information on this Topic:**
Here are three main points related to Mirostat_tau:
1. **Coherence vs Diversity:** Mirostat_tau controls the balance between coherence and diversity of the output, which means it determines how focused or creative the generated text will be.
2. **Lower Values Mean More Focus:** A lower value for mirostat_tau results in more focused and coherent text, while a higher value allows for more diverse and potentially less coherent output.
3. **Default Value:** The default value for Mirostat_tau is 5.0, which means that if no specific value is provided, the model will generate text with a balance between coherence and diversity.
Please note that these points are based solely on the provided content and do not go beyond it.
Prerequisites
- Go (>=1.22.0)
- Docker
- Docker Compose
Installation
- Clone the Repository
git clone https://github.com/yourusername/chroma-db.git
cd chroma-db
- Install Go Packages
go mod download
- Build the Go Project
go build -o chroma-db cmd/main.go
- Set Up Docker Containers
Ensure Docker and Docker Compose are installed. Use the docker-compose.yaml
to set up the Chroma DB service.
docker-compose up -d
- Run the Binary
./chroma-db
Usage
-load
Load and embed the data into the vector DB. Provide the path to a file, e.g. "test/model_params.txt".
-query
Query the embedded data and rerank the results. Provide the query, e.g. "what is the difference between mirostat_tau and mirostat_eta?".
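For example (the exact invocations below are assumptions based on the flags above):
./chroma-db -load "test/model_params.txt"
./chroma-db -query "what is the difference between mirostat_tau and mirostat_eta?"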
Project Structure
- cmd/:
  - main.go: Entry point for running the Chroma DB application.
  - chat/:
    - ollama_chat.go: Contains the logic for interacting with the Ollama chat model.
- internal/constants/:
  - constants.go: Houses all the necessary constants used across the project.
- docker-compose.yaml: Docker Compose configuration file for setting up the Chroma DB service.
Adjust the configuration values in internal/constants/constants.go to fit your needs. These include settings such as the Chroma DB URL, tenant name, database and namespace, and the Ollama model type and URL.
The chat prompt template has the following format:
<|system|> {{ .SystemPrompt }}</s>
<|content|> {{ .Content }}</s>
<|user|> {{ .Prompt }}</s>
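The template uses Go's text/template placeholder syntax. Below is a minimal sketch of rendering it; the PromptData struct and the sample values are assumptions for illustration, not the repository's actual code.

package main

import (
	"os"
	"text/template"
)

// PromptData holds the values substituted into the template.
type PromptData struct {
	SystemPrompt string // instructions for the model
	Content      string // reranked context retrieved from Chroma DB
	Prompt       string // the user's original query
}

const promptTemplate = `<|system|> {{ .SystemPrompt }}</s>
<|content|> {{ .Content }}</s>
<|user|> {{ .Prompt }}</s>`

func main() {
	tmpl := template.Must(template.New("prompt").Parse(promptTemplate))
	data := PromptData{
		SystemPrompt: "Answer using only the provided content.",
		Content:      "mirostat_tau Controls the balance between coherence and diversity of the output...",
		Prompt:       "what is mirostat_tau?",
	}
	// Render the augmented prompt to stdout.
	if err := tmpl.Execute(os.Stdout, data); err != nil {
		panic(err)
	}
}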
Start the vector DB with the following command:
docker compose up
Execute chat-related operations:
go run ./cmd/main.go
Default configuration values are provided in internal/constants/constants.go and can be adjusted as per your needs. Some of these include: ChromaUrl, TenantName, Database, Namespace, OllamaModel, and OllamaUrl.
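For illustration, a hypothetical constants.go might look like the sketch below; the values are assumed local defaults, not the repository's actual settings.

package constants

const (
	ChromaUrl   = "http://localhost:8000"  // Chroma DB endpoint (assumed default port)
	TenantName  = "default_tenant"         // assumed tenant name
	Database    = "default_database"       // assumed database name
	Namespace   = "default_namespace"      // assumed collection namespace
	OllamaModel = "nomic-embed-text"       // embedding model named earlier in this README
	OllamaUrl   = "http://localhost:11434" // assumed default Ollama address
)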
This project is licensed under the BSD 3-Clause License - see the LICENSE file for details.
For any issues or contributions, please open an issue or submit a pull request on GitHub.