This project demonstrates how to build a hybrid search engine for Retrieval-Augmented Generation (RAG) using Postgres with PgVector, and showcases asynchronous streaming with Groq's function calling capabilities in a FastAPI application.
Postgres serves as the vector database for a hybrid search engine that combines vector search and full-text search: PgVector stores and queries the vector embeddings, while a Groq Large Language Model uses function calling to retrieve information from the database through that hybrid search.
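To make the hybrid retrieval concrete, here is a minimal sketch of how the vector and text legs might be combined with SQLAlchemy and pgvector. It is an illustration rather than the repository's exact query: the `description` column, the async session, and the 0.7/0.3 weights are assumptions; only the `Product` model and its `embedding` column come from the project.

```python
# Minimal hybrid-search sketch (assumptions: `description` column, async session,
# and the score weights are illustrative, not the repo's actual code).
from sqlalchemy import func, select

from models.product import Product  # the repo's SQLAlchemy model


async def hybrid_search(session, query_text: str, query_embedding: list[float], k: int = 5):
    """Rank products by a blend of vector similarity and full-text relevance."""
    # Vector leg: pgvector's <#> operator returns the *negative* inner product,
    # so negating it gives a "higher is better" score. This matches the repo's
    # HNSW index, which is built with vector_ip_ops.
    vector_score = -Product.embedding.max_inner_product(query_embedding)

    # Text leg: classic Postgres full-text ranking (the `description` column
    # is assumed for illustration).
    text_score = func.ts_rank(
        func.to_tsvector("english", Product.description),
        func.plainto_tsquery("english", query_text),
    )

    # Blend the two signals with a simple weighted sum; reciprocal rank fusion
    # or tuned weights are common alternatives.
    stmt = (
        select(Product)
        .order_by((0.7 * vector_score + 0.3 * text_score).desc())
        .limit(k)
    )
    result = await session.execute(stmt)  # `session` is an AsyncSession
    return result.scalars().all()
```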
Key features:

- FastAPI-based implementation
- Hybrid search combining vector and text search capabilities
- Function calling for executing commands and retrieving information (a streaming sketch of this flow follows the list)
- Chat interface for real-time communication
- Product search and recommendation system
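The function-calling and chat features work together roughly as follows: the model is offered a search tool, calls it when a question needs catalog data, and then streams its final answer. Below is a minimal sketch of that flow under stated assumptions; the tool name `search_products`, the model id, and the `run_hybrid_search` helper are placeholders rather than the project's actual code.

```python
# Simplified sketch of streaming + function calling with Groq.
# Assumptions: tool name, model id, and `run_hybrid_search` are illustrative.
import json

from groq import AsyncGroq

client = AsyncGroq()  # reads GROQ_API_KEY from the environment


async def run_hybrid_search(query: str) -> list[dict]:
    # Placeholder: the real app would run the Postgres/PgVector hybrid query here.
    return [{"name": "example product", "score": 0.0}]


TOOLS = [{
    "type": "function",
    "function": {
        "name": "search_products",  # hypothetical tool name
        "description": "Hybrid search over the product catalog.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]


async def answer(user_message: str, model: str = "llama-3.3-70b-versatile"):
    """Yield the assistant's reply token by token (e.g. for an SSE response)."""
    messages = [{"role": "user", "content": user_message}]

    # First pass: let the model decide whether it needs the search tool.
    first = await client.chat.completions.create(
        model=model, messages=messages, tools=TOOLS, tool_choice="auto"
    )
    choice = first.choices[0].message

    if choice.tool_calls:
        # Echo the tool request into the history, run each tool, attach results.
        messages.append({
            "role": "assistant",
            "content": choice.content or "",
            "tool_calls": [tc.model_dump() for tc in choice.tool_calls],
        })
        for call in choice.tool_calls:
            args = json.loads(call.function.arguments)
            results = await run_hybrid_search(args["query"])
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": json.dumps(results),
            })

    # Second pass: stream the final answer.
    stream = await client.chat.completions.create(
        model=model, messages=messages, stream=True
    )
    async for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            yield delta
```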
You will need the following before getting started:

- Python 3.8 or higher
- PostgreSQL
- PgVector
Refer to `requirements.txt` for a complete list of dependencies. Key dependencies include:
- FastAPI
- OpenAI API (for generating embeddings; a short embedding sketch follows this list)
- Groq (for Large Language Model)
- SQLAlchemy
- PgVector
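Embeddings are generated with the OpenAI API; a minimal sketch is shown below. The `text-embedding-ada-002` model is inferred from the `embedding_ada002` index name, and the helper itself is illustrative rather than the repository's exact code.

```python
# Minimal embedding sketch. The model id is inferred from the index name
# "hnsw_index_for_innerproduct_product_embedding_ada002"; the helper is illustrative.
from openai import AsyncOpenAI

client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment


async def embed(text: str) -> list[float]:
    response = await client.embeddings.create(
        model="text-embedding-ada-002",
        input=text,
    )
    return response.data[0].embedding
```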
To set up the project:

1. Clone the repository.

2. Create and activate a virtual environment.

3. Install the dependencies:

    ```bash
    pip install -r requirements.txt
    ```

4. Install PostgreSQL:

    - Ubuntu:

      ```bash
      sudo apt-get install postgresql
      ```

    - macOS:

      ```bash
      brew install postgresql
      ```

    - Windows: download and install it from the official PostgreSQL website.

5. Create a database named `rag_example_schema`:

    ```
    psql -U postgres
    CREATE DATABASE rag_example_schema;
    ```

6. Enable the vector extension:

    ```
    \c rag_example_schema
    CREATE EXTENSION IF NOT EXISTS vector;
    \q
    ```

7. Set up environment variables: create a `.env` file in the root directory and add the following variables:

    ```
    OPENAI_API_KEY=your_openai_api_key
    GROQ_API_KEY=your_groq_api_key
    DATABASE_NAME=rag_example_schema
    DATABASE_USER=your_database_user
    DATABASE_PASSWORD=your_database_password
    DATABASE_URL=localhost
    DATABASE_PORT=5432
    ```

8. Temporarily disable the index in `models/product.py`, so that building the HNSW index does not slow down the initial data load, by commenting out the following lines:

    ```python
    # index_ada002 = Index(
    #     "hnsw_index_for_innerproduct_product_embedding_ada002",
    #     Product.embedding,
    #     postgresql_using="hnsw",
    #     postgresql_with={"m": 16, "ef_construction": 64},
    #     postgresql_ops={"embedding_ada002": "vector_ip_ops"},
    # )
    ```

9. Load the initial data:

    ```bash
    python scripts/load_data.py
    ```

10. Re-enable the index in `models/product.py` by uncommenting the lines from step 8:

    ```python
    index_ada002 = Index(
        "hnsw_index_for_innerproduct_product_embedding_ada002",
        Product.embedding,
        postgresql_using="hnsw",
        postgresql_with={"m": 16, "ef_construction": 64},
        postgresql_ops={"embedding_ada002": "vector_ip_ops"},
    )
    ```

11. Run the application:

    ```bash
    uvicorn main:app --reload
    ```
Your application should now be running at http://localhost:8000.
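Once the server is running you can try it from a small client. The sketch below is hypothetical: it assumes a streaming chat endpoint at `/chat` that accepts a JSON body with a `message` field, so check the FastAPI routes for the actual path and payload.

```python
# Hypothetical client for a streaming chat endpoint. The `/chat` route and the
# {"message": ...} payload are assumptions; check the FastAPI routes for the
# actual names. Requires `pip install httpx`.
import asyncio

import httpx


async def main() -> None:
    async with httpx.AsyncClient(base_url="http://localhost:8000", timeout=None) as client:
        async with client.stream(
            "POST", "/chat", json={"message": "Recommend a waterproof jacket"}
        ) as response:
            response.raise_for_status()
            async for chunk in response.aiter_text():
                print(chunk, end="", flush=True)  # print tokens as they stream in


if __name__ == "__main__":
    asyncio.run(main())
```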
Contributions are welcome! Please open an issue or submit a pull request with your changes.