This repository contains a Rust-based system for managing vector embeddings and querying them using a PostgreSQL-backed vector database. The system is designed to handle embedding generation, storage, and querying.
The system is composed of several modules that handle different aspects of the embedding and querying process:
- Commands: Handles command-line arguments and subcommands.
- Config: Manages configuration settings for embedding requests and database connections.
- Constants: Provides constant values used throughout the application.
- Embedding: Contains logic for generating embeddings and persisting them to the database.
- VectorDB: Handles interactions with the PostgreSQL database for storing and querying vector embeddings.
Key features:
- Embedding Generation: Generate vector embeddings from input data.
- Database Persistence: Store embeddings in a PostgreSQL database.
- Querying: Query the database to find nearest neighbors based on vector embeddings.
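To make "nearest neighbors" concrete, here is a minimal in-memory sketch that ranks stored vectors by cosine similarity to a query vector. It is only an illustration: in this project the search is delegated to the PostgreSQL vector database, the choice of cosine similarity is an assumption, and the function names cosine_similarity and nearest_neighbors are hypothetical.

```rust
/// Cosine similarity between two equal-length vectors.
/// Returns None if the lengths differ, the input is empty,
/// or either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b.iter()).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Returns up to `k` (index, similarity) pairs, most similar first.
/// Stored vectors that cannot be compared with the query are skipped.
fn nearest_neighbors(query: &[f32], stored: &[Vec<f32>], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = stored
        .iter()
        .enumerate()
        .filter_map(|(i, v)| cosine_similarity(query, v).map(|s| (i, s)))
        .collect();
    // Sort descending by similarity; total_cmp gives a total order on f32.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

A brute-force scan like this is linear in the number of stored vectors, which is one reason to push the search into a vector database once the collection grows.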
src/
├── app/
│ ├── commands.rs
│ ├── config.rs
│ └── constants.rs
├── embedding/
│ ├── run_embedding.rs
│ └── vector_embedding.rs
├── main.rs
├── tests/
│ ├── setup_docker.rs
│ ├── test_pgclient.rs
│ ├── test_query_vector.rs
│ ├── test_run_embedding.rs
│ └── test_vector_embedding.rs
├── vectordb/
│ ├── pg_vector.rs
│ └── query_vector.rs
├── lib.rs
├── vectordb/mod.rs
├── tests/mod.rs
├── embedding/mod.rs
└── app/mod.rs
Prerequisites:
- Rust (latest stable version)
- A PostgreSQL vector database
- Docker (for running tests)
- An active Ollama service with nomic-embed-text or a similar embedding model
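For the Ollama prerequisite, it may help to see what an embedding request looks like. The sketch below builds the JSON body accepted by Ollama's /api/embeddings endpoint, which takes a model and a prompt field. Whether this project calls that endpoint or the newer /api/embed one is an assumption that this README does not confirm, and json_escape and embeddings_request_body are hypothetical helper names. Only the standard library is used, to keep the sketch dependency-free.

```rust
/// Escapes a string for use inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Builds a request body of the form {"model": ..., "prompt": ...}.
/// Assumption: the service is reached via POST /api/embeddings.
fn embeddings_request_body(model: &str, prompt: &str) -> String {
    format!(
        "{{\"model\":\"{}\",\"prompt\":\"{}\"}}",
        json_escape(model),
        json_escape(prompt)
    )
}
```

In a real client a JSON library would normally handle the escaping; the hand-written version here only keeps the example self-contained.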
To install and run:
- Clone the repository:
  git clone https://github.com/rupeshtr78/pg-vector-embed-rust.git
  cd pg-vector-embed-rust
- Install dependencies and build:
  cargo build
- Start the PostgreSQL vector database (if not already running).
- Make sure the Ollama service is running with the specified model.
- Run the application:
  cargo run
The application supports various commands and subcommands. Use the --help flag to see available options:
cargo run -- --help
cargo run -- write --input "dog sound is called bark" --input "cat sounds is called purr" --model "nomic-embed-text" --table "from_rust2" --dim 768 --log-level "debug"
cargo run -- query --input "who is barking" --model "nomic-embed-text" --table "from_rust2"
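The two commands above share a small set of flags: --input (repeatable), --model, --table, --dim, and --log-level. As a rough sketch of how such arguments could be turned into typed values, the hand-rolled parser below uses only the standard library. The actual parsing in src/app/commands.rs is not shown in this README and may well rely on an argument-parsing crate; CliOptions, Command, and parse_args are hypothetical names.

```rust
/// Options that appear in the write and query examples.
#[derive(Debug, Default, PartialEq)]
struct CliOptions {
    inputs: Vec<String>,
    model: Option<String>,
    table: Option<String>,
    dim: Option<usize>,
    log_level: Option<String>,
}

#[derive(Debug, PartialEq)]
enum Command {
    Write(CliOptions),
    Query(CliOptions),
}

/// Parses the arguments that follow `cargo run --`.
fn parse_args(args: &[&str]) -> Result<Command, String> {
    let (sub, rest) = args
        .split_first()
        .ok_or_else(|| "missing subcommand".to_string())?;
    let mut opts = CliOptions::default();
    let mut it = rest.iter();
    while let Some(flag) = it.next() {
        // Every flag in the examples takes exactly one value.
        let value = it
            .next()
            .ok_or_else(|| format!("missing value for {}", flag))?
            .to_string();
        match *flag {
            "--input" => opts.inputs.push(value),
            "--model" => opts.model = Some(value),
            "--table" => opts.table = Some(value),
            "--dim" => {
                let dim = value
                    .parse::<usize>()
                    .map_err(|e| format!("invalid --dim: {}", e))?;
                opts.dim = Some(dim);
            }
            "--log-level" => opts.log_level = Some(value),
            other => return Err(format!("unknown flag: {}", other)),
        }
    }
    match *sub {
        "write" => Ok(Command::Write(opts)),
        "query" => Ok(Command::Query(opts)),
        other => Err(format!("unknown subcommand: {}", other)),
    }
}
```

Parsing --dim into a number up front means a mistyped dimension is rejected before any embedding request or database call is made.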
Configuration settings for embedding requests and database connections are managed in src/app/config.rs. You can modify these settings as needed.
- Generate Embeddings: Use the run_embedding function to generate embeddings and persist them to the database.
- Query Embeddings: Use the run_query function to query the database for nearest neighbors based on vector embeddings.
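The signatures of run_embedding and run_query are not shown here, so the sketch below does not call them. Instead it illustrates, under stated assumptions, the kind of SQL a nearest-neighbor query might issue if the database uses the pgvector extension: pgvector accepts vectors in the text form [x,y,z] and provides the <-> operator for Euclidean distance. The column names content and embedding, and all three function names, are hypothetical.

```rust
/// Formats an embedding in pgvector's text form, e.g. "[1,0.5,-2]".
fn to_pgvector_literal(embedding: &[f32]) -> String {
    let parts: Vec<String> = embedding.iter().map(|v| v.to_string()).collect();
    format!("[{}]", parts.join(","))
}

/// Table names cannot be bound as query parameters, so restrict them
/// to a conservative identifier pattern before splicing them into SQL.
fn is_safe_identifier(name: &str) -> bool {
    let mut chars = name.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {}
        _ => return false,
    }
    chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
}

/// Builds a k-nearest-neighbor query; `$1` is the query vector parameter.
/// Returns None if the table name fails the identifier check.
fn nearest_neighbor_sql(table: &str, k: usize) -> Option<String> {
    if !is_safe_identifier(table) {
        return None;
    }
    Some(format!(
        "SELECT content FROM {} ORDER BY embedding <-> $1::vector LIMIT {}",
        table, k
    ))
}
```

The query vector itself stays a bound parameter; only the table name and the limit are formatted into the statement, which is why the table name is validated first.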
The test suite requires a running PostgreSQL vector database and an Ollama service with an embedding model, both configured as the tests expect.
cargo test
Contributions are welcome! Please read the CONTRIBUTING.md file for details on how to contribute to this project.
This project is licensed under the MIT License - see the LICENSE file for details.