feat: postgres vector store #231

Merged (10 commits) on Jan 27, 2025

Contributor

@carlos-verdes commented Jan 22, 2025

Resolves #5, based on PR #157.

@milancermak mentioned not having time to finish his PR, so I created this one based on his implementation.

The PR so far contains only the library code, plus an integration test modeled on the rig-qdrant module: it starts a Docker container with Postgres + PgVector and simulates calls to OpenAI using a mocked API.

Differences from the previous PR:

  • removed the static lifetime
  • documents are stored as JSON in the database, so you can use any type that implements Serialize and Deserialize
  • support for more than one embedding per document (rig-qdrant does not support this and only takes the first vector); when more than one result hits the same document, only the nearest one is returned
  • uses sqlx instead of tokio-postgres
  • database creation happens outside the library code; the integration test provides an example setup (using sqlx migrations)
  • you can use any distance function supported by PgVector (it is not hardcoded to cosine)
  • added an example that can be launched with make run (from the rig-postgres folder); it loads environment variables from a .env file and handles documents with more than one embedding. It also runs migrations automatically on the database to make sure the Postgres tables are ready for the test.
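To make two of the points above concrete (a configurable distance function, and keeping only the nearest hit per document), here is a minimal std-only sketch. It is not the code from this PR: the `Distance` enum, the `Hit` struct, the `documents` table and its column names are all hypothetical, and the actual implementation may well do the deduplication in SQL rather than in Rust. The three operators shown are the ones pgvector documents for L2, cosine, and inner-product distance.

```rust
use std::collections::HashMap;

/// Hypothetical enum: a subset of the distance functions pgvector offers.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Distance {
    L2,
    Cosine,
    InnerProduct,
}

impl Distance {
    /// pgvector SQL operator for each distance function.
    fn operator(self) -> &'static str {
        match self {
            Distance::L2 => "<->",
            Distance::Cosine => "<=>",
            // pgvector's `<#>` returns the *negative* inner product.
            Distance::InnerProduct => "<#>",
        }
    }
}

/// Builds a search query. Table and column names are assumptions made
/// for illustration, not the schema used by rig-postgres.
fn search_sql(distance: Distance) -> String {
    format!(
        "SELECT document_id, document, embedding {op} $1 AS distance \
         FROM documents ORDER BY distance LIMIT $2",
        op = distance.operator()
    )
}

/// One search hit: a document may appear several times, once per embedding.
#[derive(Debug, Clone, PartialEq)]
struct Hit {
    document_id: String,
    distance: f64,
}

/// Keeps only the nearest hit for each document, sorted by distance.
fn nearest_per_document(hits: Vec<Hit>) -> Vec<Hit> {
    let mut best: HashMap<String, Hit> = HashMap::new();
    for hit in hits {
        let is_better = best
            .get(&hit.document_id)
            .map_or(true, |current| hit.distance < current.distance);
        if is_better {
            best.insert(hit.document_id.clone(), hit);
        }
    }
    let mut out: Vec<Hit> = best.into_values().collect();
    out.sort_by(|a, b| a.distance.total_cmp(&b.distance));
    out
}

fn main() {
    println!("{}", search_sql(Distance::Cosine));

    // "doc-1" has two embeddings that both matched the query.
    let hits = vec![
        Hit { document_id: "doc-1".into(), distance: 0.42 },
        Hit { document_id: "doc-2".into(), distance: 0.30 },
        Hit { document_id: "doc-1".into(), distance: 0.10 },
    ];
    for hit in nearest_per_document(hits) {
        println!("{} {:.2}", hit.document_id, hit.distance);
    }
}
```

Doing the deduplication in Rust keeps the sketch independent of any database. Against a real Postgres instance the same effect could presumably be achieved in the query itself.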

Pending:

  • add documentation

@carlos-verdes marked this pull request as draft January 22, 2025 17:56
@carlos-verdes
Contributor Author

@0xMochan @cvauclair would you mind taking a look at this?

@cvauclair
Contributor

Hey @carlos-verdes just saw your ping, wasn't sure if the PR was ready for review since it's still a draft but I'll take a look later today!

Thanks a lot for the PR, let's get this postgres integration done 🦾

@cvauclair self-requested a review January 23, 2025 18:30
@carlos-verdes
Contributor Author

It's in draft because the documentation is not finished, but the code, example, and integration test are ready.

I can add an example using Streams later if that would be useful!

Contributor

@cvauclair left a comment

Looks solid! I added a couple suggestions/comments, please take a look when you have a moment.

Cheers!

rig-postgres/README.md (resolved)
rig-postgres/src/lib.rs (outdated, resolved)
rig-postgres/Cargo.toml (outdated, resolved)
rig-postgres/src/lib.rs (outdated, resolved)
rig-postgres/Cargo.toml (resolved)
@carlos-verdes
Contributor Author

@cvauclair thanks for the review. I updated the code based on your comments; take a look and let me know if you want me to squash the commits.

@carlos-verdes marked this pull request as ready for review January 24, 2025 18:27
Contributor

@cvauclair left a comment

Awesome work, thanks for the contribution @carlos-verdes!

@cvauclair merged commit 9a12942 into 0xPlaygrounds:main Jan 27, 2025
4 checks passed
@github-actions bot mentioned this pull request Jan 27, 2025
Linked issue: feat: Add support for PostgreSQL vector store
2 participants