Vector Search and Semantic Caching #417

Merged
merged 39 commits into from
Dec 5, 2023

Conversation

slorello89
Member

Introduces Vector Search and Semantic caching to Redis OM .NET.

One breaking change: `string[]` is replaced with `object[]` in some key places (e.g. `Execute`/`ExecuteAsync`), because the byte arrays needed for vector search must be passed in raw. This should be transparent to anyone using the higher-level APIs within Redis OM, but anyone calling those raw commands directly might need to make some adjustments. See the README for details on how to use the new API.
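
To illustrate why the raw bytes matter, here is a minimal sketch in plain C# with no Redis OM calls. `EmbeddingBytes` and its methods are hypothetical names invented for this example, and the little-endian float32 layout is an assumption. It packs a `float[]` into a byte blob and checks whether that blob survives a round trip through a UTF-8 string.

```cs
using System;
using System.Buffers.Binary;
using System.Linq;
using System.Text;

// Hypothetical helper for illustration only; not part of Redis OM.
public static class EmbeddingBytes
{
    // Pack each float as 4 little-endian bytes (assumed blob layout).
    public static byte[] ToBytes(float[] vector)
    {
        var blob = new byte[vector.Length * sizeof(float)];
        for (var i = 0; i < vector.Length; i++)
        {
            BinaryPrimitives.WriteInt32LittleEndian(
                blob.AsSpan(i * sizeof(float)),
                BitConverter.SingleToInt32Bits(vector[i]));
        }
        return blob;
    }

    // Inverse of ToBytes: read 4 bytes at a time back into floats.
    public static float[] FromBytes(byte[] blob)
    {
        var vector = new float[blob.Length / sizeof(float)];
        for (var i = 0; i < vector.Length; i++)
        {
            vector[i] = BitConverter.Int32BitsToSingle(
                BinaryPrimitives.ReadInt32LittleEndian(blob.AsSpan(i * sizeof(float))));
        }
        return vector;
    }

    // True only when the blob is unchanged after being decoded to a
    // UTF-8 string and encoded back to bytes.
    public static bool SurvivesStringRoundTrip(byte[] blob)
    {
        var roundTripped = Encoding.UTF8.GetBytes(Encoding.UTF8.GetString(blob));
        return roundTripped.SequenceEqual(blob);
    }
}
```

For a vector such as `{ 1.0f, -2.5f }` the blob contains bytes like `0x80` that are not valid UTF-8 on their own, so the string round trip replaces them and the check returns `false`. Passing the blob as a `byte[]` element of an `object[]` avoids that conversion.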

@slorello89
Member Author

Some broken tests at first (need to obtain keys for the integration tests with OpenAI/HuggingFace/Azure)

@slorello89
Member Author

FYI @Spartee, @tylerhutcherson, & @banker

@bsbodden bsbodden left a comment

L-very-GTM!

@tylerhutcherson tylerhutcherson left a comment

Heck yeah. This is awesome, Steve. I left a few README suggestions/ideas, mainly focused on clarity and flow. Where will this be represented in the docs?

README.md Outdated

A `Vector<T>` is a representation of an object that can be transformed into a vector by a Vectorizer.

A `VectorizerAttribute` is the abstract class you use to decorate your Vector fields; it is responsible for defining the logic to convert your Vectors into Embeddings. In the package `Redis.OM.Vectorizers` we provide vectorizers for HuggingFace, OpenAI, and AzureOpenAI to allow you to easily integrate them into your workflows.

The phrase "convert your Vectors into Embeddings" is a bit misleading, as those two terms are relatively interchangeable. I think we're essentially talking about the definition of the various vector field attributes like distance metric, data type, dims, etc.? Some of those are subsumed by the choice of vectorizer, for sure.

Member Author

changed the verbiage a bit - hopefully this is better?

README.md Outdated
[RedisIdField]
public string Id { get; set; }

[Indexed(DistanceMetric = DistanceMetric.COSINE)]

Can you also show how a few other vector field attributes like index type (HNSW vs FLAT) and related args are set here?

Member Author

Added a couple of other parameters for the index definition, and explained it a bit better in the modeling section.
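
For readers unfamiliar with the options discussed in this thread, the following is a rough sketch of what a cosine distance metric and a FLAT (exhaustive) index compute. It is plain C# and does not reflect how Redis or Redis OM implement them; `FlatSearch` and its members are hypothetical names. An HNSW index answers the same kind of query approximately through a graph structure rather than by scanning every vector.

```cs
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration; not the Redis or Redis OM implementation.
public static class FlatSearch
{
    // Cosine distance = 1 - cosine similarity.
    // The result is NaN if either vector has zero length (norm of 0).
    public static double CosineDistance(float[] a, float[] b)
    {
        if (a.Length != b.Length)
        {
            throw new ArgumentException("Vectors must have the same dimension.");
        }

        double dot = 0, normA = 0, normB = 0;
        for (var i = 0; i < a.Length; i++)
        {
            dot += (double)a[i] * b[i];
            normA += (double)a[i] * a[i];
            normB += (double)b[i] * b[i];
        }

        return 1.0 - dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Exhaustive scan: score every stored vector and keep the k closest.
    public static List<(string Id, double Distance)> NearestNeighbors(
        IReadOnlyDictionary<string, float[]> index, float[] query, int k)
    {
        return index
            .Select(kv => (Id: kv.Key, Distance: CosineDistance(kv.Value, query)))
            .OrderBy(r => r.Distance)
            .Take(k)
            .ToList();
    }
}
```

The exhaustive scan is exact but its cost grows linearly with the number of stored vectors, which is the usual reason to consider an approximate index for larger collections.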


With Redis OM, the embeddings can be completely transparent to you: they are generated and bound to the `Vector<T>` when you query or insert your vectors. If, however, you need an embedding after the insertion or query, it is available at `Vector<T>.Embedding` and can be read as raw bytes, as an array of doubles, or as an array of floats (depending on your vectorizer).

#### Configuration

Add other vector field attribute level configuration details here too?

Member Author

Added details about index definition in the modeling section as that's more or less where it belongs (the configuration section is talking about configuring the vectorizers)

README.md Outdated
With the vector defined in our model, all we need to do is create Vectors of the generic type, and insert them with our model. Using our `RedisCollection`, you can do this by simply using `Insert`:

```cs
var query = new OpenAIQuery
```

naming this query makes sense given the implied caching use case here. Maybe spell it out a bit so it's clear why we are inserting a "query" object into your vector database?

Member Author

Query is confusing in this context. Are `OpenAICompletionResult` and `completionResult` better? (It's something other than a query that you might actually do with these embeddings, lol.)
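
To spell out the caching use case raised in this thread, here is a small in-memory sketch of the idea behind semantic caching: store a response alongside the embedding of its prompt, then reuse it for later prompts whose embeddings are close enough. This is not the Redis OM API. `InMemorySemanticCache`, the `embed` delegate, and the distance threshold are all assumptions made for illustration.

```cs
using System;
using System.Collections.Generic;

// Hypothetical in-memory sketch of semantic caching; not the Redis OM API.
public sealed class InMemorySemanticCache
{
    private readonly List<(float[] Embedding, string Response)> _entries =
        new List<(float[] Embedding, string Response)>();
    private readonly Func<string, float[]> _embed;
    private readonly double _threshold;

    // embed: any function that turns text into a fixed-size vector.
    // threshold: maximum cosine distance that still counts as a hit.
    public InMemorySemanticCache(Func<string, float[]> embed, double threshold)
    {
        _embed = embed;
        _threshold = threshold;
    }

    public void Store(string prompt, string response)
    {
        _entries.Add((_embed(prompt), response));
    }

    // Returns the response stored for the closest prompt within the
    // threshold, or null when nothing is close enough.
    public string Lookup(string prompt)
    {
        var query = _embed(prompt);
        string best = null;
        var bestDistance = double.MaxValue;
        foreach (var entry in _entries)
        {
            var distance = CosineDistance(entry.Embedding, query);
            if (distance <= _threshold && distance < bestDistance)
            {
                best = entry.Response;
                bestDistance = distance;
            }
        }
        return best;
    }

    private static double CosineDistance(float[] a, float[] b)
    {
        double dot = 0, normA = 0, normB = 0;
        for (var i = 0; i < a.Length; i++)
        {
            dot += (double)a[i] * b[i];
            normA += (double)a[i] * a[i];
            normB += (double)b[i] * b[i];
        }
        return 1.0 - dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }
}
```

In a real setup the `embed` delegate would call an embedding model and the entries would live in a vector index rather than a list; the lookup logic is what makes inserting a prompt-and-response object into the database useful.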

@slorello89 slorello89 linked an issue Dec 5, 2023 that may be closed by this pull request
@slorello89 slorello89 merged commit cf41ed7 into main Dec 5, 2023
1 check passed
Development

Successfully merging this pull request may close these issues.

Enhancement: Add support for "vector similarity" search
3 participants