Replies: 1 comment
You can probably use our grouping API for this and group by the URL. Make sure you configure a payload index on the URL field so that grouping stays efficient. https://qdrant.tech/documentation/concepts/search/#search-groups
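To make the suggestion concrete, here is a rough sketch of the two REST requests involved: one that creates a keyword payload index on the URL field, and one that runs a grouped search. The collection name `pages`, the payload field name `url`, and the helper function names are assumptions for illustration. The endpoint paths and body fields follow my reading of the Qdrant REST API, so check them against the linked documentation for the version you run. The code only builds the requests and does not send anything.

```python
import json


def payload_index_request(collection: str, field: str = "url"):
    """Build the request that indexes a payload field as a keyword.

    The field name "url" is an assumption about how the crawler stores it.
    """
    path = f"/collections/{collection}/index"
    body = {"field_name": field, "field_schema": "keyword"}
    return "PUT", path, body


def search_groups_request(collection: str, query_vector, field: str = "url",
                          limit: int = 10, group_size: int = 1):
    """Build a grouped search request: at most `limit` groups (one per URL),
    each holding up to `group_size` of its best-scoring points."""
    path = f"/collections/{collection}/points/search/groups"
    body = {
        "vector": list(query_vector),
        "group_by": field,
        "limit": limit,
        "group_size": group_size,
        "with_payload": True,
    }
    return "POST", path, body


if __name__ == "__main__":
    # "pages" is a hypothetical collection name; the vector is a placeholder.
    for method, path, body in (
        payload_index_request("pages"),
        search_groups_request("pages", [0.1, 0.2, 0.3]),
    ):
        print(method, path, json.dumps(body))
```

With `group_size=1`, each returned group carries only the single best hit for that URL, which is probably the closest match to "one result per URL".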
Background
I'm creating an open-source search engine that uses Qdrant as the backend vector store for semantic search. I add multiple vectors for each URL that I crawl. The number of vectors depends on the size of the page: the larger the page, the more vectors. Each vector's payload contains the URL it's associated with.
Current Method
I embed the user's query and use KNN to get the 100 closest vectors. I then use each hit's similarity score and the URL from its payload to sort the URLs into priority order, and return them to the user.
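For reference, the client-side step described above looks roughly like the sketch below. It assumes that a higher score means a closer match (as with cosine similarity) and that a URL is ranked by its single best-scoring vector. That aggregation rule is only an assumption for illustration, and `Hit` and `rank_urls` are hypothetical names.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    """One KNN result: similarity score plus the URL from the payload."""
    score: float
    url: str


def rank_urls(hits: Iterable[Hit]) -> List[str]:
    """Collapse per-vector hits into a list of unique URLs, best first.

    Assumption: each URL is ranked by the highest score among its vectors.
    """
    best = {}
    for hit in hits:
        if hit.url not in best or hit.score > best[hit.url]:
            best[hit.url] = hit.score
    return sorted(best, key=best.get, reverse=True)


if __name__ == "__main__":
    # Made-up hits standing in for the top results of a KNN query.
    sample = [
        Hit(0.90, "https://example.com/a"),
        Hit(0.50, "https://example.com/b"),
        Hit(0.95, "https://example.com/b"),
        Hit(0.70, "https://example.com/c"),
    ]
    print(rank_urls(sample))
```

One side effect of this approach is that a large page can take up many of the 100 slots, so the number of distinct URLs returned varies from query to query.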
Discussion
I'm worried about this method's scalability. Ideally, I'd have a single URL with multiple vectors associated with it. This would remove the need for sorting, as it would already have been done during the KNN search. Is this something you support already, or are you thinking of supporting?