-
Notifications
You must be signed in to change notification settings - Fork 113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add jni interface to use a binary hnsw index with faiss #1747
Add jni interface to use a binary hnsw index with faiss #1747
Conversation
1e329c5
to
859c8a9
Compare
This comment was marked as resolved.
This comment was marked as resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed this partly
Will do. So far, the error is coming from https://github.com/opensearch-project/k-NN/pull/1747/files#diff-892c5ae9158d35b9f5fa9b67d53d7373ca6d2bf04d1052add99c329d03645409R110 as index description is not passed to the method. |
e91ec4a
to
a0cc58c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall. A few things to see what you think
// Setting the ef_search value equal to what was provided during index creation. SearchParametersHNSW has a default | ||
// value of ef_search = 16 which will then be used. | ||
hnswParams.efSearch = hnswReader->hnsw.efSearch; | ||
hnswParams.sel = idSelector.get(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will have to change for efsearch and nprobs as both will be merged for 2.16. can you please add a TODO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO might not be enough to catch it. Let me create GH issue instead. #1769
@@ -247,6 +235,19 @@ jlong knn_jni::faiss_wrapper::LoadIndex(knn_jni::JNIUtilInterface * jniUtil, JNI | |||
return (jlong) indexReader; | |||
} | |||
|
|||
jlong knn_jni::faiss_wrapper::LoadBinaryIndex(knn_jni::JNIUtilInterface * jniUtil, JNIEnv * env, jstring indexPathJ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should Indexservice have loadIndex
? that way we pick the right implementation of loadIndex and the logic is in the right place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Didn't add it now because wasn't sure if current approach of using index service will be accepted.
Once this PR is merged, will move the load index method into index service in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the refactors and I think it looks good. One thing - can we run cpp unit tests with valgrind on linux to see if we are leaking any memory?
class IndexService { | ||
public: | ||
IndexService(std::unique_ptr<FaissMethods> faissMethods); | ||
virtual void createIndex( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we put parameter definitions?
* | ||
* This class helps to mock faiss methods during unit test | ||
*/ | ||
class FaissMethods { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we missing some methods? i.e. load, free, search,...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't moved all methods yet. We will add incrementally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets add todo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't want to go through code and see what is needed to be in this class and add them as todo. We can add methods as needed base. For example, we don't need to add search method here because search is a method of Index struct which can be mocked without this class.
d2e484f
to
6345901
Compare
Signed-off-by: Heemin Kim <heemin@amazon.com>
} | ||
} | ||
|
||
IndexService::IndexService(std::unique_ptr<FaissMethods> faissMethods) : faissMethods(std::move(faissMethods)) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The std::move is an interesting choice. But what my thinking is why we are using std::move here rather than just using a pointer or do a call by reference if we are concerned about copying of objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Call by reference increase a chance of memory leak by bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as this is an object, and you used the unique pointer, I doubt its the case. But if we 100% sure that this will not cause any performance issues and other memory leak I am okay to approve the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This std::move is passing its ownership. If you don't use it, the ownership won't be transferred to IndexService and the memory can be released while IndexService sill needs it.
fe14b21
into
opensearch-project:feature/binary-format
Signed-off-by: Heemin Kim <heemin@amazon.com>
Signed-off-by: Heemin Kim <heemin@amazon.com>
…roject#1778) * Add jni interface to use a binary hnsw index with faiss (opensearch-project#1747) Signed-off-by: Heemin Kim <heemin@amazon.com> * Fix memory leak on test code (opensearch-project#1776) Signed-off-by: Heemin Kim <heemin@amazon.com> --------- Signed-off-by: Heemin Kim <heemin@amazon.com>
…roject#1778) * Add jni interface to use a binary hnsw index with faiss (opensearch-project#1747) Signed-off-by: Heemin Kim <heemin@amazon.com> * Fix memory leak on test code (opensearch-project#1776) Signed-off-by: Heemin Kim <heemin@amazon.com> --------- Signed-off-by: Heemin Kim <heemin@amazon.com>
…roject#1747) Signed-off-by: Heemin Kim <heemin@amazon.com>
…roject#1747) Signed-off-by: Heemin Kim <heemin@amazon.com>
…roject#1778) * Add jni interface to use a binary hnsw index with faiss (opensearch-project#1747) Signed-off-by: Heemin Kim <heemin@amazon.com> * Fix memory leak on test code (opensearch-project#1776) Signed-off-by: Heemin Kim <heemin@amazon.com> --------- Signed-off-by: Heemin Kim <heemin@amazon.com>
Description
Add jni interface to use a binary hnsw index with faiss
SearchParameter is not supported with binary hnsw index in faiss library yet. Therefore, multi vector, pre filtering, and ef-search parameter passing won't work with binary index. Support of SearchParameter in binary index with faiss will be handled either by a PR in faiss repo or faiss custom patch in a separate PR.
Issues Resolved
N/A
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.