Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add jni interface to use a binary hnsw index with faiss #1747

Merged
merged 1 commit into from
Jun 25, 2024

Conversation

heemin32
Copy link
Collaborator

@heemin32 heemin32 commented Jun 11, 2024

Description

Add jni interface to use a binary hnsw index with faiss

SearchParameter is not supported with binary hnsw index in faiss library yet. Therefore, multi vector, pre filtering, and ef-search parameter passing won't work with binary index. Support of SearchParameter in binary index with faiss will be handled either by a PR in faiss repo or faiss custom patch in a separate PR.

Issues Resolved

N/A

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

CHANGELOG.md Outdated Show resolved Hide resolved
jni/src/faiss_wrapper.cpp Outdated Show resolved Hide resolved
src/main/java/org/opensearch/knn/jni/FaissService.java Outdated Show resolved Hide resolved
@navneet1v

This comment was marked as resolved.

Copy link
Contributor

@shatejas shatejas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed this partly

src/main/java/org/opensearch/knn/jni/JNIService.java Outdated Show resolved Hide resolved
src/test/java/org/opensearch/knn/jni/JNIServiceTests.java Outdated Show resolved Hide resolved
src/test/java/org/opensearch/knn/jni/JNIServiceTests.java Outdated Show resolved Hide resolved
jni/src/faiss_wrapper.cpp Outdated Show resolved Hide resolved
jni/src/faiss_wrapper.cpp Outdated Show resolved Hide resolved
jni/src/faiss_wrapper.cpp Show resolved Hide resolved
jni/src/jni_util.cpp Show resolved Hide resolved
@heemin32
Copy link
Collaborator Author

@heemin32 lets fix the GH actions too.

Will do. So far, the error is coming from https://github.com/opensearch-project/k-NN/pull/1747/files#diff-892c5ae9158d35b9f5fa9b67d53d7373ca6d2bf04d1052add99c329d03645409R110 as index description is not passed to the method.

@heemin32 heemin32 force-pushed the binary branch 10 times, most recently from e91ec4a to a0cc58c Compare June 14, 2024 17:03
@heemin32 heemin32 requested a review from navneet1v June 24, 2024 05:25
Copy link
Contributor

@shatejas shatejas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall. A few things to see what you think

jni/include/faiss_methods.h Show resolved Hide resolved
Comment on lines +453 to +456
// Setting the ef_search value equal to what was provided during index creation. SearchParametersHNSW has a default
// value of ef_search = 16 which will then be used.
hnswParams.efSearch = hnswReader->hnsw.efSearch;
hnswParams.sel = idSelector.get();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will have to change for efsearch and nprobs as both will be merged for 2.16. can you please add a TODO

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO might not be enough to catch it. Let me create GH issue instead. #1769

jni/tests/faiss_index_service_test.cpp Show resolved Hide resolved
@@ -247,6 +235,19 @@ jlong knn_jni::faiss_wrapper::LoadIndex(knn_jni::JNIUtilInterface * jniUtil, JNI
return (jlong) indexReader;
}

jlong knn_jni::faiss_wrapper::LoadBinaryIndex(knn_jni::JNIUtilInterface * jniUtil, JNIEnv * env, jstring indexPathJ) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should Indexservice have loadIndex? that way we pick the right implementation of loadIndex and the logic is in the right place?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Didn't add it now because wasn't sure if current approach of using index service will be accepted.
Once this PR is merged, will move the load index method into index service in a separate PR.

src/main/java/org/opensearch/knn/index/IndexUtil.java Outdated Show resolved Hide resolved
src/main/java/org/opensearch/knn/index/KNNIndexShard.java Outdated Show resolved Hide resolved
Copy link
Member

@jmazanec15 jmazanec15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the refactors and I think it looks good. One thing - can we run cpp unit tests with valgrind on linux to see if we are leaking any memory?

jni/include/commons.h Show resolved Hide resolved
class IndexService {
public:
IndexService(std::unique_ptr<FaissMethods> faissMethods);
virtual void createIndex(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we put parameter definitions?

jni/include/faiss_index_service.h Show resolved Hide resolved
*
* This class helps to mock faiss methods during unit test
*/
class FaissMethods {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we missing some methods? i.e. load, free, search,...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't moved all methods yet. We will add incrementally.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets add todo

Copy link
Collaborator Author

@heemin32 heemin32 Jun 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't want to go through code and see what is needed to be in this class and add them as todo. We can add methods as needed base. For example, we don't need to add search method here because search is a method of Index struct which can be mocked without this class.

@heemin32 heemin32 force-pushed the binary branch 2 times, most recently from d2e484f to 6345901 Compare June 24, 2024 19:49
@heemin32 heemin32 requested a review from shatejas June 24, 2024 22:11
Signed-off-by: Heemin Kim <heemin@amazon.com>
}
}

IndexService::IndexService(std::unique_ptr<FaissMethods> faissMethods) : faissMethods(std::move(faissMethods)) {}
Copy link
Collaborator

@navneet1v navneet1v Jun 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The std::move is an interesting choice. But what my thinking is why we are using std::move here rather than just using a pointer or do a call by reference if we are concerned about copying of objects.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Call by reference increase a chance of memory leak by bug.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as this is an object, and you used the unique pointer, I doubt its the case. But if we 100% sure that this will not cause any performance issues and other memory leak I am okay to approve the code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This std::move is passing its ownership. If you don't use it, the ownership won't be transferred to IndexService and the memory can be released while IndexService sill needs it.

@heemin32 heemin32 requested a review from navneet1v June 25, 2024 15:49
@heemin32 heemin32 merged commit fe14b21 into opensearch-project:feature/binary-format Jun 25, 2024
47 of 53 checks passed
@heemin32 heemin32 deleted the binary branch June 26, 2024 03:47
heemin32 added a commit that referenced this pull request Jul 2, 2024
Signed-off-by: Heemin Kim <heemin@amazon.com>
heemin32 added a commit that referenced this pull request Jul 2, 2024
Signed-off-by: Heemin Kim <heemin@amazon.com>
heemin32 added a commit that referenced this pull request Jul 3, 2024
* Add jni interface to use a binary hnsw index with faiss (#1747)

Signed-off-by: Heemin Kim <heemin@amazon.com>

* Fix memory leak on test code (#1776)

Signed-off-by: Heemin Kim <heemin@amazon.com>

---------

Signed-off-by: Heemin Kim <heemin@amazon.com>
luyuncheng pushed a commit to luyuncheng/k-NN-1 that referenced this pull request Jul 7, 2024
…roject#1778)

* Add jni interface to use a binary hnsw index with faiss (opensearch-project#1747)

Signed-off-by: Heemin Kim <heemin@amazon.com>

* Fix memory leak on test code (opensearch-project#1776)

Signed-off-by: Heemin Kim <heemin@amazon.com>

---------

Signed-off-by: Heemin Kim <heemin@amazon.com>
luyuncheng pushed a commit to luyuncheng/k-NN-1 that referenced this pull request Jul 7, 2024
…roject#1778)

* Add jni interface to use a binary hnsw index with faiss (opensearch-project#1747)

Signed-off-by: Heemin Kim <heemin@amazon.com>

* Fix memory leak on test code (opensearch-project#1776)

Signed-off-by: Heemin Kim <heemin@amazon.com>

---------

Signed-off-by: Heemin Kim <heemin@amazon.com>
junqiu-lei pushed a commit to junqiu-lei/k-NN that referenced this pull request Jul 8, 2024
junqiu-lei pushed a commit to junqiu-lei/k-NN that referenced this pull request Jul 9, 2024
heemin32 added a commit to heemin32/k-NN that referenced this pull request Jul 11, 2024
…roject#1778)

* Add jni interface to use a binary hnsw index with faiss (opensearch-project#1747)

Signed-off-by: Heemin Kim <heemin@amazon.com>

* Fix memory leak on test code (opensearch-project#1776)

Signed-off-by: Heemin Kim <heemin@amazon.com>

---------

Signed-off-by: Heemin Kim <heemin@amazon.com>
heemin32 added a commit that referenced this pull request Jul 11, 2024
* Add jni interface to use a binary hnsw index with faiss (#1747)



* Fix memory leak on test code (#1776)



---------

Signed-off-by: Heemin Kim <heemin@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants