Commit
Integrates FAISS iterative builds with NativeEngines990KnnVectorsFormat (opensearch-project#1950)

* Iterative Vector Insertion (opensearch-project#1840)

  * Rebased with new version of k-NN
  * Optimized faiss insertion
  * Optimized threadCount logic
  * Removed IDEA files
  * Removed unnecessary cmake file
  * Added comments to new functions
  * Removed createIndex and fixed test cases that use it
  * Removed unused code
  * Explained zero initialization for vector transfer
  * Added locale
  * Spotless Apply
  * Account for zero documents in finished batch
  * Changed where we check for zero docs
  * Changed tip for return
  * Use unique pointers to make sure resources are released on exception
  * Moved createIndex to testUtils
  * Fixed memory management so that the underlying index is not deleted after initialized
  * Created new KNNIndexBuilder graph to make index building more modular
  * Streamlined logic in KNNIndexBuilder.
  * Cleaned up unnecessary code in KNN80DocValuesConsumer
  * Fixed memory management process
  * Added note about index initialization in faiss_index_service
  * Accounted for case where the exception happens after the indexWriter is released.
  * Delete jni/src/.idea/modules.xml
  * Delete jni/src/.idea/vcs.xml
  * Delete jni/src/.idea/workspace.xml
  * Spotless apply and free iterative index on exception
  * Undid hack for checking first document metrics
  * Removed print statements
  * Free Vector Transfer on batch ingestion
  * Undid free
  * Fixed check for transfer ready
  * Don't crash when zero vectors inserted?
  * Reverted to old insertion process?
  * Spotless apply
  * Added back createOutput
  * Removed prior createOutput
  * Test remaking vectorTransfer
  * Test restructuring of insertion
  * Fixed case where vector address is immediately discarded
  * Spotless apply
  * Split Index Builder into multiple classes
  * Fixed descriptions of functions in faiss_index_service
  * Added back copyright files
  * Removed unused builder names
  * Modified tests to work with new insertion methods
  * Track index insertions
  * Tracked insertions for binary indices
  * Added back insertIds
  * Added check for freeVectorData to see if it works with an already deleted address
  * Cleaned up logs and comments in KNNIndexBuilder
  * Restructured the logic for KNNIndexBuilder
  * Changed package name of KNNIndexBuilder
  * Changed all package names and deleted unnecessary headers
  * Fixed for loop
  * Removed createIndex methods for faiss index service
  * Fixed package to fit naming conventions
  * Changed name of index builder
  * Spotless apply
  * Added comments to NativeIndexBuilder and restructured
  * Added deletion for memoryAddress
  * Spotless apply
  * Changed naming of classes to Writer and changed package name to fit conventions
  * Changed NativeIndexInfo and NativeVectorInfo to follow builder pattern
  * Added feature to changelog
  * Added class descriptions to each NativeIndexWriter
  * Changed name to getBytesPerVector
  * Added == false instead of ! for readability
  * Fixed changelog
  * Fixed naming in docvaluesconsumer
  * SpotlessApply
  * Made it so that we don't reuse testValues and removed a foot gun
  * Removed another foot gun in getIndexInfo
  * Fixed javadoc
  * Added deletion on exception cases
  * Removed unnecessary delete (NativeIndexWriter will handle deletion of vectors on exception)
  * Added correct logger and getWriter method to NativeIndexWriter
  * Ensured memory safety on JNI layer so that Java doesn't have to wrap everything in a try catch loop.
  * Refactored NativeIndexWriter and added comments to FaissService
  * Removed free in the JNIExport since index will always be freed in writeIndex.
  * Changed getVectorTransfer back to accept VectorDataType
  * Reverted free since not guaranteed to be IDMap.
  * Added all processes in addKNNBinaryField to NativeIndexWriter.createKNNIndex
  * Fixed javadoc
  * Applied spotless
  * Added back writeFooter
  * Removed threadCount from writeIndex
  * Removed redundancies in KNN80DocValuesConsumer
  * Removed serializationMode
  * Fixed changelog
  * Fixed changelog
  * Removed double free test as we don't have to worry about that anymore
  * Accounted for HNSWSQ in index service
  * Removed delete in catch
  * Fixed faiss tests to work with writeIndex

  ---------

  Signed-off-by: Andrew Klepchick <aklepchi@amazon.com>

* Index Initialization Alloc Method (opensearch-project#1933)

  * Added methods for allocating memory before inserting vectors to a faiss index
  * Fixed logic that gets type of index
  * Removed print statement
  * Removed unnecessary iostream
  * Removed flat index
  * Fixed flat index case
  * Fixed naming
  * Properly allocate HNSWSQ storage
  * Removed print statements
  * Fixed changelog
  * Removed unnecessary lib
  * Made alloc adaptive to different code sizes

  ---------

  Signed-off-by: Andrew Klepchick <aklepchi@amazon.com>

* Integrates FAISS iterative builds with NativeEngines990KnnVectorsFormat

  Changes include reusing the same vector buffer in the JNI layer

  Signed-off-by: Tejas Shah <shatejas@amazon.com>

---------

Signed-off-by: Andrew Klepchick <aklepchi@amazon.com>
Signed-off-by: Tejas Shah <shatejas@amazon.com>
Co-authored-by: Andrew Klepchick <aklepchi@amazon.com>