-
Notifications
You must be signed in to change notification settings - Fork 140
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Memory Evaluation For different algorithm #1139
Conversation
bfc211b
to
b3e1fff
Compare
Signed-off-by: luyuncheng <luyuncheng@bytedance.com>
Update and rename faiss_memory_test.cpp to memory_test.cpp Signed-off-by: luyuncheng <luyuncheng@bytedance.com>
897fa07
to
6c43fa9
Compare
Update CMake file Signed-off-by: luyuncheng <luyuncheng@bytedance.com>
Signed-off-by: luyuncheng <luyuncheng@bytedance.com>
@luyuncheng can you add details around what is Index RES and Query RES? |
using ::testing::Return; | ||
#define GTEST_COUT std::cerr << "[ ] [ INFO ]" | ||
|
||
TEST(FaissHNSWIndexMemoryTest, BasicAssertions) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add on all the tests what we are testing and what is the expectation as comments on top of all test function.
Also, I want to understand little bit here in terms of what are the failure scenario for these tests. may be you explained it in older comments, if yes can you point me there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These test just evaluate the memory only with EngineWrapper, not unit test.
When we want to introduce a new algorithm or engine, we prefer to evaluate the performance, memory, disk size. in benchmark tests, we can evaluate the performance as a single node.
But we can not evaluate a engine/algorithm takes how much memory, because in benchmark jvm make it hard to evaluate the memory only in jni layer.
so i added these code and want to evaluate different algorithm/engine in different param, at index time, query time memory usage, and time usage.
@navneet1v |
In general, I really like having the ability to use the jni_wrapper in order to test our code with real data sets (not just random data). This has a lot of potential to help us debug memory problems as well as performance problems. That being said, I think that the memory monitoring should be done outside of the tests. Adding memory monitoring inside the test may make them difficult to work across platforms. I see the tests themselves more like JNI integration tests or end to end tests or microbenchmarks. We should remove all calls to get specific memory information from faiss and change the name from memory_test to integ_test or e2e_test or microbenchmarks. Instead, to check memory, I think that they can be run with an external monitor. For instance, I believe you used gperftools at some point. |
@@ -150,6 +150,16 @@ namespace test_util { | |||
|
|||
float RandomFloat(float min, float max); | |||
|
|||
// Read vector file formats |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add comment about the format the data is expected to be in
// Read vector file formats | ||
void load_data(char* filename, float*& data, unsigned& num, unsigned& dim); | ||
|
||
// asign data into vector |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: asign -> assign. Also, can we add more detail about how this function should be used in the comment?
@luyuncheng are you still working on this PR? |
Description
When we want to introduce a new algorithm or engine, we prefer to evaluate the performance, memory, disk size. in benchmark tests, we can evaluate the performance as a single node.
But we can not evaluate a engine/algorithm takes how much memory, because in benchmark jvm make it hard to evaluate the memory only in jni layer.
in #946 we try to assess memory usage with different algorithm, and when using benchmark/intergration tests, java heap memory and other memory usage make it hard to evaluate algorithm real memory usage.
so i write a memory tests, and only use
faiss_wrapper/nmslib_wrapper
. it can evaluate memory usage, file size.i use http://corpus-texmex.irisa.fr/ vector file format. and add
test_util::load_data
to read sift.fvecs with SIFT1M datasets.i do some tests like following report:
SIFT1M
GIST960
Usage:
go to http://corpus-texmex.irisa.fr/, and download SIFT1M dataset, and unzip into a directory like 'dataset/sift/sift_base.fvecs'
and run different tests
./bin/jni_memory_test --gtest_filter=FaissNSGQueryMemoryTest.*
Issues Resolved
#946
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.