Hierarchical Navigable Small World Graph for fast ANN search
Enable the serde
feature to serialize and deserialize HNSW
.
A good default for M and M0 parameters is 12 and 24 respectively. According to the paper, M0 should always be double M, but you can change both of them freely.
To see how this might be used with hamming space, see tests/simple_discrete.rs
. To see how this might be used with euclidean space, see tests/simple.rs
.
Note that the euclidean implementation in the test may have some numerical errors and fail to implement the triangle inequality, especially on high dimensionality. Use a Kahan sum instead for proper usage. It also may not utilize SIMD, but using an array may help with that.
Please refer to the space
documentation for the trait and types regarding distance. It also contains special Bits128
- Bits4096
tuple structs that wrap an array of bytes and enable SIMD capability. Benchmarks provided use these SIMD impls.
Here is a recall graph that you can compare to its alternatives:
For more benchmarks and how to benchmark, see benchmarks.md
.
This is based on the paper "Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs" by Yu. A. Malkov and D. A. Yashunin. This paper builds on the original paper for NSW. There are multiple papers written by the authors on NSW, which preceeded HNSW.
For more details about parameters and details of the implementation, see implementation.md
.
This is in no way a direct copy or reimplementation of the original implementation. This was made purely based on the paper without reference to the original headers. The paper is very well written and easy to understand, with some minor exceptions. Thank you to the authors for your valuable contribution.
Please visit the Rust CV Discord.