Vector Search Datasets

This repository provides code to generate several datasets for benchmarking and evaluating similarity search on high-dimensional vectors produced by recent deep learning models. The available datasets are:

Please see the details of each dataset in the respective README files.

References

[1] Aguerrebere, C.; Bhati, I.; Hildebrand, M.; Tepper, M.; Willke, T.: Similarity search in the blink of an eye with compressed indices. In: Proceedings of the VLDB Endowment, 16, 11, 3433-3446. (2023)

[2] Aguerrebere, C.; Hildebrand, M.; Bhati, I.; Willke, T.; Tepper, M.: Locally-adaptive Quantization for Streaming Vector Search. (2024) [arxiv]

[3] Tepper, M.; Bhati, I.; Aguerrebere, C.; Hildebrand, M.; Willke, T.: LeanVec: Search your vectors faster by making them fit. arXiv preprint arXiv:2312.16335 (2024)