Case studies

This page lists example use cases for Faiss, each with a brief explanation. Most of the examples are Python notebooks.

Implementing an evolving IVF dataset

This script demonstrates how to add and remove elements from an IVF dataset in a rolling fashion. The key is to set the DirectMap type to Hashtable and to remove elements with an IDSelectorArray. The removal cost is then proportional to the number of elements removed rather than to the number of elements in the dataset.

demo_rolling_dataset.ipynb

Fast indexing of 2M vectors for max inner product search

This script demonstrates how to speed up a recommendation system. Conceptually, the query vectors are users and the database vectors are items to recommend. The metric used to compare them is maximum inner product, i.e. the search finds the most relevant items for each user. This use case has a real-time constraint (results should be returned in under 5 ms), and within that budget the accuracy should be as high as possible.

recommendation_2M.ipynb

Limited size clustering

This script demonstrates a k-means variant in which each cluster is additionally constrained to contain no more than a given maximum number of points.

limited_size_clustering.ipynb
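One simple way to enforce such a size cap is a greedy assignment step, sketched here in NumPy. The function name capped_kmeans and the greedy strategy are assumptions made for illustration. The notebook may solve the constrained assignment differently.

```python
import numpy as np


def capped_kmeans(x, k, max_size, n_iter=10, seed=0):
    """Greedy k-means variant: no cluster receives more than max_size points."""
    n = x.shape[0]
    assert k * max_size >= n, "total capacity must cover all points"
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(n, size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared L2 distance from every point to every centroid, shape (n, k).
        dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assign = np.full(n, -1, dtype=np.int64)
        counts = np.zeros(k, dtype=np.int64)
        # Visit (point, centroid) pairs from the closest pair to the farthest.
        for flat in np.argsort(dist, axis=None):
            i, c = divmod(int(flat), k)
            if assign[i] == -1 and counts[c] < max_size:
                assign[i] = c
                counts[c] += 1
        # Move each non-empty centroid to the mean of its assigned points.
        for c in range(k):
            if counts[c] > 0:
                centroids[c] = x[assign == c].mean(axis=0)
    return centroids, assign


rng = np.random.default_rng(1)
x = rng.standard_normal((300, 2)).astype(np.float32)
centroids, assign = capped_kmeans(x, k=5, max_size=70)
print(np.bincount(assign, minlength=5))  # cluster sizes, each at most 70
```

The greedy pass sorts all n * k point-centroid pairs, so it only suits small inputs. It also does not guarantee an optimal constrained assignment. A point whose nearest cluster is already full falls back to the closest cluster that still has room.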

Asymmetric binary search

This script demonstrates an asymmetric search use case: the query vectors are in full precision and the database vectors are compressed as binary vectors. The implementation is slow; it is mainly intended to show how much accuracy can be regained with asymmetric search.

demo_asymmetric_binary.ipynb
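The difference between symmetric and asymmetric comparison can be sketched in NumPy as follows. The sign-based binarization, the random data and the helper names are assumptions made for illustration. The notebook may use a different binarization and distance computation.

```python
import numpy as np


def binarize(x):
    """One bit per dimension: True where the component is positive."""
    return x > 0


def symmetric_search(xq, codes, k):
    """Binarize the queries too, then rank by Hamming distance."""
    q_codes = binarize(xq)
    hamming = (q_codes[:, None, :] != codes[None, :, :]).sum(axis=2)
    return np.argsort(hamming, axis=1, kind="stable")[:, :k]


def asymmetric_search(xq, codes, k):
    """Compare full-precision queries with codes decoded to +1 / -1."""
    decoded = np.where(codes, 1.0, -1.0).astype(np.float32)
    # All decoded vectors have the same norm, so ranking by L2 distance
    # is the same as ranking by decreasing dot product.
    scores = xq @ decoded.T
    return np.argsort(-scores, axis=1, kind="stable")[:, :k]


rng = np.random.default_rng(0)
d, nb, nq, k = 64, 2000, 100, 10  # illustrative sizes
xb = rng.standard_normal((nb, d)).astype(np.float32)
xq = rng.standard_normal((nq, d)).astype(np.float32)

# Exact nearest neighbor in full precision, used as ground truth.
# The per-query term |q|^2 is constant and can be dropped from the ranking.
gt = ((xb ** 2).sum(axis=1)[None, :] - 2 * xq @ xb.T).argmin(axis=1)

codes = binarize(xb)
recalls = {}
for name, search in (("symmetric", symmetric_search),
                     ("asymmetric", asymmetric_search)):
    found = search(xq, codes, k)
    # Fraction of queries whose true nearest neighbor is in the top k.
    recalls[name] = float((found == gt[:, None]).any(axis=1).mean())
    print(f"{name:10s} recall={recalls[name]:.2f}")
```

The asymmetric variant keeps the magnitude information of the query, so it can separate database codes that are tied in Hamming distance. Both functions scan all codes exhaustively, so the sketch is about ranking quality rather than speed.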