Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CAGRA Python API #35

Merged
merged 48 commits into from
Mar 4, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
48 commits
Select commit Hold shift + click to select a range
bb1d078
Removing Python from build to get CI passing (#31)
cjnolet Feb 6, 2024
fb08679
Fixing googletests (#32)
cjnolet Feb 7, 2024
a641797
FEA First commit on top of 24.04
dantegd Feb 16, 2024
a843e6b
FIX small fixes
dantegd Feb 16, 2024
e2cca0b
Multiple Cython and CMake improvements, fixes and code cleanup
dantegd Feb 21, 2024
667cef5
Multiple small fixes and improvements
dantegd Feb 23, 2024
a8de38f
FIX commented code
dantegd Feb 23, 2024
7be7ec5
ENH fixes and enhancements from live review
dantegd Feb 27, 2024
d6929a9
Merge branch-24.04
dantegd Feb 27, 2024
3fbd19d
FIX style fixes
dantegd Feb 27, 2024
244350e
FIX typo in parameter
dantegd Feb 27, 2024
3136d55
Multiple Cython, build and CI improvements and fixes
dantegd Mar 1, 2024
2b3500c
ENH Re-enable wheels CI
dantegd Mar 1, 2024
e11cfb4
FIX style fixes
dantegd Mar 1, 2024
bda4f82
FIX scikit-build-core and other small wheel dependencies that I forgo…
dantegd Mar 1, 2024
7a966b1
Merge branch 'branch-24.04' into 2404-python-api
dantegd Mar 1, 2024
112d084
Update test_python.sh for wrongly commented line
dantegd Mar 1, 2024
f6cb1a8
FIX update versions of rmm and pylibraft
dantegd Mar 1, 2024
8fcbf21
Merge branch '2404-python-api' of github.com:dantegd/cuvs into 2404-p…
dantegd Mar 1, 2024
415d70c
FIX add pip nvidia nightly index:
dantegd Mar 1, 2024
42e8284
DBG more index fixes for wheel builds
dantegd Mar 1, 2024
6208987
DBG large updates to dependencies.yaml and cuda 12.2 updates
dantegd Mar 1, 2024
efdaf58
DBG more changes to dep file to make wheels happy
dantegd Mar 1, 2024
bb08738
FIX rdfg fixes
dantegd Mar 1, 2024
dd8e0b0
FIX rdfg fixes
dantegd Mar 1, 2024
bcccc6f
FIX rdfg fixes
dantegd Mar 1, 2024
3c4ef52
FIX wheel CI fixes
dantegd Mar 1, 2024
95a6715
FIX remove conditional in wheel CI
dantegd Mar 1, 2024
27774df
FIX undo prior changes to dependencies and update it now that wheel p…
dantegd Mar 1, 2024
fca0d2e
FIX style fixes
dantegd Mar 1, 2024
afc9b94
FIX simplify matrix of build in deps
dantegd Mar 1, 2024
109e09d
DBG deps debugging
dantegd Mar 1, 2024
d1c06d6
FIX pre-commit hook was pointing to old version of rfdg
dantegd Mar 2, 2024
41be870
FIX CI fixes
dantegd Mar 2, 2024
d2f0ffb
FIX CI fixes
dantegd Mar 2, 2024
835078f
FIX passing c api flag from python to build it in wheel builds
dantegd Mar 2, 2024
940370d
FIX Install libcuvs_c.so inside the wheel
dantegd Mar 2, 2024
95002e1
FIX RAFT C++ static build for wheels
dantegd Mar 3, 2024
56d5455
FIX cuvs variable name
dantegd Mar 3, 2024
f504af3
FIX small fix for conda CI to upload artifacts and wheel cmake fixes
dantegd Mar 3, 2024
3f08f1e
FIX libcuvs_c linked libraries for wheel static build
dantegd Mar 3, 2024
e2d5b8e
FIX use static raft target for wheels
dantegd Mar 3, 2024
198f462
FIX use static raft target for wheels in c api
dantegd Mar 3, 2024
53a19b3
FIX remove static raft target for wheels in c api
dantegd Mar 3, 2024
09f11c6
FIX cython install path of libcuvs and libcuvs_c
dantegd Mar 3, 2024
2581814
FIX cython install path of libcuvs and libcuvs_c
dantegd Mar 3, 2024
9824dff
FIX rpath of libcuvs targets
dantegd Mar 3, 2024
ff29ec7
FIX install correct cupy in cuda 12 wheel tests
dantegd Mar 3, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .devcontainer/cuda11.8-conda/devcontainer.json
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
"args": {
"CUDA": "11.8",
"PYTHON_PACKAGE_MANAGER": "conda",
"BASE": "rapidsai/devcontainers:24.02-cpp-llvm16-cuda11.8-mambaforge-ubuntu22.04"
"BASE": "rapidsai/devcontainers:24.04-cpp-llvm16-cuda11.8-mambaforge-ubuntu22.04"
}
},
"hostRequirements": {"gpu": "optional"},
Expand Down
2 changes: 1 addition & 1 deletion .devcontainer/cuda11.8-pip/devcontainer.json
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
"args": {
"CUDA": "11.8",
"PYTHON_PACKAGE_MANAGER": "pip",
"BASE": "rapidsai/devcontainers:24.02-cpp-llvm16-cuda11.8-ubuntu22.04"
"BASE": "rapidsai/devcontainers:24.04-cpp-llvm16-cuda11.8-ubuntu22.04"
}
},
"hostRequirements": {"gpu": "optional"},
Expand Down
2 changes: 1 addition & 1 deletion .devcontainer/cuda12.0-conda/devcontainer.json
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
"args": {
"CUDA": "12.0",
"PYTHON_PACKAGE_MANAGER": "conda",
"BASE": "rapidsai/devcontainers:24.02-cpp-mambaforge-ubuntu22.04"
"BASE": "rapidsai/devcontainers:24.04-cpp-mambaforge-ubuntu22.04"
}
},
"hostRequirements": {"gpu": "optional"},
Expand Down
2 changes: 1 addition & 1 deletion .devcontainer/cuda12.0-pip/devcontainer.json
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
"args": {
"CUDA": "12.0",
"PYTHON_PACKAGE_MANAGER": "pip",
"BASE": "rapidsai/devcontainers:24.02-cpp-llvm16-cuda12.0-ubuntu22.04"
"BASE": "rapidsai/devcontainers:24.04-cpp-llvm16-cuda12.0-ubuntu22.04"
}
},
"hostRequirements": {"gpu": "optional"},
Expand Down
12 changes: 6 additions & 6 deletions .github/workflows/build.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ concurrency:
jobs:
cpp-build:
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-build.yaml@branch-24.02
uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-build.yaml@branch-24.04
with:
build_type: ${{ inputs.build_type || 'branch' }}
branch: ${{ inputs.branch }}
Expand All @@ -37,7 +37,7 @@ jobs:
python-build:
needs: [cpp-build]
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/conda-python-build.yaml@branch-24.02
uses: rapidsai/shared-workflows/.github/workflows/conda-python-build.yaml@branch-24.04
with:
build_type: ${{ inputs.build_type || 'branch' }}
branch: ${{ inputs.branch }}
Expand All @@ -46,7 +46,7 @@ jobs:
upload-conda:
needs: [cpp-build, python-build]
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/conda-upload-packages.yaml@branch-24.02
uses: rapidsai/shared-workflows/.github/workflows/conda-upload-packages.yaml@branch-24.04
with:
build_type: ${{ inputs.build_type || 'branch' }}
branch: ${{ inputs.branch }}
Expand All @@ -57,7 +57,7 @@ jobs:
if: github.ref_type == 'branch'
needs: python-build
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/custom-job.yaml@branch-24.02
uses: rapidsai/shared-workflows/.github/workflows/custom-job.yaml@branch-24.04
with:
arch: "amd64"
branch: ${{ inputs.branch }}
Expand All @@ -69,7 +69,7 @@ jobs:
sha: ${{ inputs.sha }}
wheel-build-cuvs:
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-24.02
uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-24.04
with:
build_type: ${{ inputs.build_type || 'branch' }}
branch: ${{ inputs.branch }}
Expand All @@ -79,7 +79,7 @@ jobs:
wheel-publish-cuvs:
needs: wheel-build-cuvs
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@branch-24.02
uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@branch-24.04
with:
build_type: ${{ inputs.build_type || 'branch' }}
branch: ${{ inputs.branch }}
Expand Down
22 changes: 11 additions & 11 deletions .github/workflows/pr.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -23,49 +23,49 @@ jobs:
- wheel-tests-cuvs
- devcontainer
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/pr-builder.yaml@branch-24.02
uses: rapidsai/shared-workflows/.github/workflows/pr-builder.yaml@branch-24.04
checks:
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/checks.yaml@branch-24.02
uses: rapidsai/shared-workflows/.github/workflows/checks.yaml@branch-24.04
with:
enable_check_generated_files: false
conda-cpp-build:
needs: checks
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-build.yaml@branch-24.02
uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-build.yaml@branch-24.04
with:
build_type: pull-request
node_type: cpu16
conda-cpp-tests:
needs: conda-cpp-build
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-tests.yaml@branch-24.02
uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-tests.yaml@branch-24.04
with:
build_type: pull-request
conda-cpp-checks:
needs: conda-cpp-build
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-post-build-checks.yaml@branch-24.02
uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-post-build-checks.yaml@branch-24.04
with:
build_type: pull-request
enable_check_symbols: true
symbol_exclusions: (void (thrust::|cub::)|_ZN\d+raft_cutlass)
conda-python-build:
needs: conda-cpp-build
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/conda-python-build.yaml@branch-24.02
uses: rapidsai/shared-workflows/.github/workflows/conda-python-build.yaml@branch-24.04
with:
build_type: pull-request
conda-python-tests:
needs: conda-python-build
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/conda-python-tests.yaml@branch-24.02
uses: rapidsai/shared-workflows/.github/workflows/conda-python-tests.yaml@branch-24.04
with:
build_type: pull-request
docs-build:
needs: conda-python-build
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/custom-job.yaml@branch-24.02
uses: rapidsai/shared-workflows/.github/workflows/custom-job.yaml@branch-24.04
with:
build_type: pull-request
node_type: "gpu-v100-latest-1"
Expand All @@ -75,20 +75,20 @@ jobs:
wheel-build-cuvs:
needs: checks
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-24.02
uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-24.04
with:
build_type: pull-request
script: ci/build_wheel_cuvs.sh
wheel-tests-cuvs:
needs: wheel-build-cuvs
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/wheels-test.yaml@branch-24.02
uses: rapidsai/shared-workflows/.github/workflows/wheels-test.yaml@branch-24.04
with:
build_type: pull-request
script: ci/test_wheel_cuvs.sh
devcontainer:
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/build-in-devcontainer.yaml@branch-24.02
uses: rapidsai/shared-workflows/.github/workflows/build-in-devcontainer.yaml@branch-24.04
with:
build_command: |
sccache -z;
Expand Down
8 changes: 4 additions & 4 deletions .github/workflows/test.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ on:
jobs:
conda-cpp-checks:
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-post-build-checks.yaml@branch-24.02
uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-post-build-checks.yaml@branch-24.04
with:
build_type: nightly
branch: ${{ inputs.branch }}
Expand All @@ -26,23 +26,23 @@ jobs:
symbol_exclusions: (void (thrust::|cub::)|_ZN\d+raft_cutlass)
conda-cpp-tests:
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-tests.yaml@branch-24.02
uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-tests.yaml@branch-24.04
with:
build_type: nightly
branch: ${{ inputs.branch }}
date: ${{ inputs.date }}
sha: ${{ inputs.sha }}
conda-python-tests:
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/conda-python-tests.yaml@branch-24.02
uses: rapidsai/shared-workflows/.github/workflows/conda-python-tests.yaml@branch-24.04
with:
build_type: nightly
branch: ${{ inputs.branch }}
date: ${{ inputs.date }}
sha: ${{ inputs.sha }}
wheel-tests-cuvs:
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/wheels-test.yaml@branch-24.02
uses: rapidsai/shared-workflows/.github/workflows/wheels-test.yaml@branch-24.04
with:
build_type: nightly
branch: ${{ inputs.branch }}
Expand Down
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -99,7 +99,7 @@ repos:
args: ["--toml", "pyproject.toml"]
exclude: (?x)^(^CHANGELOG.md$)
- repo: https://github.com/rapidsai/dependency-file-generator
rev: v1.5.1
rev: v1.8.0
hooks:
- id: rapids-dependency-file-generator
args: ["--clean"]
Expand Down
150 changes: 141 additions & 9 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,19 +1,14 @@
# <div align="left"><img src="https://rapids.ai/assets/images/rapids_logo.png" width="90px"/>&nbsp;cuVS: Vector Search and Clustering on the GPU</div>

### NOTE: cuVS is currently being

## Contents
<hr>

1. [Useful Resources](#useful-resources)
2. [What is cuVS?](#what-is-cuvs)
3. [Getting Started](#getting-started)
4. [Installing cuVS](#installing)
3. [Installing cuVS](#installing)
4. [Getting Started](#getting-started)
5. [Contributing](#contributing)
6. [References](#references)

<hr>

## Useful Resources

- [cuVS Reference Documentation](https://docs.rapids.ai/api/cuvs/stable/): API Documentation.
Expand All @@ -26,15 +21,152 @@

## What is cuVS?

cuVS contains many algorithms for running approximate nearest neighbors and clustering on the GPU.
cuVS contains state-of-the-art implementations of several algorithms for running approximate nearest neighbors and clustering on the GPU. It can be used directly or through the various databases and other libraries that have integrated it. The primary goal of cuVS is to simplify the use of GPUs for vector similarity search and clustering.

**Please note** that cuVS is a new library mostly derived from the approximate nearest neighbors and clustering algorithms in the [RAPIDS RAFT](https://github.com/rapidsai) library of data mining primitives. RAPIDS RAFT currently contains the most fully-featured versions of the approximate nearest neighbors and clustering algorithms in cuVS. We are in the process of migrating the algorithms from RAFT to cuVS, but if you are unsure of which to use, please consider the following:
1. RAFT contains C++ and Python APIs for all of the approximate nearest neighbors and clustering algorithms.
2. cuVS contains a growing support for different languages, including C, C++, Python, and Rust. We will be adding more language support to cuVS in the future but will not be improving the language support for RAFT.
3. Once all of RAFT's approximate nearest neighbors and clustering algorithms are moved to cuVS, the RAFT APIs will be deprecated and eventually removed altogether. Once removed, RAFT will become a lightweight header-only library. In the meantime, there's no harm in using RAFT if support for additional languages is not needed.

## Installing cuVS

cuVS comes with pre-built packages that can be installed through [conda](https://conda.io/projects/conda/en/latest/user-guide/getting-started.html#managing-python). Different packages are available for the different languages supported by cuVS:

| Python | C++ | C | Rust |
|--------|-----|---|------|
| `pycuvs`| `libcuvs` | `libcuvs_c` | `cuvs-rs` |

### Stable release

It is recommended to use [mamba](https://mamba.readthedocs.io/en/latest/installation/mamba-installation.html) to install the desired packages. The following command will install the Python package. You can substitute `pycuvs` for any of the packages in the table above:
```bash
mamba install -c conda-forge -c nvidia -c rapidsai pycuvs
```

### Nightlies
If installing a version that has not yet been released, the `rapidsai` channel can be replaced with `rapidsai-nightly`:
```bash
mamba install -c conda-forge -c nvidia -c rapidsai-nightly pycuvs=24.04*
```

Please see the [Build and Install Guide](docs/source/build.md) for more information on installing cuVS and building from source.

## Getting Started

The following code snippets train an approximate nearest neighbors index for the CAGRA algorithm.

### Python API

```python
from cuvs.neighbors import cagra

dataset = load_data()
index_params = cagra.IndexParams()

index = cagra.build_index(build_params, dataset)
```

### C++ API

```c++
#include <cuvs/neighbors/cagra.hpp>

using namespace cuvs::neighbors;

raft::device_matrix_view<float> dataset = load_dataset();
raft::device_resources res;

cagra::index_params index_params;

auto index = cagra::build(res, index_params, dataset);
```

For more example of the C++ APIs, refer to [cpp/examples](https://github.com/rapidsai/cuvs/tree/HEAD/cpp/examples) directory in the codebase.

### C API

```c
#include <cuvs/neighbors/cagra.h>

cuvsResources_t res;
cuvsCagraIndexParams_t index_params;
cuvsCagraIndex_t index;

DLManagedTensor *dataset;
load_dataset(dataset);

cuvsResourcesCreate(&res);
cuvsCagraIndexParamsCreate(&index_params);
cuvsCagraIndexCreate(&index);

cuvsCagraBuild(res, index_params, dataset, index);

cuvsCagraIndexDestroy(index);
cuvsCagraIndexParamsDestroy(index_params);
cuvsResourcesDestroy(res);
```

## Installing cuVS

## Contributing

If you are interested in contributing to the cuVS library, please read our [Contributing guidelines](docs/source/contributing.md). Refer to the [Developer Guide](docs/source/developer_guide.md) for details on the developer guidelines, workflows, and principals.

## References

When citing cuVS generally, please consider referencing this Github repository.
```bibtex
@misc{rapidsai,
title={Rapidsai/cuVS: Vector Search and Clustering on the GPU.},
url={https://github.com/rapidsai/cuvs},
journal={GitHub},
publisher={Nvidia RAPIDS},
author={Rapidsai},
year={2024}
}
```

If citing CAGRA, please consider the following bibtex:
```bibtex
@misc{ootomo2023cagra,
title={CAGRA: Highly Parallel Graph Construction and Approximate Nearest Neighbor Search for GPUs},
author={Hiroyuki Ootomo and Akira Naruse and Corey Nolet and Ray Wang and Tamas Feher and Yong Wang},
year={2023},
eprint={2308.15136},
archivePrefix={arXiv},
primaryClass={cs.DS}
}
```

If citing the k-selection routines, please consider the following bibtex:
```bibtex
@proceedings{10.1145/3581784,
title = {SC '23: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis},
year = {2023},
isbn = {9798400701092},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
abstract = {Started in 1988, the SC Conference has become the annual nexus for researchers and practitioners from academia, industry and government to share information and foster collaborations to advance the state of the art in High Performance Computing (HPC), Networking, Storage, and Analysis.},
location = {, Denver, CO, USA, }
}
```

If citing the nearest neighbors descent API, please consider the following bibtex:
```bibtex
@inproceedings{10.1145/3459637.3482344,
author = {Wang, Hui and Zhao, Wan-Lei and Zeng, Xiangxiang and Yang, Jianye},
title = {Fast K-NN Graph Construction by GPU Based NN-Descent},
year = {2021},
isbn = {9781450384469},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3459637.3482344},
doi = {10.1145/3459637.3482344},
abstract = {NN-Descent is a classic k-NN graph construction approach. It is still widely employed in machine learning, computer vision, and information retrieval tasks due to its efficiency and genericness. However, the current design only works well on CPU. In this paper, NN-Descent has been redesigned to adapt to the GPU architecture. A new graph update strategy called selective update is proposed. It reduces the data exchange between GPU cores and GPU global memory significantly, which is the processing bottleneck under GPU computation architecture. This redesign leads to full exploitation of the parallelism of the GPU hardware. In the meantime, the genericness, as well as the simplicity of NN-Descent, are well-preserved. Moreover, a procedure that allows to k-NN graph to be merged efficiently on GPU is proposed. It makes the construction of high-quality k-NN graphs for out-of-GPU-memory datasets tractable. Our approach is 100-250\texttimes{} faster than the single-thread NN-Descent and is 2.5-5\texttimes{} faster than the existing GPU-based approaches as we tested on million as well as billion scale datasets.},
booktitle = {Proceedings of the 30th ACM International Conference on Information \& Knowledge Management},
pages = {1929–1938},
numpages = {10},
keywords = {high-dimensional, nn-descent, gpu, k-nearest neighbor graph},
location = {Virtual Event, Queensland, Australia},
series = {CIKM '21}
}
```
2 changes: 1 addition & 1 deletion VERSION
Original file line number Diff line number Diff line change
@@ -1 +1 @@
24.02.00
24.04.00
Loading
Loading