Add cudf, dask and dask-cudf Canvas.line benchmarks #1140

Merged: 2 commits into holoviz:master from cuda_benchmarks, Oct 21, 2022

ianthomas23
Member

This PR adds cudf, dask and dask-cudf benchmarks alongside the existing pandas Canvas.line benchmarks. The cudf and dask-cudf benchmarks run only if the required libraries are installed, and those libraries can be installed only if appropriate CUDA hardware is available.
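As a rough illustration of "only run if the required libraries are installed", the check can be sketched with the standard library alone. This is not necessarily how the benchmarks in this PR decide what to run; `has_module` and `select_backends` are hypothetical helper names used only for this example.

```python
from importlib.util import find_spec


def has_module(name):
    """Return True if a top-level module can be found without importing it."""
    return find_spec(name) is not None


def select_backends(candidates):
    """Keep only the candidate modules that are importable in this environment."""
    return [name for name in candidates if has_module(name)]


if __name__ == "__main__":
    # Top-level module names of the libraries mentioned in this PR.
    # cudf and dask_cudf are expected to be missing on machines without CUDA.
    print(select_backends(["pandas", "dask", "cudf", "dask_cudf"]))
```

On a machine without CUDA hardware the printed list would normally omit cudf and dask_cudf, which matches the behaviour described above.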

Outline of process to install required libraries:

  1. Identify the version of your graphics card driver and the CUDA version it supports, e.g. using nvidia-smi. The versions should look something like 470.141.03 and 11.5.
  2. Navigate to https://rapids.ai/start.html, scroll down to Step 1 and check that the versions you have are supported by RAPIDS.
  3. Scroll down to Step 3, enter your CUDA version and required Python version, and select the cuDF package. This produces a conda create command. The command does not include dask-cudf, so append dask-cudf to it explicitly if you wish to use it.
  4. Create the conda environment using this command. It can take quite a while to resolve dependencies and download and install the required packages.
  5. Install datashader into this environment using pip install -ve .[tests].
  6. Check that the datashader test suite passes using DATASHADER_TEST_GPU=1 pytest datashader/tests.
  7. Now you can run the benchmarking suite following the notes in benchmarks/README.md.
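Step 1 above can be scripted if you prefer not to read the versions off by eye. The sketch below assumes that the nvidia-smi header contains the labels "Driver Version:" and "CUDA Version:"; the function names and the `SAMPLE` line are illustrative only, with `SAMPLE` reusing the example versions quoted in step 1.

```python
import re
import shutil
import subprocess


def parse_nvidia_smi_header(text):
    """Return (driver_version, cuda_version) found in nvidia-smi output, or None."""
    driver = re.search(r"Driver Version:\s*([\d.]+)", text)
    cuda = re.search(r"CUDA Version:\s*([\d.]+)", text)
    if driver is None or cuda is None:
        return None
    return driver.group(1), cuda.group(1)


def detect_versions():
    """Run nvidia-smi if it is on PATH and parse its output; None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi"], capture_output=True, text=True, check=False
    )
    return parse_nvidia_smi_header(result.stdout)


# Illustrative header line (assumed layout), not captured from a real machine.
SAMPLE = "| NVIDIA-SMI 470.141.03   Driver Version: 470.141.03   CUDA Version: 11.5     |"

if __name__ == "__main__":
    print(parse_nvidia_smi_header(SAMPLE))
    print(detect_versions())
```

The two values returned are the ones to compare against the supported versions in step 2 and to enter in step 3.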

Note that the benchmark problems are too small to justify the use of dask and/or cudf. The benchmarks are not yet intended to measure the performance of the whole library; instead they check that code changes do not slow down individual algorithms. This matters in the short term because work is underway to simplify the Numba code within Datashader so that it is easier to understand and maintain, and these simplifications will only be acceptable if they do not slow down the code.
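The kind of check described here, comparing timings before and after a code change, can be sketched as follows. This is a hypothetical helper rather than part of the benchmarking suite; the benchmark names and timings are placeholder values for illustration, not measurements, and the 1.1 threshold is an arbitrary assumption.

```python
def find_slowdowns(baseline, candidate, factor=1.1):
    """Return {name: ratio} for benchmarks whose time grew by more than `factor`.

    `baseline` and `candidate` map benchmark names to timings in seconds.
    Benchmarks missing from `candidate`, or with a non-positive baseline,
    are skipped rather than reported.
    """
    slow = {}
    for name, before in baseline.items():
        after = candidate.get(name)
        if after is None or before <= 0:
            continue
        ratio = after / before
        if ratio > factor:
            slow[name] = ratio
    return slow


if __name__ == "__main__":
    # Placeholder timings in seconds, invented purely to exercise the function.
    before_change = {"line_pandas": 1.0, "line_dask": 2.0}
    after_change = {"line_pandas": 1.05, "line_dask": 3.0}
    print(find_slowdowns(before_change, after_change))
```

A simplification of the Numba code would be acceptable under this scheme only if the function reports no slowdowns for the benchmarks it touches.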

ianthomas23 merged commit 4b80d52 into holoviz:master Oct 21, 2022
ianthomas23 deleted the cuda_benchmarks branch October 21, 2022 08:43
ianthomas23 added this to the v0.14.3 milestone Nov 3, 2022