Parallel build support #5449

Open
cjac opened this issue Aug 8, 2024 · 1 comment
Labels: type::feature request for a new feature or capability

cjac commented Aug 8, 2024

Checklist

  • I added a descriptive title
  • I searched open requests and couldn't find a duplicate

What is the idea?

Conda could build packages in parallel. After an analysis of the DAG of package dependencies, leaf nodes and their hierarchy could be built in parallel. Most of my system is idle during installation of conda packages.

[screenshot: system resource monitor showing mostly idle cores during a conda install]
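
A minimal sketch of that DAG analysis, using Python's stdlib graphlib on a hypothetical deps mapping (the package names and graph are illustrative, not conda's actual solver output): at each level, every package whose dependencies are already satisfied is a leaf that could be handled in parallel.

```python
import os
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Hypothetical dependency graph: package -> packages it depends on.
deps = {
    "numpy": set(),
    "pandas": {"numpy"},
    "dask": {"numpy", "pandas"},
    "cudatoolkit": set(),
    "rapids": {"dask", "cudatoolkit"},
}

ts = TopologicalSorter(deps)
ts.prepare()
level = 0
while ts.is_active():
    ready = list(ts.get_ready())  # leaves whose dependencies are all done
    print(f"level {level}: {ready} could be processed on up to "
          f"{os.cpu_count()} workers")
    ts.done(*ready)
    level += 1
```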

Why is this needed?

Tests for rapids [1], which include installation of cudatools, dask, pandas, and other ML tools, take a very long time and spend a good portion of the workflow blocked on a single-threaded application.

[1] GoogleCloudDataproc/initialization-actions#1219

What should happen?

The work should be broken down into a DAG and delegated to worker threads, à la make -j$(nproc).
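
A sketch of that delegation, assuming the per-package step (here a placeholder install() function, not conda's real extract/link code) is safe to run concurrently; the thread pool is sized like make -j$(nproc) and drains the DAG as dependencies finish:

```python
import os
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait
from graphlib import TopologicalSorter

def install(pkg):
    # Placeholder for the real per-package work (extract, link, post-link).
    print(f"installing {pkg}")
    return pkg

def parallel_install(deps, max_workers=os.cpu_count()):
    ts = TopologicalSorter(deps)
    ts.prepare()
    pending = set()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while ts.is_active():
            # Submit every package whose dependencies are already installed.
            for pkg in ts.get_ready():
                pending.add(pool.submit(install, pkg))
            # Wait for at least one to finish, then unlock its dependents.
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for fut in done:
                ts.done(fut.result())
```

The scheduling shape is the easy part; whether conda's extract/link/post-link steps can actually run concurrently (shared caches, file locks, scripts touching the same environment) is the real question this request raises.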

Additional Context

I appreciate the work done on parallelizing the package downloads. I've included export CONDA_FETCH_THREADS="$(nproc)" to accelerate that portion of the workflow.

cjac added the type::feature label Aug 8, 2024

cjac commented Aug 8, 2024

For the record, here is the command that's taking a while to run. I am running this on a Rocky Linux 8 base image. I can gather metrics for the Debian and Ubuntu variants as well if that would help.

time conda create -n rapids-24.06 -c rapidsai -c conda-forge -c nvidia rapids=24.06 python=3.11 cuda-version=12.4

It was using more than the 15 GB of memory available to the n1-standard-4 machine type, and during some portions of the installation CPU load was near 100% across all 4 processors, so I've increased the machine type to n1-standard-16.

This improves the performance of the GPU driver build script, which uses make -j$(nproc) to parallelize compilation of the NVIDIA kernel driver. With -j1 the build takes much longer than with -j16. I would hope the same would be true of the conda build process, but it seems to be single-threaded.
