How can I request cellbender to limit/use more threads/cores? #160

Closed
angelasanzo opened this issue Oct 11, 2022 · 8 comments

@angelasanzo
angelasanzo commented Oct 11, 2022

Hi there,

I have tried to find a way to limit or increase the number of threads that CellBender uses, but have not found one. Is there an argument that can be provided for this?

Thank you very much.

Angela

@angelasanzo angelasanzo changed the title How can I request cellbender to use more threads/cores? How can I request cellbender to limit/use more threads/cores? Oct 11, 2022
@sjfleming
Member

Hi @angelasanzo, unfortunately, I don't know of a good way to make this happen. CellBender does not currently have an input argument for this.

(If anybody else knows how to do this, please post here!)
It looks like there might be some ways to get PyTorch to run using multiple threads on CPU. Unfortunately, I have never tried this and don't know the benefits or limitations.

I'm guessing you are running on a CPU? Is that right?

@sjfleming
Member

Note to self: it is possible this is as simple as

torch.set_num_threads(n)

Test this.
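
One way to test this is to time a fixed CPU workload at several thread counts. The sketch below is only an illustration: `time_workload` and `benchmark_torch_matmul` are hypothetical helper names, and a dense matrix multiply is an assumed stand-in workload that may not be representative of CellBender's own computation.

```python
import time


def time_workload(workload, set_threads, thread_counts, repeats=3):
    """Return {n_threads: best wall-clock seconds} for a fixed workload.

    workload:    zero-argument callable to time
    set_threads: callable that applies a thread count, e.g. torch.set_num_threads
    """
    results = {}
    for n in thread_counts:
        set_threads(n)
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            workload()
            best = min(best, time.perf_counter() - start)
        results[n] = best
    return results


def benchmark_torch_matmul(thread_counts=(1, 2, 4), size=1000):
    """Example use with PyTorch; requires torch to be installed."""
    import torch  # imported here so the harness above has no hard dependency

    x = torch.randn(size, size)
    return time_workload(lambda: x @ x, torch.set_num_threads, thread_counts)
```

Calling `benchmark_torch_matmul()` returns a dictionary of timings keyed by thread count. Similar timings across thread counts would suggest that the setting has little effect for that workload.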

@sjfleming sjfleming self-assigned this Oct 13, 2022
@sjfleming sjfleming added the enhancement New feature or improvement label Oct 13, 2022
@sjfleming
Member

@angelasanzo
Author

angelasanzo commented Oct 19, 2022

Thanks a lot for your work! Yes, we are running on CPU, as our graphics card does not support CUDA.

@sjfleming
Member

Unfortunately, I tested

import psutil
import torch

n_jobs = psutil.cpu_count(logical=False)  # get physical cores
if n_jobs is None:
    n_jobs = psutil.cpu_count(logical=True)  # if undetermined, use logical cores instead
torch.set_num_threads(n_jobs)  # number of threads for intra-op parallelism on CPU

and it doesn't make any difference in terms of runtime. So it's not that simple...

Will keep thinking...
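
When a thread setting appears to make no difference, one thing worth checking is what the library was already using before the call. The sketch below is a hedged illustration, not part of CellBender: `thread_report` is a hypothetical helper, and the getter and setter are passed in as callables only so that the example runs without torch installed.

```python
import os


def thread_report(get_threads, set_threads, requested):
    """Apply a thread count and report the value before and after.

    With PyTorch, this could be called as:
        thread_report(torch.get_num_threads, torch.set_num_threads, 8)
    """
    before = get_threads()
    set_threads(requested)
    after = get_threads()
    return {
        "logical_cores": os.cpu_count(),  # may be None if undetermined
        "before": before,
        "requested": requested,
        "after": after,
        "changed": before != after,
    }
```

If `changed` is `False`, the requested value was already in effect, in which case no change in runtime would be expected from the call.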

@sjfleming
Member

Eh, okay, it's possible that using the number of logical cores is better. But I'm only seeing a speedup of like 4% :(

Still, I might include it as an input argument in the future. Default will use the number of logical cores.

@sjfleming sjfleming mentioned this issue Mar 28, 2023
@sjfleming sjfleming mentioned this issue Aug 6, 2023
@sjfleming
Member

Closed by #238
The input argument is --cpu-threads. Unfortunately, limited testing suggests that it does not make a big difference.
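
For readers curious how such an option might be wired up, here is a minimal sketch of a `--cpu-threads` argument that falls back to the number of logical cores when it is not given. This is an assumption-laden illustration and not CellBender's actual implementation (see #238 for that): `build_parser`, `resolve_cpu_threads`, and the `example-tool` program name are hypothetical.

```python
import argparse
import os


def build_parser():
    """Parser with a single optional thread-count argument (hypothetical)."""
    parser = argparse.ArgumentParser(prog="example-tool")
    parser.add_argument(
        "--cpu-threads",
        type=int,
        default=None,
        help="Number of CPU threads; defaults to the number of logical cores.",
    )
    return parser


def resolve_cpu_threads(requested):
    """Return the requested count, or the logical core count if None."""
    if requested is None:
        # os.cpu_count() counts logical cores and may return None
        return os.cpu_count() or 1
    if requested < 1:
        raise ValueError("--cpu-threads must be a positive integer")
    return requested


# Parse an explicit argument list so the example does not read sys.argv
args = build_parser().parse_args(["--cpu-threads", "4"])
n_threads = resolve_cpu_threads(args.cpu_threads)
# n_threads could then be passed to torch.set_num_threads(n_threads)
```

Resolving the default in a separate function keeps "not specified" (`None`) distinct from any explicit value, so an invalid explicit value can be rejected rather than silently replaced.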

@BradBalderson
BradBalderson commented Dec 4, 2023

Hey @sjfleming,

Maybe off topic, but it seems that --cpu-threads is causing a stall when writing the output. I ran cellbender like this:

cellbender remove-background --cuda --estimator-multiple-cpu --cpu-threads 21 --epochs 40 --checkpoint-mins 60 --input {sample_input_counts} --output {sample_out_dir}

It gives me this output:
cellbender:remove-background: Command:
cellbender remove-background --cuda --estimator-multiple-cpu --cpu-threads 21 --epochs 40 --checkpoint-mins 60 --input /home/jovyan/data4/UCSD_Oxy-multiome/Oxy-multiome//FTL_702_M957/outs/raw_feature_bc_matrix.h5 --output /home/jovyan/data4/bbalderson_runAI/rat_multiomics/data/cellbender_out//FTL_702_M957/
cellbender:remove-background: CellBender 0.3.0
cellbender:remove-background: (Workflow hash 8a1259eac2)
cellbender:remove-background: 2023-12-04 17:46:12
cellbender:remove-background: Running remove-background
cellbender:remove-background: Loading data from /home/jovyan/data4/UCSD_Oxy-multiome/Oxy-multiome//FTL_702_M957/outs/raw_feature_bc_matrix.h5
cellbender:remove-background: CellRanger v3 format
cellbender:remove-background: Features in dataset: 33294 Gene Expression, 96665 Peaks
cellbender:remove-background: Trimming features for inference.
cellbender:remove-background: 122701 features have nonzero counts.
cellbender:remove-background: Prior on counts for cells is 4609
cellbender:remove-background: Prior on counts for empty droplets is 72
cellbender:remove-background: Excluding 11105 features that are estimated to have <= 0.1 background counts in cells.
cellbender:remove-background: Including 111596 features in the analysis.
cellbender:remove-background: Trimming barcodes for inference.
cellbender:remove-background: Excluding barcodes with counts below 36
cellbender:remove-background: Using 10437 probable cell barcodes, plus an additional 24850 barcodes, and 49210 empty droplets.
cellbender:remove-background: Largest surely-empty droplet has 126 UMI counts.
cellbender:remove-background: Attempting to unpack tarball "ckpt.tar.gz" to /tmp/tmpjiir3c4e
cellbender:remove-background: Successfully unpacked tarball to /tmp/tmpjiir3c4e
/tmp/tmpjiir3c4e/8a1259eac2_params.pyro
/tmp/tmpjiir3c4e/8a1259eac2_random.cuda
/tmp/tmpjiir3c4e/8a1259eac2_optim.torch
/tmp/tmpjiir3c4e/posterior.h5
/tmp/tmpjiir3c4e/8a1259eac2_args.npy
/tmp/tmpjiir3c4e/8a1259eac2_optim.pyro
/tmp/tmpjiir3c4e/8a1259eac2_model.torch
/tmp/tmpjiir3c4e/8a1259eac2_train.loaderstate
/tmp/tmpjiir3c4e/8a1259eac2_test.loaderstate
/tmp/tmpjiir3c4e/8a1259eac2_random.pyro
cellbender:remove-background: Loaded partially-trained checkpoint from ckpt.tar.gz
cellbender:remove-background: Checkpoint loaded successfully.
cellbender:remove-background: Running inference...
cellbender:remove-background: 2023-12-04 17:47:38
cellbender:remove-background: Inference procedure complete.
cellbender:remove-background: Attempting to unpack tarball "ckpt.tar.gz" to /tmp/tmpheuaf78f
cellbender:remove-background: Successfully unpacked tarball to /tmp/tmpheuaf78f
/tmp/tmpheuaf78f/8a1259eac2_params.pyro
/tmp/tmpheuaf78f/8a1259eac2_random.cuda
/tmp/tmpheuaf78f/8a1259eac2_optim.torch
/tmp/tmpheuaf78f/posterior.h5
/tmp/tmpheuaf78f/8a1259eac2_args.npy
/tmp/tmpheuaf78f/8a1259eac2_optim.pyro
/tmp/tmpheuaf78f/8a1259eac2_model.torch
/tmp/tmpheuaf78f/8a1259eac2_train.loaderstate
/tmp/tmpheuaf78f/8a1259eac2_test.loaderstate
/tmp/tmpheuaf78f/8a1259eac2_random.pyro
cellbender:remove-background: Loaded pre-computed posterior from posterior.h5
cellbender:remove-background: 2023-12-04 17:48:53

cellbender:remove-background: Saved summary plots as /home/jovyan/data4/bbalderson_runAI/rat_multiomics/data/cellbender_out//FTL_702_M957/.pdf
cellbender:remove-background: Saved cell barcodes in /home/jovyan/data4/bbalderson_runAI/rat_multiomics/data/cellbender_out//FTL_702_M957/_cell_barcodes.csv
cellbender:remove-background: Computing target noise counts per gene for MCKP estimator
cellbender:remove-background: Using MCKP noise targets computed for FPR 0.01
cellbender:remove-background: Computing denoised counts using mckp estimator
cellbender:remove-background: Dividing dataset into chunks of genes

It seems to stall here: 21 threads are started, but they are using 0% CPU, and no output is written beyond the .pdf and the _cell_barcodes.csv. I am about to try running without setting the number of threads to see if that helps.
