No empty droplets error when genes manually appended #116

Closed
saxovocal opened this issue Oct 21, 2021 · 2 comments

@saxovocal

I am trying to use CellBender to estimate ambient RNA, in particular to estimate ambient viral RNA.

The viral UMIs are calculated separately by the Viral-Track algorithm, so I have proceeded to append the viral UMIs to the raw_matrix output of CellRanger manually, via rhdf5.
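
A CellRanger v3 `.h5` file stores the count matrix in compressed sparse column form (features as rows, barcodes as columns), so appending gene rows by hand means keeping `data`, `indices`, `indptr`, and `shape` consistent with each other. The following is a minimal Python sketch of a consistency check on those four arrays. The helper name `check_csc` is hypothetical, and reading the arrays out of the file is assumed to have been done already.

```python
import numpy as np

def check_csc(data, indices, indptr, shape):
    """Return a list of problems found in a CSC-encoded count matrix.

    Hypothetical helper: shape is (n_features, n_barcodes), as in the
    CellRanger v3 layout. An empty list means no problem was detected.
    """
    n_rows, n_cols = int(shape[0]), int(shape[1])
    data, indices, indptr = (np.asarray(a) for a in (data, indices, indptr))
    problems = []
    # One pointer per barcode column, plus a final end pointer.
    if len(indptr) != n_cols + 1:
        problems.append("indptr length != n_cols + 1")
    # Every stored count needs exactly one row (feature) index.
    if len(data) != len(indices):
        problems.append("data and indices differ in length")
    # The pointers must cover the whole data array, starting at zero.
    if len(indptr) and (indptr[0] != 0 or indptr[-1] != len(data)):
        problems.append("indptr does not span data")
    if np.any(np.diff(indptr) < 0):
        problems.append("indptr is not non-decreasing")
    # Row indices must refer to features that exist according to shape.
    if len(indices) and (indices.min() < 0 or indices.max() >= n_rows):
        problems.append("row index out of range")
    return problems
```

If new entries are simply concatenated onto the end of `data` and `indices` without updating `indptr` and `shape`, this check should report at least one problem.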

The original CellRanger output runs fine, as follows:

cellbender:remove-background: Command:
cellbender remove-background --input dir/raw_feature_bc_matrix.h5 --output dir/D3_cellbendercleaned_output.h5 --expected-cells 21000 --total-droplets-included 30000 --fpr 0.01 --epochs 150
cellbender:remove-background: 2021-10-21 17:51:11
cellbender:remove-background: Running remove-background
cellbender:remove-background: Loading data from file dir/raw_feature_bc_matrix.h5
cellbender:remove-background: CellRanger v3 format
cellbender:remove-background: Trimming dataset for inference.
cellbender:remove-background: Including 27487 genes that have nonzero counts.
cellbender:remove-background: Prior on counts in empty droplets is 17
cellbender:remove-background: Prior on counts for cells is 6482
cellbender:remove-background: Excluding barcodes with counts below 15
cellbender:remove-background: Using 21000 probable cell barcodes, plus an additional 9000 barcodes, and 145385 empty droplets.
cellbender:remove-background: Largest surely-empty droplet has 387 UMI counts.
cellbender:remove-background: Running inference...

However, when I run CellBender on the .h5 file that has the viral genes appended, the error below is raised. I have tried lowering --low-count-threshold to 1, but the error persists.

cellbender remove-background --input dir/D3_bamextracted_viradd.h5 --output dir/D3_bamextracted_viradd_output.h5 --expected-cells 26000 --total-droplets-included 35000 --fpr 0.01 --epochs 150 --low-count-threshold 1
cellbender:remove-background: 2021-10-21 17:38:25
cellbender:remove-background: Running remove-background
cellbender:remove-background: Loading data from file dir/D3_bamextracted_viradd.h5
cellbender:remove-background: CellRanger v3 format
cellbender:remove-background: Trimming dataset for inference.
cellbender:remove-background: Including 27590 genes that have nonzero counts.
cellbender:remove-background: Prior on counts in empty droplets is 40133
cellbender:remove-background: Prior on counts for cells is 12386
cellbender:remove-background: Excluding barcodes with counts below 20066
Traceback (most recent call last):
  File "/home/wharton2/.conda/envs/CellBender/bin/cellbender", line 33, in <module>
    sys.exit(load_entry_point('cellbender', 'console_scripts', 'cellbender')())
  File "/media/wharton2/2020HDD1/Software/CellBender/cellbender/base_cli.py", line 101, in main
    cli_dict[args.tool].run(args)
  File "/media/wharton2/2020HDD1/Software/CellBender/cellbender/remove_background/cli.py", line 109, in run
    main(args)
  File "/media/wharton2/2020HDD1/Software/CellBender/cellbender/remove_background/cli.py", line 204, in main
    run_remove_background(args)
  File "/media/wharton2/2020HDD1/Software/CellBender/cellbender/remove_background/cli.py", line 159, in run_remove_background
    fpr=args.fpr)
  File "/media/wharton2/2020HDD1/Software/CellBender/cellbender/remove_background/data/dataset.py", line 101, in __init__
    gene_blacklist=gene_blacklist)
  File "/media/wharton2/2020HDD1/Software/CellBender/cellbender/remove_background/data/dataset.py", line 281, in _trim_dataset_for_analysis
    f"There are no empty droplets with UMI counts over the lower " \
AssertionError: There are no empty droplets with UMI counts over the lower cutoff of 20066.  Some empty droplets are necessary for the analysis.  Reduce the --low-count-threshold parameter.

Any idea how to fix this? I have also attached the CellRanger output plot.

[Attached image: newplot]

Thanks a lot for your help!

@sjfleming

Hi @saxovocal ,

I can see what is going wrong, but I am not yet sure why it's happening. In your first run, you see this in the log:

cellbender:remove-background: Prior on counts in empty droplets is 17
cellbender:remove-background: Prior on counts for cells is 6482

...this seems about right based on that UMI curve you have above.

But when you run with the viral UMIs added, you see this in the log:

cellbender:remove-background: Prior on counts in empty droplets is 40133
cellbender:remove-background: Prior on counts for cells is 12386

So CellBender thinks something really strange is going on. It could be a CellBender problem, but I wonder if something went wrong when you appended the viral UMIs to the dataset.
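
The cutoffs reported in the two logs are consistent with the cutoff being the larger of `--low-count-threshold` and half of the empty-droplet prior. That rule is inferred from the numbers in this thread, not taken from the CellBender source, so the sketch below is only a sanity check under that assumption. The helper `inferred_cutoff` and its `fraction` parameter are hypothetical, and the value 15 for the first run is read from its log line rather than from the command.

```python
def inferred_cutoff(low_count_threshold, empty_prior, fraction=0.5):
    """Assumed rule: cutoff = max(threshold, fraction * empty prior).

    This is a guess that happens to reproduce both logs in this thread;
    it is not confirmed against CellBender's implementation.
    """
    return max(low_count_threshold, int(empty_prior * fraction))

# First run: empty prior 17, log reports a cutoff of 15.
first_run = inferred_cutoff(15, 17)
# Second run: --low-count-threshold 1, empty prior 40133.
second_run = inferred_cutoff(1, 40133)
print(first_run, second_run)
```

Under this assumption, lowering `--low-count-threshold` cannot help in the second run, because the cutoff is driven by the implausibly large empty-droplet prior rather than by the flag.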

Can you make your own UMI curve plot both before and after appending the viral UMIs to the count matrix? I am suspicious that maybe the UMI curve after appending the viral UMIs has changed in a very unexpected way.
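
One way to build such a UMI curve is to sum the counts in each barcode column and sort the totals in descending order. The following is a minimal sketch that works directly on the `data` and `indptr` arrays of a compressed sparse column matrix; the function name `umi_curve` is hypothetical, and loading the arrays from the two `.h5` files is assumed to happen elsewhere.

```python
import numpy as np

def umi_curve(data, indptr):
    """Total UMI counts per barcode, sorted from largest to smallest.

    data and indptr are the CSC arrays of a features-by-barcodes matrix.
    A cumulative sum is used so that barcodes with no entries yield 0.
    """
    data = np.asarray(data)
    indptr = np.asarray(indptr)
    csum = np.concatenate(([0], np.cumsum(data)))
    per_barcode = csum[indptr[1:]] - csum[indptr[:-1]]
    return np.sort(per_barcode)[::-1]
```

Computing the curve for the matrix before and after appending, and plotting both against barcode rank on log-log axes, should show whether the appended counts shifted the whole curve or only a subset of barcodes.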

@sjfleming

Closed by #238

I think this kind of thing will be fixed in v0.3.0.

sjfleming self-assigned this on Aug 8, 2023
sjfleming added the "user question" label (User question about a specific dataset) on Aug 8, 2023