Too Few IDs when dealing with external count matrices #422

Open
agomez700 opened this issue Jan 11, 2023 · 3 comments

@agomez700

Hi DROP team!

I am preparing to receive some fibroblast data (I have previously used DROP for blood samples). Unlike with my blood samples, I do not have fibroblast controls, so I have downloaded what I need from your public resource here: https://zenodo.org/record/6078397#.Y72BduzMLmE

However, to test that my sample annotation and config file are correct before adding my own fibroblast data (i.e. using only the external counts), I am trying to run snakemake exportCounts and aberrantExpression.
Regardless of which module I try to run, I get an error saying that I have too few IDs.

Specific error is this:
ValueError in file /lab-share/Neph-Sampson-e2/Public/AC/drop_6/DROP_fibroblasts_hg38/Snakefile, line 12:
Too few IDs in DROP_GROUP fraser, please ensure that it has at least 10 IDs, groups: []
File "/lab-share/Neph-Sampson-e2/Public/AC/drop_6/DROP_fibroblasts_hg38/Snakefile", line 12, in <module>
File "/lab-share/Neph-Sampson-e2/Public/AC/drop_6/mambaforge/envs/drop_env6/lib/python3.11/site-packages/drop/config/DropConfig.py", line 65, in __init__
File "/lab-share/Neph-Sampson-e2/Public/AC/drop_6/mambaforge/envs/drop_env6/lib/python3.11/site-packages/drop/config/submodules/AberrantSplicing.py", line 31, in __init__
File "/lab-share/Neph-Sampson-e2/Public/AC/drop_6/mambaforge/envs/drop_env6/lib/python3.11/site-packages/drop/config/submodules/Submodules.py", line 53, in checkSubset

Here is my sample annotation file:
fibroblast_sample_annotation_updated.xlsx

And here is my config file:
config.txt

and my snakefile:
Snakefile.txt

Any ideas?
If it would be better to sort this out after I add my own fibroblast samples, I can do that too. However, I am worried that even after adding them, DROP still won't handle the external count matrices correctly, since it doesn't seem to be doing so right now.

Thanks!

@vyepez88
Collaborator

Hi,
I can spot an error in your sample annotation. The GENE_ANNOTATION column should contain the key of the GTF file as defined in your config file, not the file path. Simply replace the file path with v29 (as this is the key in your config file). Refer to the documentation for an example: https://gagneurlab-drop.readthedocs.io/en/latest/prepare.html#external-count-examples
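
A minimal sketch of that check, in case it helps to catch the problem before running snakemake. It assumes the sample annotation has been exported as tab-separated text with RNA_ID and GENE_ANNOTATION columns, and that the set of valid keys (here just v29) has been copied by hand from the geneAnnotation entry of the config file. The function name, the example IDs and the example path are hypothetical; this is not part of DROP itself.

```python
import csv
import io


def unknown_gene_annotations(sample_annotation_tsv, config_keys):
    """Return (RNA_ID, GENE_ANNOTATION) pairs whose value is not a config key."""
    reader = csv.DictReader(io.StringIO(sample_annotation_tsv), delimiter="\t")
    problems = []
    for row in reader:
        value = (row.get("GENE_ANNOTATION") or "").strip()
        # Empty cells are skipped here (assumption: rows without
        # external counts may leave this column blank).
        if value and value not in config_keys:
            problems.append((row["RNA_ID"], value))
    return problems


# Hypothetical two-row annotation; the IDs and the path are made up.
example_tsv = (
    "RNA_ID\tDROP_GROUP\tGENE_ANNOTATION\n"
    "ext_sample_1\tfraser\tv29\n"
    "ext_sample_2\tfraser\t/path/to/gencode.v29.gtf\n"
)

# Keys as they are assumed to appear under geneAnnotation in the config file.
print(unknown_gene_annotations(example_tsv, {"v29"}))
```

Any row reported by the function holds something other than a known key (typically a file path) and would need to be replaced with the key.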

@agomez700
Author

Thank you. I have fixed this in the sample annotation, but I still get the same error about too few IDs in DROP_GROUP fraser (the error message has not changed).

@agomez700
Author

Hey

Just wanted to update:
Still getting this error
WARNING: 1 files missing in samples annotation. Ignoring...
ValueError in file /lab-share/Neph-Sampson-e2/Public/AC/drop_6/DROP_fibroblasts_hg38/Snakefile, line 12:
Too few IDs in DROP_GROUP fraser, please ensure that it has at least 10 IDs, groups: ['BCH-22-05911-02', 'BCH-22-05911-01', 'BCH-22-05911-03']
File "/lab-share/Neph-Sampson-e2/Public/AC/drop_6/DROP_fibroblasts_hg38/Snakefile", line 12, in <module>
File "/lab-share/Neph-Sampson-e2/Public/AC/drop_6/mambaforge/envs/drop_env6/lib/python3.11/site-packages/drop/config/DropConfig.py", line 65, in __init__
File "/lab-share/Neph-Sampson-e2/Public/AC/drop_6/mambaforge/envs/drop_env6/lib/python3.11/site-packages/drop/config/submodules/AberrantSplicing.py", line 31, in __init__
File "/lab-share/Neph-Sampson-e2/Public/AC/drop_6/mambaforge/envs/drop_env6/lib/python3.11/site-packages/drop/config/submodules/Submodules.py", line 53, in checkSubset

I have now added my 3 samples, so this is my new sample annotation.
Otherwise, the config file and Snakefile are the same as above.

fibroblast_sample_annotation_updated.xlsx

It still seems to be an issue with incorporating the external dataset.
For reference, this is when I attempt to run snakemake sampleAnnotation
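
For anyone debugging the same message, the count that the error refers to can be approximated outside of DROP. The sketch below assumes a tab-separated sample annotation with RNA_ID and DROP_GROUP columns, where DROP_GROUP may hold several comma-separated group names. It is not the logic of checkSubset in Submodules.py, which may exclude further rows (the warning above mentions a missing file being ignored), so the numbers here are best read as an upper bound. The function names are hypothetical; the three IDs are the ones listed in the error.

```python
import csv
import io
from collections import defaultdict

MIN_IDS = 10  # threshold quoted in the error message


def ids_per_group(sample_annotation_tsv):
    """Map each DROP_GROUP name to the set of RNA_IDs assigned to it."""
    reader = csv.DictReader(io.StringIO(sample_annotation_tsv), delimiter="\t")
    groups = defaultdict(set)
    for row in reader:
        # A sample can belong to several groups, separated by commas.
        for group in (row.get("DROP_GROUP") or "").split(","):
            group = group.strip()
            if group:
                groups[group].add(row["RNA_ID"])
    return dict(groups)


def groups_below_minimum(groups, minimum=MIN_IDS):
    """Return {group: number of IDs} for groups with fewer than `minimum` IDs."""
    return {name: len(ids) for name, ids in groups.items() if len(ids) < minimum}


# Only the three IDs named in the error message, all in group fraser.
example_tsv = (
    "RNA_ID\tDROP_GROUP\n"
    "BCH-22-05911-01\tfraser\n"
    "BCH-22-05911-02\tfraser\n"
    "BCH-22-05911-03\tfraser\n"
)

print(groups_below_minimum(ids_per_group(example_tsv)))  # {'fraser': 3}
```

If the external samples are expected to be in the group but do not show up in this count, their DROP_GROUP cells are the first place to look.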
