Pattern matching unintuitive #95

Open
khkk378 opened this issue May 25, 2021 · 2 comments
Comments

@khkk378

khkk378 commented May 25, 2021

I think the pattern matching is a bit unintuitive. Say I have raw and processed files in a directory and I want to simulate from either of them. Intuitively I would set the pattern to e.g. *_raw_counts.txt, but then the matching cell type file is assumed to be called foo_celltypes.txt rather than foo_raw_celltypes.txt. It seems you remove the pattern from the filename and then append _celltypes.txt. I think a better option would be to replace _counts.txt with _celltypes.txt for files matching the pattern.
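To make the difference concrete, here is a minimal Python sketch of the two naming rules. It is not the project's actual code: both function names are hypothetical, and the first function only mirrors the behaviour as I understand it from the outside.

```python
from fnmatch import fnmatch

COUNTS_SUFFIX = "_counts.txt"
CELLTYPES_SUFFIX = "_celltypes.txt"


def celltypes_by_stripping(counts_name: str, pattern: str) -> str:
    """Assumed current behaviour: drop the literal part of the pattern,
    then append _celltypes.txt to whatever is left."""
    literal = pattern.replace("*", "")
    return counts_name.replace(literal, "") + CELLTYPES_SUFFIX


def celltypes_by_suffix_swap(counts_name: str, pattern: str) -> str:
    """Proposed behaviour: for files matching the pattern, swap the
    trailing _counts.txt for _celltypes.txt and keep the rest."""
    if not fnmatch(counts_name, pattern):
        raise ValueError(f"{counts_name!r} does not match {pattern!r}")
    if not counts_name.endswith(COUNTS_SUFFIX):
        raise ValueError(f"{counts_name!r} does not end with {COUNTS_SUFFIX}")
    return counts_name[: -len(COUNTS_SUFFIX)] + CELLTYPES_SUFFIX


print(celltypes_by_stripping("foo_raw_counts.txt", "*_raw_counts.txt"))    # foo_celltypes.txt
print(celltypes_by_suffix_swap("foo_raw_counts.txt", "*_raw_counts.txt"))  # foo_raw_celltypes.txt
```

With the suffix swap, the pattern only selects files and no longer influences the name of the cell type file.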

Cheers,
Rasmus

@KevinMenden
Owner

Hmm, yes, true, it can be a bit annoying. Maybe the solution would be to generally improve this pattern matching. I think I wanted to give the option to manually list files at some point anyway, although that won't work for lots of files; there you still need patterns.

So maybe the best option is to be able to specify a --counts-pattern or a --celltype-pattern or both. If both are supplied, it tries to find matching pairs; if only one is supplied, well, it also tries to find matching pairs. Shouldn't be too hard to add.
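A rough sketch of how such pairing could work, under some assumptions that are not settled in this thread: each pattern contains exactly one `*`, the text matched by `*` is the sample key, and an omitted pattern falls back to a default. The function names and defaults are hypothetical.

```python
import re


def _pattern_to_regex(pattern: str) -> re.Pattern:
    """Turn a glob with exactly one '*' into a regex that captures the wildcard."""
    if pattern.count("*") != 1:
        raise ValueError("this sketch supports exactly one '*' per pattern")
    prefix, suffix = pattern.split("*")
    return re.compile(re.escape(prefix) + "(.+)" + re.escape(suffix))


def find_pairs(filenames, counts_pattern="*_counts.txt",
               celltype_pattern="*_celltypes.txt"):
    """Pair count and cell type files whose wildcard parts are identical."""
    counts_re = _pattern_to_regex(counts_pattern)
    celltype_re = _pattern_to_regex(celltype_pattern)
    counts, celltypes = {}, {}
    for name in filenames:
        match = counts_re.fullmatch(name)
        if match:
            counts[match.group(1)] = name
        match = celltype_re.fullmatch(name)
        if match:
            celltypes[match.group(1)] = name
    # Only keys present on both sides form a pair.
    shared = sorted(counts.keys() & celltypes.keys())
    return [(counts[key], celltypes[key]) for key in shared]


files = ["foo_raw_counts.txt", "foo_raw_celltypes.txt", "foo_counts.txt"]
print(find_pairs(files, "*_raw_counts.txt", "*_raw_celltypes.txt"))
```

Files without a partner (here `foo_counts.txt`) are skipped silently in this sketch; a real implementation would probably want to warn about them.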

@khkk378
Author

khkk378 commented May 25, 2021

Maybe easiest to just enforce that count files should end with _counts.txt and cell types with _celltypes.txt? I don't see the use case for a lot of flexibility there. I think it's more important to be able to make a flexible selection among a collection of datasets. Then use regexps to match the rest: --pattern foo/bar_(raw|processed) to select both raw and processed samples from foo/bar for example.
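As a sketch of what I mean (hypothetical helper, not existing code): the suffixes are fixed, and the regexp only has to match the sample name in front of them. I am assuming here that the regexp must match the whole sample stem.

```python
import re

COUNTS_SUFFIX = "_counts.txt"
CELLTYPES_SUFFIX = "_celltypes.txt"


def select_samples(paths, pattern):
    """Return (counts, celltypes) pairs whose sample stem fully matches the regexp."""
    sample_re = re.compile(pattern)
    available = set(paths)
    pairs = []
    for path in sorted(available):
        if not path.endswith(COUNTS_SUFFIX):
            continue
        stem = path[: -len(COUNTS_SUFFIX)]
        celltypes_path = stem + CELLTYPES_SUFFIX
        # Keep the sample only if the stem matches and its partner file exists.
        if sample_re.fullmatch(stem) and celltypes_path in available:
            pairs.append((path, celltypes_path))
    return pairs


paths = [
    "foo/bar_raw_counts.txt", "foo/bar_raw_celltypes.txt",
    "foo/bar_processed_counts.txt", "foo/bar_processed_celltypes.txt",
    "foo/baz_raw_counts.txt", "foo/baz_raw_celltypes.txt",
]
for pair in select_samples(paths, r"foo/bar_(raw|processed)"):
    print(pair)
```

This selects both the raw and processed `foo/bar` samples and leaves `foo/baz` out.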
