Enable pangolin to read from stdin #334

Merged
merged 8 commits into from
Oct 28, 2021

Conversation

matt-sd-watson
Contributor

This aims to resolve #329.

pangolin can now read from stdin through a bash pipe when "-" is given as the query, as requested in #329. If neither a query fasta nor stdin is specified, for example:
pangolin --outfile lineage_report.csv
pangolin will output:
Error: input query fasta could not be detected from a filepath or through stdin. Please enter your fasta sequence file and refer to pangolin usage at: https://cov-lineages.org/pangolin.html for detailed instructions.

If "-" is used to designate stdin, but there is no stdin for pangolin to use, i.e. if nothing is piped into the pangolin command:
pangolin --outfile lineage_report.csv -

it will output:
Error: cannot find query (input) fasta file using stdin. Please enter your fasta sequence file and refer to pangolin usage at: https://cov-lineages.org/pangolin.html for detailed instructions.

The methods for dealing with file paths that do not exist remain the same as before.
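
The input cases described above could be sketched roughly as follows. This is not the code from this PR: `resolve_query_source`, `QueryInputError`, and the `stdin_is_tty` parameter are hypothetical names, and using `sys.stdin.isatty()` to decide that nothing was piped in is an assumption about one reasonable way to detect it.

```python
import sys

USAGE = (
    "Please enter your fasta sequence file and refer to pangolin usage at: "
    "https://cov-lineages.org/pangolin.html for detailed instructions."
)


class QueryInputError(Exception):
    """Hypothetical exception type, used only in this sketch."""


def resolve_query_source(query, stdin_is_tty):
    """Return 'stdin' or 'file' for a query argument, or raise QueryInputError.

    query        -- the positional query argument, or None if it was omitted
    stdin_is_tty -- True when stdin is an interactive terminal (nothing piped)
    """
    if query is None:
        # No filepath and no "-" were given.
        raise QueryInputError(
            "Error: input query fasta could not be detected from a filepath "
            "or through stdin. " + USAGE
        )
    if query == "-":
        if stdin_is_tty:
            # "-" was given, but nothing was piped into the command.
            raise QueryInputError(
                "Error: cannot find query (input) fasta file using stdin. " + USAGE
            )
        return "stdin"
    # Anything else is treated as a filepath; existence checks happen elsewhere.
    return "file"
```

A caller would pass something like `resolve_query_source(args.query, sys.stdin.isatty())`. Taking the terminal check as a parameter keeps the function easy to test without a real pipe.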

If the user attempts to pass a compressed stdin, such as:
cat test.fa | gzip | pangolin --outfile lineage_report.csv -

pangolin will report:

Error: error when reading query fasta. It is possible that compressed stdin was passed.

An important note: the error message above will be thrown whenever an input (compressed or not) cannot ultimately be read as a FASTA file by SeqIO. This may need to change in the future, but I tried my best to let pangolin differentiate between stdin and filepaths, and the compression status of either input.
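
If the error is split up later, one possible approach is to check the first bytes of the input before parsing. The sketch below is an illustration only, not what this PR does: `looks_gzipped` and `classify_query_bytes` are hypothetical helpers, and the FASTA check (first non-blank character is `>`) is a deliberately crude stand-in for a real SeqIO parse.

```python
GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def looks_gzipped(data: bytes) -> bool:
    """Return True if the data starts with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


def classify_query_bytes(data: bytes) -> str:
    """Classify raw input as 'gzip', 'fasta', or 'unreadable'.

    Assumption: data holds the start of the input (stdin or file contents).
    """
    if looks_gzipped(data):
        return "gzip"
    # Crude FASTA check: the first non-whitespace character must be '>'.
    text = data.decode("utf-8", errors="replace").lstrip()
    if text.startswith(">"):
        return "fasta"
    return "unreadable"
```

With a check like this, compressed stdin could get its own message, and the generic "error when reading query fasta" message would be left for inputs that are uncompressed but not valid FASTA.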

@aineniamh aineniamh self-requested a review October 26, 2021 08:33
@matt-sd-watson matt-sd-watson marked this pull request as draft October 26, 2021 13:34
@matt-sd-watson matt-sd-watson marked this pull request as ready for review October 26, 2021 20:25
@matt-sd-watson
Contributor Author

I made some changes based on the errors given by the initial pytest run, and have tested extensively with various compressed and uncompressed inputs to verify the proper behaviour and desired error handling. Everything seems to work as intended on my end, but let me know if there seem to be any glaring problems!

@aineniamh
Member

This looks really great! Just started the tests running, but when they complete I'll merge in! Thanks for this @matt-sd-watson 🎉

@aineniamh aineniamh merged commit 2b7963d into cov-lineages:master Oct 28, 2021
@matt-sd-watson matt-sd-watson deleted the stdin branch October 29, 2021 17:08
Development

Successfully merging this pull request may close these issues.

Allow pangoLEARN to read from stdin, like standard UNIX tools, by specifying - instead of file name
2 participants