FastqToSam stdin fix for #915 #1910

Merged 5 commits, Aug 24, 2023

Contributor

@delocalizer delocalizer commented Aug 21, 2023

Description

Allow FastqToSam to take input from stdin or a named pipe.

  • If input is not a regular file, don't autodetect the quality format, and use the same opened reader throughout.
  • If input is not a regular file and QUALITY_FORMAT is not specified, that's an argument validation error.
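
The two rules above can be sketched as a small validation helper. This is a minimal illustration, not the code in this PR: the class and method names are hypothetical, the quality format is passed as a plain string, and `regularFileInput` only borrows the name of the field added to FastqToSam. The assumption behind the second rule is that autodetection has to read the input more than once, which a stream or pipe cannot support.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the validation rule; none of these names come from Picard. */
final class StreamInputValidationSketch {

    /**
     * Returns validation errors, or an empty list when the arguments are acceptable.
     *
     * @param regularFileInput whether the input is a regular file (re-readable)
     * @param qualityFormat    the user-supplied quality format, or null if not specified
     */
    static List<String> validate(final boolean regularFileInput, final String qualityFormat) {
        final List<String> errors = new ArrayList<>();
        // A non-regular input (stdin, named pipe) can only be read once, so the
        // format cannot be autodetected and must be supplied by the user.
        if (!regularFileInput && qualityFormat == null) {
            errors.add("QUALITY_FORMAT must be specified when input is not a regular file.");
        }
        return errors;
    }

    public static void main(final String[] args) {
        // Path to check; defaults to the current directory, used here only as a
        // portable example of something that is not a regular file.
        final Path input = Path.of(args.length > 0 ? args[0] : ".");
        final boolean regularFileInput = Files.isRegularFile(input);
        System.out.println(validate(regularFileInput, null));
        // "Standard" is a placeholder value for illustration.
        System.out.println(validate(regularFileInput, "Standard"));
    }
}
```

Taking the boolean as a parameter keeps the rule itself free of file-system access, so it can be checked without creating pipes or temp files.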

Addresses #915


Checklist (never delete this)

Never delete this, it is our record that procedure was followed. If you find that for whatever reason one of the checklist points doesn't apply to your PR, you can leave it unchecked but please add an explanation below.

Content

  • Added or modified tests to cover changes and any new functionality
  • Edited the README / documentation (if applicable)
  • All tests passing on github actions

Review

  • Final thumbs-up from reviewer
  • Rebase, squash and reword as applicable

For more detailed guidelines, see https://github.com/broadinstitute/picard/wiki/Guidelines-for-pull-requests

@delocalizer delocalizer changed the title from "Fastq to sam stdin" to "FastqToSam stdin" Aug 21, 2023
@delocalizer delocalizer changed the title from "FastqToSam stdin" to "FastqToSam stdin fix for #915" Aug 21, 2023
Member

@lbergelson lbergelson left a comment

@delocalizer. Thank you. This seems like a good idea. Is working on single ended reads sufficient for your use case? Usually I'd expect paired reads. This could be adapted so that it works on a pair of named pipes or with process substitution if you also handle FASTQ2 the same way.

I have a few minor comments about error handling and formatting stuff.

src/main/java/picard/sam/FastqToSam.java (outdated, resolved)
src/main/java/picard/sam/FastqToSam.java (outdated, resolved)
src/main/java/picard/sam/FastqToSam.java (resolved)
src/test/java/picard/sam/FastqToSamTest.java (outdated, resolved)
src/test/java/picard/sam/FastqToSamTest.java (resolved)
src/test/java/picard/sam/FastqToSamTest.java (outdated, resolved)
src/test/java/picard/sam/FastqToSamTest.java (outdated, resolved)
@delocalizer delocalizer requested a review from lbergelson August 23, 2023 01:36
Member

@lbergelson lbergelson left a comment

@delocalizer A very long rambling comment about PicardPath which should have an easy solution. Looks good after that I think. Thank you for doing this.

src/main/java/picard/sam/FastqToSam.java (resolved)
@@ -240,6 +244,8 @@ public class FastqToSam extends CommandLineProgram {

private static final SolexaQualityConverter solexaQualityConverter = SolexaQualityConverter.getSingleton();

private Boolean regularFileInput;
Member

Could you add a comment explaining that this is tested/set in customCommandlineValidation? That's exactly what I asked you to do, but it's a bit of an unexpected place to do initialization, so it might be worth mentioning.

*/
public boolean isOther() {
try {
return Files.readAttributes(IOUtil.getPath(super.getURIString()), BasicFileAttributes.class).isOther();
Member

This is a good idea, but you've stumbled into a new part of our APIs which is maybe underexplained.

A better implementation of this in PicardHtsPath would look like this:
(I've added some annotation to explain why)

    public boolean isOther() {
        // One of the non-obvious bits of IOPath is that it can represent inputs that aren't file-like objects.
        // An example would be something like an HtsGet API URL, which provides reads or variants but isn't an addressable block of data.
        // So the first thing to do is check whether it can be represented as a path.
        if (isPath()) {
            try {
                // Use the built-in toPath() in HtsPath instead of string manipulation. There are a lot of gotchas when converting strings <-> URIs.
                return Files.readAttributes(toPath(), BasicFileAttributes.class).isOther();
            } catch (IOException e) {
                throw new RuntimeIOException(e);
            }
        } else {
            return true; // if it's not a path, it's definitely not a regular file
        }
    }

That said, I think it's probably best to make this a utility function that accepts IOPath instead of a method. That way it would be compatible with functions that use the generic IOPath rather than the more specific PicardHtsPath. (GATK, for instance, tends to call Picard methods using a different implementation of IOPath.)

This should work:

    public static boolean isOther(final IOPath ioPath) {
        if (ioPath.isPath()) {
            try {
                return Files.readAttributes(ioPath.toPath(), BasicFileAttributes.class).isOther();
            } catch (IOException e) {
                throw new RuntimeIOException(e);
            }
        } else {
            return true;
        }
    }

We don't have a good place to stick path utilities in picard right now; having it as a static method on picard path seems like the best place for it at the moment.
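
To see what `BasicFileAttributes.isOther()` reports in practice, here is a small standalone sketch using only `java.nio`. It is an illustration under stated assumptions, not Picard code: the class name and the `tempFileIsOther` helper are made up for the example, and it works on a plain `Path` rather than an `IOPath`. Per the JDK documentation, `isOther()` is true only for something that is not a regular file, directory, or symbolic link, so a directory also reports false.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

/** Plain java.nio sketch; it does not use htsjdk's IOPath or any Picard class. */
final class IsOtherSketch {

    /** The same attribute check as above, applied to a java.nio.file.Path. */
    static boolean isOther(final Path path) {
        try {
            return Files.readAttributes(path, BasicFileAttributes.class).isOther();
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Applies isOther to a freshly created regular temp file, then deletes it. */
    static boolean tempFileIsOther() {
        try {
            final Path tmp = Files.createTempFile("isOther", ".tmp");
            try {
                return isOther(tmp);
            } finally {
                Files.delete(tmp);
            }
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(final String[] args) {
        System.out.println("regular temp file: " + tempFileIsOther());
        System.out.println("current directory: " + isOther(Path.of(".")));
        // /dev/null is a character device on Linux and macOS; it may not exist elsewhere.
        final Path devNull = Path.of("/dev/null");
        if (Files.exists(devNull)) {
            System.out.println("/dev/null: " + isOther(devNull));
        }
    }
}
```

Because directories are not "other", a false result from this check does not by itself mean the input is a regular file.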

Member

@lbergelson lbergelson left a comment

👍 Looks good to me. Thank you.

@lbergelson lbergelson merged commit a9194bd into broadinstitute:master Aug 24, 2023
@delocalizer
Contributor Author

Thanks @lbergelson for spending the time on review and the excellent educational feedback!
