Accept piped input FastqToSam #915
Comments
In some cases, special built tools were built to allow streaming (ex. |
The most basic would be letting I don't see how any of the |
MarkDuplicates is not one pass... sorry. Also, a lot of tools allow piped SAM. I'd be excited for some more benchmarks on the implementation you propose. Folks are likely busy with their own work (i.e. they likely don't get paid to support Picard), so if you're up for a PR, I am sure I am not out of line to suggest folks would be willing to review. But that's better left to the Broad Picard folks. |
👍 @nh13.
If there's truly no performance hit, and it's a small localized change, I'll
be happy to review a PR. As @nh13 said, MarkDuplicates is not single pass,
so it will not stream, but FastqToSam is a reasonable one. My concern would
be that the changes need to be in htsjdk, where they will not be
localized... but I'm happy to be proven wrong!
Care to submit a PR? Or at least a skeleton of a solution?
On Wed, Aug 30, 2017 at 8:39 AM, Nils Homer wrote:
MarkDuplicates is not one pass... sorry. Also, a lot of tools allow piped
SAM. I'd be excited for some more benchmarks on the implementation you
propose. Folks are likely busy with their own work (i.e. they likely don't
get paid to support Picard), so if you're up for a PR, I am sure I am not out
of line to suggest folks would be willing to review. But that's better left
to the Broad Picard folks.
|
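It turns out that most tools accepting SAM inputs do work on piped input (which all boils down to checking InputStream.markSupported() on the input). In that case this limits this to just FastqToSam. My thoughts are to not do quality score sanity-checking if the QUALITY_FORMAT is provided and the input is a pipe. |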
In that case you could add a command line argument to forgo the check.
On Tue, 5 Sep 2017 at 7.51, Simon Ye wrote:
It turns out that most tools accepting sam inputs do work on piped input
(which all boils down to checking InputStream.markSupported() on the
input). In that case this limits this to just FastqToSam. My thoughts are
to not do quality score sanity-checking if the QUALITY_FORMAT is provided
and the input is a pipe.
|
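The proposal discussed in the comments above (skip the quality-score sanity check when QUALITY_FORMAT is supplied and the input is a pipe) could be sketched roughly as follows. This is only an illustration of the decision logic, not Picard code: `QualityCheckPolicy`, `Action`, and `decide` are hypothetical names, and failing when a pipe is given without QUALITY_FORMAT is an assumption, based on the fact that autodetection needs to read the input a second time.

```java
// Hypothetical sketch of the decision discussed above; none of these names exist in Picard.
final class QualityCheckPolicy {

    enum Action {
        AUTODETECT_AND_VERIFY, // regular file: detect the format, then reopen the file
        TRUST_USER_FORMAT,     // pipe + QUALITY_FORMAT given: skip the sanity check
        FAIL_NEEDS_FORMAT      // pipe without QUALITY_FORMAT: cannot read twice (assumption)
    }

    /**
     * @param formatProvided true if the user passed QUALITY_FORMAT
     * @param inputIsPipe    true if the input is /dev/stdin or a named pipe
     */
    static Action decide(boolean formatProvided, boolean inputIsPipe) {
        if (!inputIsPipe) {
            // A regular file can be reopened, so the existing two-pass behaviour is kept.
            return Action.AUTODETECT_AND_VERIFY;
        }
        return formatProvided ? Action.TRUST_USER_FORMAT : Action.FAIL_NEEDS_FORMAT;
    }
}
```

With this shape, file inputs behave as before, and only the piped case changes, which keeps the change localized to FastqToSam.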
Feature request
Tool(s) involved
SAM/BAM, Metrics etc.
Description
Allow Picard to accept piped input for commands working with large files
Currently this is not possible for most tools operating on SAM/BAM input, because file type autodetection reads the beginning of the file and then calls .reset() on the InputStream, which cannot be done on /dev/stdin or named pipe input. Similarly, FastqToSam autodetects the quality score format and then reopens the stream. However, by using something like RereadableInputStream (https://tika.apache.org/1.2/api/org/apache/tika/utils/RereadableInputStream.html), Picard can support input pipes combined with file autodetection.
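To illustrate the mark/reset issue described above, here is a minimal sketch using only the JDK. It wraps a stream that lacks mark support in a `BufferedInputStream`, peeks at the first bytes, and rewinds. `PeekableInput` and its methods are hypothetical helpers, not part of Picard or htsjdk, and the gzip magic-byte check merely stands in for whatever autodetection a tool performs. Note that this only covers detection that looks at a bounded prefix; re-reading an entire stream (as FastqToSam's quality detection does) would need something closer to the RereadableInputStream mentioned above.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper for illustration only.
final class PeekableInput {

    /** Returns a stream that supports mark/reset, wrapping it in a buffer if needed. */
    static InputStream markable(InputStream in) {
        return in.markSupported() ? in : new BufferedInputStream(in);
    }

    /** Reads up to n bytes from the current position, then rewinds to it. */
    static byte[] peek(InputStream in, int n) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(n);
        byte[] buf = new byte[n];
        int total = 0;
        while (total < n) {
            int r = in.read(buf, total, n - total);
            if (r < 0) {
                break; // end of stream before n bytes were available
            }
            total += r;
        }
        in.reset();
        return Arrays.copyOf(buf, total);
    }

    /** Stand-in format check: gzip (and therefore BGZF/BAM) data starts with 0x1f 0x8b. */
    static boolean looksGzipped(byte[] head) {
        return head.length >= 2 && (head[0] & 0xff) == 0x1f && (head[1] & 0xff) == 0x8b;
    }
}
```

A caller would do something like `InputStream in = PeekableInput.markable(System.in);`, call `peek(in, 2)` to sniff the format, and then hand `in` to the real reader, which sees the stream from its first byte.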