-
Notifications
You must be signed in to change notification settings - Fork 371
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow fingerprinting of SAM files that only have a partial dictionary match to the haplotype map #1955
Allow fingerprinting of SAM files that only have a partial dictionary match to the haplotype map #1955
Conversation
…ting sam files that have a mismatching dictionary from the one stated in the haplotype map (it works for vcfs). This is because when fingerprinting sam files, the SamLocusIterator is given an IntervalList to query, and this IL has a mismatching dictionary which isn't supported. This PR resolves this by creating a new IL just before querying the Iterator using the proper dictionary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @yfarjoun! Just what I think should be one very quick request, otherwise looks good to me.
@Test(dataProvider = "bamFilesRGs") | ||
public void testCrossCheckRGs_with_short_dictionary(final File file1, final File file2, final boolean expectAllMatch, final int expectedRetVal, final int expectedNMetrics) throws IOException { | ||
|
||
File metrics = File.createTempFile("Fingerprinting", "NA1291.RG.crosscheck_metrics"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
final
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
|
||
if (file1.getName().endsWith(SamReader.Type.CRAM_TYPE.fileExtension())) { | ||
args.add("R=" + referenceForCrams); | ||
args.add("HAPLOTYPE_MAP=" + HAPLOTYPE_MAP_FOR_CRAMS); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you add a HAPLOTYPE_MAP_SHORT_FOR_CRAMS as well just so we're testing that the functionality works for both bams and crams?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
Hi @kachulis I'm not sure what the failure it about, but I'm pretty sure it's not due to my changes.... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like the error is related to the new data provider. I think you've ended up duplicating some tests in a way I'm not sure you intended.
private Object[][] haplotypeMaps(){ | ||
return new Object[][]{{HAPLOTYPE_MAP}, {HAPLOTYPE_MAP_SHORT}}; | ||
} | ||
@DataProvider(name = "bamFilesRGs") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this has gotten a little twisted around. You're testing the both full and short dictionary here, but only the full dictionary crams, and then you also have testCrossCheckRGs_with_short_dictionary
, where you basically do the same thing, expect hardcoded instead of via the data provider. I think that's also where the test failure is coming from, bamFilesRGs
now includes the haplotypeMaps, but testCrossCheckRGs_with_short_dictionary
doesn't include the haplotypeMaps as a parameter input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😊 huh. I saw other errors before. yes, something got strange.
OK. back to you again @kachulis . thanks for your patience. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good, thanks @yfarjoun!
I've seems to have lost my merge privileges....be my guest! |
Despite efforts to the contrary, FingerprintChecker cannot fingerprinting sam files that have a mismatching dictionary from the one stated in the haplotype map (it works for vcfs). This is because when fingerprinting sam files, the SamLocusIterator is given an IntervalList to query, and this IL has a mismatching dictionary which isn't supported.
This PR resolves this by creating a new IL just before querying the Iterator using the proper dictionary.
Description
Give your PR a concise yet descriptive title
Please explain the changes you made here.
Explain the motivation for making this change. What existing problem does the pull request solve?
Mention any issues fixed, addressed or otherwise related to this pull request, including issue numbers or hard links for issues in other repos.
You can delete these instructions once you have written your PR description.
Checklist (never delete this)
Never delete this, it is our record that procedure was followed. If you find that for whatever reason one of the checklist points doesn't apply to your PR, you can leave it unchecked but please add an explanation below.
Content
Review
For more detailed guidelines, see https://github.com/broadinstitute/picard/wiki/Guidelines-for-pull-requests