Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Created VariantAnnotationArgumentCollection and StandardM2Annotation classes #3621

Merged
merged 1 commit into from
Sep 28, 2017

Conversation

jamesemery
Copy link
Collaborator

@jamesemery jamesemery commented Sep 27, 2017

Fixes #3618
Fixes #3292
Fixes #3123

@codecov-io
Copy link

codecov-io commented Sep 27, 2017

Codecov Report

Merging #3621 into master will increase coverage by 0.005%.
The diff coverage is 100%.

@@               Coverage Diff               @@
##              master     #3621       +/-   ##
===============================================
+ Coverage     79.767%   79.772%   +0.005%     
- Complexity     18233     18236        +3     
===============================================
  Files           1224      1225        +1     
  Lines          66966     66978       +12     
  Branches       10449     10449               
===============================================
+ Hits           53417     53430       +13     
  Misses          9326      9326               
+ Partials        4223      4222        -1
Impacted Files Coverage Δ Complexity Δ
...ecaller/AssemblyBasedCallerArgumentCollection.java 100% <ø> (ø) 1 <0> (ø) ⬇️
...ellbender/cmdline/StandardArgumentDefinitions.java 0% <ø> (ø) 0 <0> (ø) ⬇️
...hellbender/tools/walkers/mutect/Mutect2Engine.java 87.5% <100%> (ø) 42 <0> (ø) ⬇️
...bender/tools/walkers/annotator/OxoGReadCounts.java 95.652% <100%> (ø) 19 <1> (ø) ⬇️
...der/tools/walkers/mutect/M2ArgumentCollection.java 100% <100%> (ø) 1 <0> (ø) ⬇️
...stitute/hellbender/tools/HaplotypeCallerSpark.java 74.747% <100%> (ø) 23 <0> (ø) ⬇️
...er/tools/walkers/annotator/UniqueAltReadCount.java 78.571% <100%> (ø) 8 <1> (ø) ⬇️
...e/hellbender/tools/walkers/annotator/Coverage.java 87.5% <100%> (ø) 5 <1> (ø) ⬇️
...bender/tools/walkers/annotator/StrandArtifact.java 100% <100%> (ø) 28 <1> (ø) ⬇️
...otypecaller/HaplotypeCallerArgumentCollection.java 100% <100%> (ø) 2 <0> (ø) ⬇️
... and 12 more

Copy link
Contributor

@davidbenjamin davidbenjamin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for doing this!

}

/**
* Which annotations to recompute for the combined output VCF file.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why re-compute?

* to provide a pedigree file for a pedigree-based annotation) may cause the run to fail.
*/
@Advanced
@Argument(fullName="group", shortName="G", doc="One or more classes/groups of annotations to apply to variant calls", optional=true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about "classes/groups" -> "groups"?

@@ -21,7 +21,7 @@
*
* Created by tsato on 4/19/17.
*/
public class StrandArtifact extends GenotypeAnnotation implements StandardSomaticAnnotation {
public class StrandArtifact extends GenotypeAnnotation implements StandardSomaticAnnotation, StandardM2Annotation {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can delete StandardSomaticAnnotation entirely and absorb implementing classes into StandardM2Annotation.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidbenjamin The only issue would seem to be that the UniqueAltReadCount appears to be a standard somatic annotation yet not included by default for M2, i'll add it to the standard M2 Annotations for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't do that @jamesemery -- we don't want to be changing the default Mutect annotations as part of this PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regardless of that, StandardSomaticAnnotation doesn't seem to be very useful. As far as I'm concerned you can delete it and leave UniqueAltReadCount without any extra annotation.

/**
* This is a marker interface used to indicate which annotations are "Standard" for Mutect2 only.
*/
public interface StandardM2Annotation extends Annotation {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you call it StandardMutectAnnotation?

Copy link
Contributor

@droazen droazen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review complete, back to @jamesemery

import java.util.ArrayList;
import java.util.List;

public class VariantAnnotationArgumentCollection implements Serializable {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add class-level javadoc comment

public class VariantAnnotationArgumentCollection implements Serializable {
private static final long serialVersionUID = 1L;

public VariantAnnotationArgumentCollection(List<String> defaultGroups, List<String> defaultAnnotations, List<String> defaultExclude) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add javadoc for public constructor, including docs for the args.

private static final long serialVersionUID = 1L;

public VariantAnnotationArgumentCollection(List<String> defaultGroups, List<String> defaultAnnotations, List<String> defaultExclude) {
annotationGroupsToUse = new ArrayList<>(defaultGroups);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Call Utils.nonNull() on each arg first.

* Which annotations to recompute for the combined output VCF file.
*/
@Advanced
@Argument(fullName="annotation", shortName="A", doc="One or more specific annotations to compute", optional=true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"One or more specific annotations to compute" -> "One or more specific annotations to add to the variant calls"

public List<String> annotationsToExclude = new ArrayList<>();

/**
* Which groups of annotations to add to the output VCF file. The single value 'none' removes the default group. See
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it true that the value "none" removes the default group, or was this only true in GATK3.x?

public List<String> annotationsToExclude = new ArrayList<>();

/**
* Which groups of annotations to add to the output VCF file. The single value 'none' removes the default group. See
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, "Which groups of annotations to add to the output VCF file" -> "Which groups of annotations to add to the variant calls"

/**
* Which annotations to recompute for the combined output VCF file.
*/
@Advanced
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this really be marked as @Advanced? It doesn't strike me as an advanced argument. What do you think @davidbenjamin ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this isn't advanced.

public class VariantAnnotationArgumentCollection implements Serializable {
private static final long serialVersionUID = 1L;

public VariantAnnotationArgumentCollection(List<String> defaultGroups, List<String> defaultAnnotations, List<String> defaultExclude) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

defaultExclude -> defaultAnnotationsToExclude (otherwise could be misinterpreted)

@@ -93,6 +94,14 @@ private VariantOverlapAnnotator initializeOverlapAnnotator(final FeatureInput<Va

return new VariantOverlapAnnotator(dbSNPInput, overlaps);
}
public static VariantAnnotatorEngine ofSelectedMinusExcluded(final VariantAnnotationArgumentCollection argumentColleciton,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add javadoc for this public method, and move it next to the other overload of ofSelectedMinusExcluded() in the source file.


/**
* Which groups of annotations to add to the output VCF file. The single value 'none' removes the default group.
* Note that this usage is not recommended because it obscures the specific requirements of individual annotations.
* Set of annotation arguments to use.
* Any requirements that are not met, e.g. failing to provide a pedigree file for a pedigree-based annotation, may cause the run to fail.
* For somatic analyses, the StandardSomaticAnnotation group currently contains one annotations: StrandArtifact.
* Note the latter is redundant to an annotation given by the --annotation argument default.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove mention of StandardSomaticAnnotation, if it's going to be deleted.

@droazen droazen assigned jamesemery and unassigned droazen and davidbenjamin Sep 27, 2017
Copy link
Contributor

@droazen droazen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second-pass review complete, back to @jamesemery

@@ -33,6 +33,9 @@ private StandardArgumentDefinitions(){}
public static final String ADD_OUTPUT_SAM_PROGRAM_RECORD = "addOutputSAMProgramRecord";
public static final String ADD_OUTPUT_VCF_COMMANDLINE = "addOutputVCFCommandLine";
public static final String SEQUENCE_DICTIONARY_NAME = "sequenceDictionary";
public static final String ANNOTATION_LONG_NAME = "annotation";
public static final String ANNOTATION_GROUP_LONG_NAME = "group";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should really be named annotationGroup, regardless of what GATK3.x had (though check the WDLs for usage of the old name)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't seem to be in any WDLs here or in dsde-pipelines


/**
* Arguments for requesting VariantContext annotations to be processed by {@link org.broadinstitute.hellbender.tools.walkers.annotator.VariantAnnotatorEngine}
* for tools that process VariantContext objects.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"for tools that process variants"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

/**
* Argument collection constructor which defines tool default annotation arguments when they are not overridden by the user
*
* @param defaultGroups List annotation groups names to be used by default
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"annotation groups names" -> "annotation group names"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

*
* @param defaultGroups List annotation groups names to be used by default
* @param defaultAnnotations List of annotation class names to be used by default
* @param defaultAnnotationsToExclude List of default annotation classes to exclude from output, these override the previous two fields
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"List of annotation class names to exclude by default. These override the default annotations and annotation groups."

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

/**
* Which annotations to include in variant calls in the output.
*/
@Argument(fullName=StandardArgumentDefinitions.ANNOTATION_LONG_NAME, shortName=StandardArgumentDefinitions.ANNOTATION_SHORT_NAME, doc="One or more specific annotations add to variant calls", optional=true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"specific annotations add" -> "specific annotations to add"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

public List<String> annotationsToExclude = new ArrayList<>();

/**
* Which groups of annotations to add to the output VCF file.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"output VCF file" -> "output variant calls"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}

/**
* Which annotations to include in variant calls in the output.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add note that these supplement the annotations in the annotation groups specified via -G

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

*/
public interface StandardSomaticAnnotation {
public interface StandardMutectAnnotations extends Annotation {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to StandardMutectAnnotation (without the plural), for grammatical consistency with the other annotation groups.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

* The annotation engine will mark variants overlapping anything in those sets using the name given by {@link FeatureInput#getName()}.
* Note: the DBSNP FeatureInput should be passed in separately, and not as part of this List - an GATKException will be thrown otherwise.
* Note: there are no non-DBSNP comparison FeatureInputs an empty List should be passed in here, rather than null.
* @return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@return a VariantAnnotatorEngine initialized with the requested annotations

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

* Any requirements that are not met, e.g. failing to provide a pedigree file for a pedigree-based annotation, may cause the run to fail.
* For somatic analyses, the StandardSomaticAnnotation group currently contains one annotations: StrandArtifact.
* Note the latter is redundant to an annotation given by the --annotation argument default.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment no longer makes sense: "Note the latter is redundant to an annotation given by the --annotation argument default"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@droazen
Copy link
Contributor

droazen commented Sep 27, 2017

Note that this PR also closes #3123

@jamesemery jamesemery force-pushed the je_annotationArgumentCollection branch from 671f734 to 999a1a7 Compare September 27, 2017 18:58
Copy link
Contributor

@droazen droazen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 latest version looks good, merge once tests pass

Copy link
Contributor

@magicDGS magicDGS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A single comment for make the framework simpler to use in downstream projects...

* Arguments for requesting VariantContext annotations to be processed by {@link org.broadinstitute.hellbender.tools.walkers.annotator.VariantAnnotatorEngine}
* for tools that process variants objects.
*/
public class VariantAnnotationArgumentCollection implements Serializable {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, consider to make this an abstract class to being able to hide some parameters from the command line. Then, a DefaultGATKVariantAnnotationArgumentCollection should be used in GATK with this implementation.

If the variant anotation engine is converted into a plugin, that will be useful for downstream projects: I was actually working on my own plugin, but I prefer to work with standards from the GATK framework, and if there are downstream projects that want to annotate variants with their own annotations, this will be required for example if they do not need groups to annotate together.

As an example, that will be similar to what I did for the GATKReadFilterPluginDescriptor in #2355. As an API user, I think that it is good to have all the arguments for the plugins as abstract classes to being able to hide some from the CLI...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@magicDGS This would be a useful feature but it sounds like the benefit is mostly hidden behind pending work on converting annotations into a plugin, which we have planned for the 4.0 release. This change would probably be incorporated as part of that branch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that this could be addressed here for using the VariantAnnotationArgumentCollection with the VariantAnnotator too, even if the plugin is not there. I cannot hide some parameters with the current implementation, to being able to pass the argument collection to the annotator...

Anyway, I can submit a patch if there is no planned timeplan for the plugin, to being able to use something similar in my code...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it would be easier to just add a generic/universal way to hide arguments in barclay, instead of having to complicate all of our argument collections with added abstract classes?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the main reason of having argument collections is for tweak the arguments for
make optional/required some arguments (such as ReferenceInputArgumentCollection) or have the ability to tweak some arguments to be hidden/removed from the user.

I believe that with the current system for parsing arguments and argument collections is going to be complicated to hide arguments. In addition, sometimes the requirement isn't hide the argument but provide a tool-specific documentation (e.g., the ReadFilter plugin specifies that are filters before the analysis; in some cases, I have tools which perform the filtering after processing, and it is important for the user to know that difference).

I think that having interfaces for a framework is the best way to encourage users with specific needs to use it, unless you really want to forbid the usage/tweak of some components. I see several use cases for having the abstract argument collections....

@sooheelee
Copy link
Contributor

Thanks for this.

@davidbenjamin davidbenjamin deleted the je_annotationArgumentCollection branch February 24, 2021 15:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants