Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make VariantContextBuilder safer #1344

Merged
merged 13 commits into from
Jul 16, 2019
Merged

Conversation

yfarjoun
Copy link
Contributor

@yfarjoun yfarjoun commented Apr 3, 2019

Adds tests and fixes unsafe building patterns that were enabled by VariantContextBuilder.

This required a small change in the API around adding a genotype context. Now when .genotypes() is called with a GenotypeContext object the builder will call immutable() on that object, to make sure that the user doesn't change the GC prior to the creation of the VariantContext...If folks really need to reuse the GenotypeContext objects across multiple build calls, we could add a .genotypesKeepMutable() method with a scary javaDoc note.....

Another change in the API (somewhat) is that validation was added for the filters as per the hts-spec.

fixes #1365

@yfarjoun
Copy link
Contributor Author

yfarjoun commented Apr 3, 2019

Several of these use-patterns fail (as you can see in the test results...) my question is..are they reasonable use-patterns and should we fix them?

@yfarjoun yfarjoun marked this pull request as ready for review April 5, 2019 20:51
@yfarjoun yfarjoun changed the title Tests that show that VariantContextBuilder is unsafe make VariantContextBuilder safer Apr 6, 2019
@codecov-io
Copy link

codecov-io commented Apr 6, 2019

Codecov Report

Merging #1344 into master will increase coverage by 1.872%.
The diff coverage is 83.74%.

@@              Coverage Diff               @@
##             master     #1344       +/-   ##
==============================================
+ Coverage     67.86%   69.732%   +1.872%     
- Complexity     8284      9733     +1449     
==============================================
  Files           563       573       +10     
  Lines         33709     38324     +4615     
  Branches       5657      6997     +1340     
==============================================
+ Hits          22875     26724     +3849     
- Misses         8657      9212      +555     
- Partials       2177      2388      +211
Impacted Files Coverage Δ Complexity Δ
...tsjdk/variant/variantcontext/GenotypesContext.java 84.302% <100%> (ø) 67 <0> (ø) ⬇️
...ain/java/htsjdk/variant/variantcontext/Allele.java 77.703% <100%> (ø) 95 <0> (ø) ⬇️
...java/htsjdk/variant/variantcontext/CommonInfo.java 53.939% <50%> (+6.076%) 68 <3> (+21) ⬆️
.../htsjdk/variant/variantcontext/VariantContext.java 80.769% <80.822%> (+3.055%) 432 <16> (+186) ⬆️
.../variant/variantcontext/VariantContextBuilder.java 82.53% <86.486%> (+10.308%) 39 <13> (+4) ⬆️
...s/cram/encoding/external/ExternalLongEncoding.java 61.538% <0%> (-5.128%) 4% <0%> (+1%)
.../main/java/htsjdk/samtools/SAMValidationError.java 93.805% <0%> (-2.311%) 11% <0%> (+2%)
src/main/java/htsjdk/samtools/BAMSBIIndexer.java 73.333% <0%> (-1.667%) 1% <0%> (ø)
src/main/java/htsjdk/variant/bcf2/BCFVersion.java 93.103% <0%> (-1.633%) 12% <0%> (+5%)
src/main/java/htsjdk/samtools/SAMFileHeader.java 66.942% <0%> (-0.5%) 83% <0%> (+38%)
... and 78 more

@yfarjoun
Copy link
Contributor Author

yfarjoun commented Apr 7, 2019

@d-cameron this fixes the issue you reported 4 years ago: #255

Care to take a look?

import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.*;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using wildcard imports makes it harder to figure out where a name came from when reading the code outside of an IDE. Changing an IDE setting can enforce the rule to keep imports, instead of replacing them with a wildcard.

@@ -342,23 +356,52 @@ private void makeAttributesModifiable() {
* @return this builder
*/
public VariantContextBuilder filters(final Set<String> filters) {
this.filters = filters;
makeFiltersModifiable();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a better approach would be to pass filters to makeFiltersModifiable()? It seems like you've got similar logic split across different methods here.

/**
* {@link #filters}
*
* @param filters Set of strings to set as the filters for this builder
* @return this builder
*/
public VariantContextBuilder filters(final String ... filters) {
filters(new LinkedHashSet<String>(Arrays.asList(filters)));
filtersAsIs(new LinkedHashSet<>(Arrays.asList(filters)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this different than filters(new LinkedHashSet<>(Arrays.asList(filters)))?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't bother call make filters modifyable.

@yfarjoun
Copy link
Contributor Author

back to you @pshapiro4broad

@yfarjoun
Copy link
Contributor Author

@pshapiro4broad are you OK with this PR?

@yfarjoun
Copy link
Contributor Author

yfarjoun commented Jun 3, 2019

@pshapiro4broad 👀 please?

Copy link
Contributor

@pshapiro4broad pshapiro4broad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another set of comments, and this time I found a bug.

@@ -1296,13 +1294,15 @@ private boolean validate(final EnumSet<Validation> validationToPerform) {
switch (val) {
case ALLELES: validateAlleles(); break;
case GENOTYPES: validateGenotypes(); break;
case FILTERS: validateFilters(); break;
default: throw new IllegalArgumentException("Unexpected validation mode " + val);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could avoid this switch by moving the code into the Validation enum. E.g.

    public enum Validation {
        ALLELES() {
            void validate(VariantContext variantContext) {
                variantContext.validateAlleles();
            }
        },
        GENOTYPES() {
            void validate(VariantContext variantContext) {
                variantContext.validateGenotypes();
            }
        },
        FILTERS {
            void validate(VariantContext variantContext) {
                variantContext.validateFilters();
            }
        };

        abstract void validate(VariantContext variantContext);
    }

then this code becomes

        validationToPerform.forEach(val -> val.validate(this));

It looks like you could also move the validation code into the enum, since they only look at data that's available via getters.


final Map<String, Object> tempAttributes = attributes;
this.attributes = new HashMap<>();
if (tempAttributes != null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do this check earlier to avoid allocating a map that's not used, e.g.

if (tempAttributes != null) {
    this.attributes = new HashMap<>(tempAttributes);
} else {
    this.attributes = new HashMap<>();
}

if (filters == null) {
unfiltered();
} else {
makeFiltersModifiable();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these lines

            makeFiltersModifiable();
            this.filters.clear();
            this.filters.addAll(filters);
            toValidate.add(VariantContext.Validation.FILTERS);

can be replaced with

this.filtersCanBeModified = true;
filtersAsIs(new HashSet<>(filters));

There's a lot of overlapping methods so it's hard for me to keep clear exactly what method does what thing. On the other hand, maybe it's OK to merge it up without cleaning it up too much...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -393,6 +452,7 @@ public VariantContextBuilder unfiltered() {
public VariantContextBuilder genotypes(final GenotypesContext genotypes) {
this.genotypes = genotypes;
if ( genotypes != null )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a bug! Introduced by the lack of { }!

Please add some { }s.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it wouldn't be that bad to validate a null genotypeContext...but sure, not as intended.

@@ -574,7 +634,10 @@ public VariantContext make() {
}

public VariantContext make(final boolean leaveModifyableAsIs) {
if(!leaveModifyableAsIs) attributesCanBeModified = false;
if(!leaveModifyableAsIs){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add whitespace please

Suggested change
if(!leaveModifyableAsIs){
if (!leaveModifyableAsIs) {

@yfarjoun
Copy link
Contributor Author

yfarjoun commented Jun 4, 2019

Thanks again @pshapiro4broad back to you.

Copy link
Contributor

@pshapiro4broad pshapiro4broad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just two last things, thanks for your patience!

default: throw new IllegalArgumentException("Unexpected validation mode " + val);
}
}
validationToPerform.forEach(v->v.validate(this));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whitespace

Suggested change
validationToPerform.forEach(v->v.validate(this));
validationToPerform.forEach(v -> v.validate(this));

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also validation(s)ToPerform would make more sense

final Genotype genotype = genotypes.get(i);
if (genotype.isAvailable()) {
final List<Allele> alleles = genotype.getAlleles();
for (int j = 0, size = alleles.size(); j < size; j++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I'm reading this code, due to the move, I have a comment. It looks like this can be replaced with a foreach loop, e.g.

for (Allele gAllele : genotype.getAlleles()) {
    ...
}

@@ -114,7 +114,7 @@ public boolean isNotFiltered() {

public void addFilter(String filter) {
if ( filters == null ) // immutable -> mutable
filters = new HashSet<String>();
filters = new HashSet<>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{}

Copy link
Member

@lbergelson lbergelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yfarjoun I have a bunch of stupid nitpicks and a real comment. 👍 after that

}
}

static private void validateFilters(final VariantContext variantContext) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
static private void validateFilters(final VariantContext variantContext) {
private static void validateFilters(final VariantContext variantContext) {

abstract void validate(VariantContext variantContext);


static private void validateAlleles(final VariantContext vc) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
static private void validateAlleles(final VariantContext vc) {
private static void validateAlleles(final VariantContext vc) {

}
}

static private void validateGenotypes(final VariantContext variantContext) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
static private void validateGenotypes(final VariantContext variantContext) {
private static void validateGenotypes(final VariantContext variantContext) {

@@ -260,6 +270,8 @@
* Determine which genotype fields are in use in the genotypes in VC
* @return an ordered list of genotype fields in use in VC. If vc has genotypes this will always include GT first
*/


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

weird extra linebreaks.

default: throw new IllegalArgumentException("Unexpected validation mode " + val);
}
}
validationToPerform.forEach(v->v.validate(this));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also validation(s)ToPerform would make more sense

}

@DataProvider
public Object[][] BuilderSchemes() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public Object[][] BuilderSchemes() {
public Object[][] builderSchemes() {

}

@Test
public void testFilterCanUseUnmodifyableSet() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public void testFilterCanUseUnmodifyableSet() {
public void testFilterCanUseUnmodifiableSet() {

};
}

@Test(dataProvider = "illegalFilterStrings",expectedExceptions = IllegalStateException.class)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@Test(dataProvider = "illegalFilterStrings",expectedExceptions = IllegalStateException.class)
@Test(dataProvider = "illegalFilterStrings", expectedExceptions = IllegalStateException.class)

final VariantContext result2 = root2.attribute("AC", 1).make();

Assert.assertEquals(result1.getAttribute("AC"), result2.getAttribute("AC"));
Assert.assertEquals(result1.getAttribute("AC"), result2.getAttribute("AC"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the potential issues around the builder is that there can be subtle bugs that relate to the order of instantiation vs alteration.

with this initialization:
builder1.attribute("AC", 1)
builder2 = new Builder(builder1)

these two following operations could be different:

case1: 
result1 = builder1.make()
builder2.attribute("AC", 2)
result2= builder2.make()

case2:
builder2.attribute("AC", 2)
result1 = builder1.make()
result2= builder2.make()

That could be the case if the builder has an internal state that is accidentally shared/modified but the make() operation performs a copy. In both cases I would expect result1 to have ac of 1 but it's possible that case 1 would be fine but case 2 would be broken. I don't see a test for that type of difference here. I think in practice the only place that make() copies anything is with the alleles, so this may be unnecessary / overkill to tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

case 1 is tested already, I believe, but I'll add it explicitly, also case 2 works, but I added a test.

try {
gc.add(sample2);
} catch (IllegalAccessError e) {
// nice work...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, I didn't realize that this throws IllegalAccessError. That's a subclass of LinkageError. We should absolutely not be throwing or catching that here.

Could you change it to IllegalStateException instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no.. the issue is that the sample map is immutable, so when I try to add a sample to it I get this exception (as I should!)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh. I see what you mean. sure.

@yfarjoun yfarjoun self-assigned this Jun 17, 2019
@yfarjoun yfarjoun merged commit 1d4a316 into master Jul 16, 2019
@yfarjoun yfarjoun deleted the yf_make_variantContextBuilder_safer branch July 16, 2019 15:06
lbergelson added a commit that referenced this pull request Aug 2, 2019
* Fix regression in VariantContextBuilder introduced in #1344 
* Fixes #1404 
* filter validation now fails with IllegalStateException instead of of NPE
* Treat null filter array the same as null filter sets


Co-authored-by: Yossi Farjoun <farjoun@broadinstitute.org>
Co-authored-by: Louis Bergelson <louisb@broadinstitute.org>
@yfarjoun yfarjoun restored the yf_make_variantContextBuilder_safer branch August 11, 2020 16:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

VariantContextBuilder : putAttributes fails if Map cannot be modified
4 participants