Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delete old qual #6099

Merged
merged 12 commits into from
Aug 29, 2019
Merged

Delete old qual #6099

merged 12 commits into from
Aug 29, 2019

Conversation

davidbenjamin
Copy link
Contributor

@davidbenjamin davidbenjamin commented Aug 19, 2019

My fifth Broadiversary present. Note that a few classes involving AF priors remain as they are used in the active region determination reference confidence model. I hope to get rid of those uses later.

Closes #2255.

@droazen Would someone from engine like to do the honors?

Copy link
Member

@lbergelson lbergelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bunch of minor comments and questions. Should be simple to resolve! Goodbye to all this cruft!

/**
* Use the old GATK 3 qual score aka the "exact model"
*/
@Argument(fullName = "use-old-qual-calculator", shortName = "old-qual", doc = "Use the old AF model", optional = true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could leave this as a deprecated argument that causes a clear exception, do we do that anymore?

protected final AFCalculator newAFCalculator;

protected final AFCalculatorProvider afCalculatorProvider;
protected final AlleleFrequencyCalculator newAFCalculator;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this just afCalculator now? It's not longer new vs old.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

final AFCalculator afCalculator = configuration.genotypeArgs.useOldAFCalculator ?
afCalculatorProvider.getInstance(vc,defaultPloidy,maxAltAlleles) : newAFCalculator;
final AFCalculationResult AFresult = afCalculator.getLog10PNonRef(reducedVC, defaultPloidy, maxAltAlleles, getAlleleFrequencyPriors(vc,defaultPloidy,model));
final AlleleFrequencyCalculator afCalculator = newAFCalculator;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This variable is redundant now just use newAFCalculator

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

final boolean doAlleleSpecificCalcs) {
this.configuration = Utils.nonNull(configuration, "the configuration cannot be null");
this.samples = Utils.nonNull(samples, "the sample list cannot be null");
this.afCalculatorProvider = Utils.nonNull(afCalculatorProvider, "the AF calculator provider cannot be null");
this.doAlleleSpecificCalcs = doAlleleSpecificCalcs;
logger = LogManager.getLogger(getClass());
numberOfGenomes = this.samples.numberOfSamples() * configuration.genotypeArgs.samplePloidy;
log10AlleleFrequencyPriorsSNPs = composeAlleleFrequencyPriorProvider(numberOfGenomes,
configuration.genotypeArgs.snpHeterozygosity, configuration.genotypeArgs.inputPrior);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method computeAlleleFrequencyPriors on line 86 is now not used. Should it be removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

this.doAlleleSpecificCalcs = doAlleleSpecificCalcs;
logger = LogManager.getLogger(getClass());
numberOfGenomes = this.samples.numberOfSamples() * configuration.genotypeArgs.samplePloidy;
log10AlleleFrequencyPriorsSNPs = composeAlleleFrequencyPriorProvider(numberOfGenomes,
configuration.genotypeArgs.snpHeterozygosity, configuration.genotypeArgs.inputPrior);
log10AlleleFrequencyPriorsIndels = composeAlleleFrequencyPriorProvider(numberOfGenomes,
configuration.genotypeArgs.indelHeterozygosity, configuration.genotypeArgs.inputPrior);

final double refPseudocount = configuration.genotypeArgs.snpHeterozygosity / Math.pow(configuration.genotypeArgs.heterozygosityStandardDeviation,2);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to move these calculations into the constructor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

public double getLog10PriorOfAFGT0() {
return log10PriorsOfAC[AF1p];
public double getLog10PosteriorOfVariant() {
return MathUtils.log10OneMinusPow10(log10PosteriorOfNoVariant);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this one have to be transformed? The variable it's stored in has the name of the method which makes me think it represents what the method is returning.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The stored quantity is the log probability of no variant.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wow, I looked at that repeatedly and couldn't see it. I wonder if we can make these method names more visually distinct.

Maybe something like:
log10PosteriorOfVariant vs
log10PosteriorThatThereisNoVariant

Something to consider.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! This is so much clearer now.

}
//TODO: this should be a class of static methods once the old AFCalculator is gone.
//TODO: this should be a class of static methods
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine for this to be a non-static method because it has so many parameter inputs, storing them together in an object makes some sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay

// This is the case when doing GVCF output.
// TODO: now that old qual is gone, do we still need this?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a ticket to track this question?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


import java.util.*;

public final class AFCalculationResultUnitTest extends GATKBaseTest {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This unit test was deleted but the class remains, was that an accident?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The class is a shell of its former self, but I have restored a small unit test for the remaining functionality.

@@ -304,7 +304,7 @@ public void testSpanningDeletionWithVeryUnlikelyAltAllele() {
private static Genotype genotypeWithObviousCall(final int ploidy, final int numAlleles, final int[] alleles, final int PL) {
return makeGenotype(ploidy, PLsForObviousCall(ploidy, numAlleles, alleles, PL));
}
//note the call is irrelevant to the AFCalculator, which only looks at PLs
//note the call is irrelevant to the AlleleFrequencyCalculator, which only looks at PLs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds like this could be removed now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately you need to set these placeholders alleles in order to make() the GenotypeBuilder without an error.

@davidbenjamin
Copy link
Contributor Author

Back to @lbergelson

* @param expected expected array.
* @param tolerance maximum difference between double value to be consider equivalent.
*/
protected static void assertEqualsDoubleArray(final double[] actual, final double[] expected, final double tolerance) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I'm a fan of replacing this with a call to a TestNG method marked internal that doesn't seem like a good idea...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeping the old method.

* @param actual actual produced array.
* @param expected expected array.
*/
protected static void assertEqualsLongArray(final long[] actual, final long[] expected) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might be redundant with testngs Assert.assertEqual now. I think it does reflection and checks if the objects are arrays and then uses Arrays.equals which is an element by element comparison.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done (and yes, it does reflection)

@davidbenjamin
Copy link
Contributor Author

Okay, @lbergelson, I redid the second-to-last commit.

Copy link
Member

@lbergelson lbergelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidbenjamin davidbenjamin merged commit 0f9f925 into master Aug 29, 2019
@davidbenjamin davidbenjamin deleted the db_delete_old_qual branch August 29, 2019 03:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Eliminate the old ExactAFCalculator QUAL score
3 participants