-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix tx_filter_variants_by_csqs
to correctly handle the ignore_splicing
parameter
#727
Conversation
…rest in CSQ_ORDER
…uration so other test can pull from the correct bucket
…move splice consequences from consequence_terms instead of only removing them if they are the worst consequence
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a couple questions
vep_root: ht[vep_root].annotate( | ||
transcript_consequences=ht[vep_root] | ||
.transcript_consequences.map( | ||
lambda csq: csq.annotate( | ||
consequence_terms=csq.consequence_terms.filter( | ||
lambda x: ~hl.literal(CSQ_SPLICE).contains(x) | ||
) | ||
) | ||
) | ||
.filter(lambda csq: hl.len(csq.consequence_terms) > 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this code should actually be added to the VEP utils? my initial thought was to put it here, but it doesn't quite fit with that function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, does this mean variants that had a splicing consequence as their worst consequence were also removed from the previous pext results? I found this line in the original repo https://github.com/macarthur-lab/tx_annotation/blob/master/tx_annotation.py#L305 (but didn't dig thoroughly)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually started to add it to the vep utils, but then decided against it because that was clearly filtering based on most severe consequence, and I thought it might be confusing to have both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fix is to make it do what was done in the original. The line of code you reference picks the most severe consequence after removing the splice consequences from the ordering
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I keep going back and forth on whether this (filtering on any csq) is helpful to have, but maybe you're right that it might be too confusing
Co-authored-by: Katherine Chao <kchao@broadinstitute.org>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
As is, the
ignore_splicing
parameter will remove transcripts from transcript_consequences where the worst consequence in "consequence_terms" is a splicing consequence. However, we want to keep the transcript if it has another consequence in "consequence_terms" that is not a splicing consequence. This PR updates the function so that splicing consequences are removed from the "consequence_terms" before the most severe consequence is added.Example:
Currently this transcript would get removed, but we want it to be kept as: