refactor(vep_parser): store consequence to impact score as a project config #811

Merged
merged 9 commits into from
Oct 3, 2024

@ireneisdoomed ireneisdoomed commented Oct 3, 2024

✨ Context

Removes the src/gentropy/assets/data/variant_consequence_to_score.tsv file. The mapping between a variant consequence and its impact score is now a config parameter of VariantIndexStep.
This feature was implemented in #805. Because I didn't know the file was a dependency of the VEP parser, this PR isolates the feature. The VEP parser now reads this property directly from the dataset.
I have run the variant index generation on all the VEP outputs produced in the last run, and the mapping works properly. The check below returns no variants with an empty transcriptConsequences array:

from pyspark.sql import functions as f
variant_index.df.filter(f.size("transcriptConsequences") == 0).show()
+---------+----------+--------+---------------+---------------+------------------+-----------------------+------+----------------------+-----+-----------------+-------+
|variantId|chromosome|position|referenceAllele|alternateAllele|inSilicoPredictors|mostSevereConsequenceId|hgvsId|transcriptConsequences|rsIds|alleleFrequencies|dbXrefs|
+---------+----------+--------+---------------+---------------+------------------+-----------------------+------+----------------------+-----+-----------------+-------+
+---------+----------+--------+---------------+---------------+------------------+-----------------------+------+----------------------+-----+-----------------+-------+

🛠 What does this PR implement

  • Deletes src/gentropy/assets/data/variant_consequence_to_score.tsv
  • Implements VariantIndexStep.consequence_to_pathogenicity_score
  • Deletes VariantIndex.get_most_severe_gene_consequence, as this is already accessible in the VariantIndex dataset directly (transcriptConsequences.consequenceScore)
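The idea behind the points above can be sketched in plain Python, without Spark. This is only an illustration under stated assumptions, not the code merged in this PR: the `VariantIndexStepConfig` class, the `annotate_consequences` helper, the `consequenceTerms` field, and the score values are all hypothetical. Only the names `consequence_to_pathogenicity_score`, `transcriptConsequences`, and `consequenceScore` come from the PR description.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Placeholder scores for illustration only; the real values live in the project config.
EXAMPLE_SCORES: Dict[str, float] = {
    "stop_gained": 1.0,
    "missense_variant": 0.66,
    "intron_variant": 0.1,
}


@dataclass
class VariantIndexStepConfig:
    """Hypothetical stand-in for the step configuration holding the mapping."""

    consequence_to_pathogenicity_score: Dict[str, float] = field(
        default_factory=lambda: dict(EXAMPLE_SCORES)
    )


def annotate_consequences(
    transcript_consequences: List[dict], score_map: Dict[str, float]
) -> List[dict]:
    """Return copies of the consequences with a consequenceScore attached.

    The score is the highest mapped score among the consequence terms;
    terms missing from the mapping count as 0.0 (an assumption of this sketch).
    """
    annotated = []
    for consequence in transcript_consequences:
        scores = [score_map.get(term, 0.0) for term in consequence["consequenceTerms"]]
        annotated.append({**consequence, "consequenceScore": max(scores, default=0.0)})
    return annotated


config = VariantIndexStepConfig()
transcript_consequences = annotate_consequences(
    [{"consequenceTerms": ["intron_variant", "missense_variant"]}],
    config.consequence_to_pathogenicity_score,
)
print(transcript_consequences[0]["consequenceScore"])
```

Once the score is stored on each transcript consequence, a separate lookup method is no longer needed because consumers can read `consequenceScore` directly, which is the reasoning given for deleting `get_most_severe_gene_consequence`.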

🙈 Missing

🚦 Before submitting

  • Do these changes cover one single feature (one change at a time)?
  • Did you read the contributor guideline?
  • Did you make sure to update the documentation with your changes?
  • Did you make sure there is no commented out code in this PR?
  • Did you follow conventional commits standards in PR title and commit messages?
  • Did you make sure the branch is up-to-date with the dev branch?
  • Did you write any new necessary tests?
  • Did you make sure the changes pass local tests (make test)?
  • Did you make sure the changes pass pre-commit rules (e.g., poetry run pre-commit run --all-files)?

@github-actions github-actions bot added the Step label Oct 3, 2024
@ireneisdoomed ireneisdoomed changed the title from "refactor(vep_parser): store consequence to impact score as a class attribute" to "refactor(vep_parser): store consequence to impact score as a project config" Oct 3, 2024

@DSuveges DSuveges left a comment

Thank you for adjusting the code!

@DSuveges DSuveges merged commit c286c3b into dev Oct 3, 2024
5 checks passed
@DSuveges DSuveges deleted the il-csq_to_score_property branch October 3, 2024 13:44