Consistent disease queries for the hiPhive and OMIM prioritiser #337

Closed
damiansm opened this issue May 22, 2019 · 9 comments

Comments

@damiansm
Contributor

Prior to 12.0.0 we took all OMIM and Orphanet disease-gene associations in the diseases table, regardless of association type.

For the 1902_* dbs we cleaned up some of the Orphanet processing to fill in the association type consistently.

For 12.0.0 we modified the OMIMPrioritiser query to only use association type IN (disease and CNV), but for some reason left the hiPhive prioritiser selecting all types. This leads to some oddities: you can get a hiPhive hit to a disease that is not listed in the HTML of known diseases, and the OMIMPrioritiser behaves non-intuitively, e.g. the score gets halved for a correct AD hit because the AD disease has type=?.

We have benchmarked on GeL cases and some HGMD simulated cases and found that changing both to type=D,C,? improved precision but reduced recall in the top 5. Adding the S (susceptibility) type associations back in to both improved both recall and precision relative to 12.0.0, as GeL has a number of familial breast cancer diagnoses involving ATM etc.

Conclusion: push out a point release with type IN (D,C,?,S) for both hiPhive and OMIMPrioritiser queries
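To illustrate the idea of one shared filter for both prioritisers, here is a minimal sketch using Python's `sqlite3` module. The table name `disease`, the columns `disease_id`, `gene_id` and `type`, the helper `disease_associations` and all toy rows are assumptions made up for this example; they are not taken from the Exomiser schema or code.

```python
import sqlite3

# Association types named in this issue: D, C, ? and S (susceptibility).
# Keeping them in one place lets both queries share the same filter.
ALLOWED_TYPES = ("D", "C", "?", "S")


def disease_associations(conn, gene_id, allowed_types=ALLOWED_TYPES):
    """Return (disease_id, type) rows for a gene, restricted to allowed types.

    Hypothetical schema: disease(disease_id, gene_id, type).
    """
    # One bound parameter per allowed type, so the literal "?" type value
    # is passed as data and never spliced into the SQL text.
    placeholders = ", ".join("?" for _ in allowed_types)
    sql = (
        "SELECT disease_id, type FROM disease "
        f"WHERE gene_id = ? AND type IN ({placeholders}) "
        "ORDER BY disease_id"
    )
    return conn.execute(sql, (gene_id, *allowed_types)).fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE disease (disease_id TEXT, gene_id INTEGER, type TEXT)")
    conn.executemany(
        "INSERT INTO disease VALUES (?, ?, ?)",
        [
            ("OMIM:1", 1, "D"),  # made-up identifiers throughout
            ("OMIM:2", 1, "S"),
            ("OMIM:3", 1, "?"),
            ("OMIM:4", 1, "N"),  # made-up type that the filter should drop
        ],
    )
    return disease_associations(conn, 1)
```

Calling `demo()` returns the three rows whose type is in the allowed set and drops the fourth. Defining the allowed types once is one way to avoid the two queries drifting apart again.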

@pnrobinson

@damiansm Please note that there are multiple kinds of susceptibility and we do not want to get rid of all of them. I do not think our datasources cleanly distinguish between them -- consider using the HPO omit-list.txt file instead.

@damiansm
Contributor Author

damiansm commented May 22, 2019

@pnrobinson The proposal was to use all the OMIM/Orphanet susceptibility genes, as we always have, but I can see there are useful and not so useful ones, so I was not totally happy with that decision either. Did not know about omit-list.txt! Will take a look 👍

@damiansm
Contributor Author

@pnrobinson Where does the HPO omit-list.txt file live?

@julesjacobsen
Contributor

@pnrobinson where is the omit-list.txt? It doesn't directly impact this issue, which is purely about the code change to select from the database. The omit-list.txt change will need to go into a new data release, but this change will benefit from that when it happens.

@pnrobinson

The file lives with the rest of the small files in the HPOA repo. If you are using phenotype.hpoa then all of the diseases that needed to be omitted have been.

@damiansm
Contributor Author

damiansm commented May 22, 2019

@pnrobinson Ah - that is why I don't know about it, as I think it lives in https://github.com/monarch-initiative/hpo-annotation-data and I don't have access?

We use http://compbio.charite.de/jenkins/job/hpo.annotations/lastStableBuild/artifact/misc/phenotype_annotation.tab for Exomiser. I am presuming the same diseases have been omitted from this file but shout if not!

We do a join to our diseaseHp table, so if OMIM/Orphanet susceptibility genes are excluded from that file they will be correctly excluded from the hiPhive and OMIM prioritisers and all will be good.

So 205/526 OMIM susceptibility disease-gene associations in the disease table are excluded when joining to diseaseHp
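A count like the one above could be obtained along these lines. This is only a sketch: the `disease` and `diseaseHp` column names, the helper `count_excluded_by_join` and the toy rows are assumptions for illustration, not the actual Exomiser schema or the query that produced the 205/526 figure.

```python
import sqlite3


def count_excluded_by_join(conn, assoc_type="S"):
    """Return (excluded, total) for associations of one type.

    An association counts as excluded when its disease has no row in
    diseaseHp, because an inner join to that table would drop it.
    Hypothetical schema: disease(disease_id, gene_id, type),
    diseaseHp(disease_id, hp_id).
    """
    total = conn.execute(
        "SELECT COUNT(*) FROM disease WHERE type = ?", (assoc_type,)
    ).fetchone()[0]
    kept = conn.execute(
        "SELECT COUNT(*) FROM disease d "
        "WHERE d.type = ? AND EXISTS "
        "(SELECT 1 FROM diseaseHp h WHERE h.disease_id = d.disease_id)",
        (assoc_type,),
    ).fetchone()[0]
    return total - kept, total


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE disease (disease_id TEXT, gene_id INTEGER, type TEXT)")
    conn.execute("CREATE TABLE diseaseHp (disease_id TEXT, hp_id TEXT)")
    conn.executemany(
        "INSERT INTO disease VALUES (?, ?, ?)",
        [
            ("OMIM:1", 1, "S"),  # made-up identifiers throughout
            ("OMIM:2", 2, "S"),
            ("OMIM:3", 3, "S"),
            ("OMIM:4", 4, "D"),
        ],
    )
    # Only OMIM:1 and OMIM:4 have phenotype annotations in this toy data.
    conn.executemany(
        "INSERT INTO diseaseHp VALUES (?, ?)",
        [("OMIM:1", "HP:0000001"), ("OMIM:4", "HP:0000001")],
    )
    return count_excluded_by_join(conn, "S")
```

With the toy data, `demo()` reports that two of the three susceptibility associations have no phenotype annotation and would therefore be dropped by the join.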

@pnrobinson

We are transitioning to phenotype.hpoa, here

http://compbio.charite.de/jenkins/job/hpo.annotations.2018/

It was described in the NAR19 paper.

I will add you to the other repo, sorry

@damiansm
Contributor Author

damiansm commented May 22, 2019

Investigation revealed there are some 138 diseases in the https://github.com/monarch-initiative/hpo-annotation-data/blob/master/rare-diseases/annotated/omit-list.txt file that are still included in our database and in phenotype_annotation.tab.

The easiest solution will be to switch our db build to use http://compbio.charite.de/jenkins/job/hpo.annotations.2018/lastSuccessfulBuild/artifact/misc_2018/phenotype.hpoa, as this already has them excluded. Created a new issue for this: #351
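The overlap check behind that investigation can be sketched as a set intersection. The file layouts here are assumptions: the omit list is taken to carry a `DB:ID` disease identifier as the first tab-separated field of each line, and the annotation file is taken to hold the database name and the ID in its first two columns. The function names and the sample lines are hypothetical.

```python
def read_omit_ids(lines):
    """Collect disease IDs, assuming the ID is the first tab-separated field."""
    ids = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        ids.add(line.split("\t")[0])
    return ids


def read_annotated_ids(lines):
    """Collect 'DB:ID' keys, assuming columns 1 and 2 hold database and ID."""
    ids = set()
    for line in lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            ids.add(f"{fields[0]}:{fields[1]}")
    return ids


def omitted_but_annotated(omit_lines, annotation_lines):
    """Return sorted IDs that are on the omit list yet still annotated."""
    return sorted(read_omit_ids(omit_lines) & read_annotated_ids(annotation_lines))


def demo():
    # Made-up sample lines, not real entries from either file.
    omit_lines = ["# comment", "OMIM:100001\treason A", "OMIM:100002\treason B"]
    annotation_lines = ["OMIM\t100001\tDisease A", "OMIM\t100003\tDisease C"]
    return omitted_but_annotated(omit_lines, annotation_lines)
```

In practice the two line iterables would come from open file handles. A non-empty result means the annotation file still contains diseases that the omit list says should be dropped.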

@damiansm
Contributor Author

This was released in 12.0.1

your-highness pushed a commit to LaborBerlin/hum-exomiser-fork that referenced this issue Mar 18, 2021

3 participants