Consistent disease queries for the hiPhive and OMIM prioritiser #337
Comments
@damiansm Please note that there are multiple kinds of susceptibility and we do not want to get rid of all of them. I do not think our datasources cleanly distinguish between them -- consider using the HPO omit-list.txt file instead.
@pnrobinson The proposal was to use all the OMIM/Orphanet susceptibility genes, as we always have, but I can see there are useful and not-so-useful ones, so I was not totally happy with that decision either. I did not know about omit-list.txt! Will take a look 👍
@pnrobinson Where does the HPO omit-list.txt file live?
@pnrobinson Where is the omit-list.txt? It doesn't directly impact this issue, which is purely about the code change to select from the database. The omit-list.txt change will need to go into a new data release, but this change will benefit from that when it happens.
The file lives with the rest of the small files in the HPOA repo. If you are using phenotype.hpoa then all of the diseases that needed to be omitted have been.
@pnrobinson Ah, that is why I don't know about it: I think it lives in https://github.com/monarch-initiative/hpo-annotation-data and I don't have access? We use http://compbio.charite.de/jenkins/job/hpo.annotations/lastStableBuild/artifact/misc/phenotype_annotation.tab for Exomiser. I am presuming the same diseases have been omitted from this file, but shout if not! We join the disease table to our diseaseHp table, so if OMIM/Orphanet susceptibility genes are excluded from that file they will be correctly excluded from the hiPhive and OMIM prioritisers, and all will be good. As it stands, 205/526 OMIM susceptibility disease-gene associations in the disease table are excluded when joining to diseaseHp.
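To illustrate why the join matters, here is a minimal sketch of how an inner join to a phenotype-annotation table drops disease-gene associations that have no annotations. The table and column names (`disease`, `disease_hp`, `disease_id`, `type`) and the identifiers in the demo are assumptions made up for illustration, not the actual Exomiser schema or data.

```python
import sqlite3


def count_susceptibility(conn):
    """Return (total, kept) counts of susceptibility associations.

    'kept' counts only associations whose disease also has a row in the
    hypothetical disease_hp table, which is what an inner join retains.
    """
    total = conn.execute(
        "SELECT COUNT(*) FROM disease WHERE type = 'S'"
    ).fetchone()[0]
    kept = conn.execute(
        "SELECT COUNT(*) FROM disease d "
        "INNER JOIN disease_hp h ON h.disease_id = d.disease_id "
        "WHERE d.type = 'S'"
    ).fetchone()[0]
    return total, kept


def demo():
    # All identifiers below are invented placeholders.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE disease (disease_id TEXT, gene_symbol TEXT, type TEXT)")
    conn.execute("CREATE TABLE disease_hp (disease_id TEXT PRIMARY KEY, hp_ids TEXT)")
    conn.executemany(
        "INSERT INTO disease VALUES (?, ?, ?)",
        [
            ("OMIM:1", "GENE_A", "S"),
            ("OMIM:2", "GENE_B", "S"),
            ("OMIM:3", "GENE_C", "S"),
        ],
    )
    # OMIM:3 has no phenotype annotations, e.g. because it was omitted upstream.
    conn.executemany(
        "INSERT INTO disease_hp VALUES (?, ?)",
        [("OMIM:1", "HP:0000001"), ("OMIM:2", "HP:0000002")],
    )
    counts = count_susceptibility(conn)
    conn.close()
    return counts
```

In this toy data one of the three susceptibility associations is dropped by the join, which is the same mechanism behind the 205/526 figure above.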
We are transitioning to phenotype.hpoa, here: http://compbio.charite.de/jenkins/job/hpo.annotations.2018/. It was described in the NAR19 paper. I will add you to the other repo, sorry.
Investigation revealed that some 138 diseases in the https://github.com/monarch-initiative/hpo-annotation-data/blob/master/rare-diseases/annotated/omit-list.txt file are included in our database and in phenotype_annotation.tab. The easiest solution will be to switch our db build to use http://compbio.charite.de/jenkins/job/hpo.annotations.2018/lastSuccessfulBuild/artifact/misc_2018/phenotype.hpoa, which already has them excluded. Created a new issue for this: #351
This was released in 12.0.1
Prior to 12.0.0 we took all disease-gene associations in the diseases table from OMIM or Orphanet, regardless of association type.
For the 1902_* dbs we cleaned up some of the Orphanet processing to fill in the association type consistently.
For 12.0.0 we modified the OMIMPrioritiser query to only use association type IN (disease and CNV), but for some reason left the hiPhive prioritiser selecting all types. This leads to some oddities: you can get a hiPhive hit to a disease that is not listed in the HTML of known diseases, and the OMIMPrioritiser behaves non-intuitively, e.g. the score gets halved for a correct AD hit because the AD disease is type=?.
We have benchmarked on GeL cases and some HGMD simulated cases and found that changing both to type=D,C,? improved precision but reduced recall in the top 5. Adding the S (susceptibility) type associations back in to both improved both recall and precision relative to 12.0.0, as GeL has a number of familial breast cancer diagnoses involving ATM etc.
Conclusion: push out a point release with type IN (D,C,?,S) for both hiPhive and OMIMPrioritiser queries
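One way to keep the two queries from drifting apart again is to build both from a single shared list of allowed association types. The sketch below shows the idea in Python with an in-memory SQLite database. The table name, column names, the `N` type code and the demo rows are all hypothetical; the real prioritisers are part of Exomiser and their actual queries may look different.

```python
import sqlite3

# Association-type codes as abbreviated in this issue:
# D = disease, C = CNV, ? = unknown, S = susceptibility.
ALLOWED_TYPES = ("D", "C", "?", "S")


def build_disease_query(allowed_types=ALLOWED_TYPES):
    """Build one parameterised query string for both prioritisers to share."""
    placeholders = ", ".join("?" for _ in allowed_types)
    return (
        "SELECT disease_id, gene_symbol, type FROM disease "
        f"WHERE type IN ({placeholders}) ORDER BY disease_id"
    )


def fetch_associations(conn, allowed_types=ALLOWED_TYPES):
    """Run the shared query; type codes are bound as parameters."""
    return conn.execute(build_disease_query(allowed_types), allowed_types).fetchall()


def demo():
    # Hypothetical schema and invented rows, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE disease (disease_id TEXT, gene_symbol TEXT, type TEXT)")
    conn.executemany(
        "INSERT INTO disease VALUES (?, ?, ?)",
        [
            ("OMIM:1", "GENE_A", "D"),
            ("OMIM:2", "GENE_B", "S"),
            ("OMIM:3", "GENE_C", "?"),
            ("OMIM:4", "GENE_D", "N"),  # 'N' stands in for any type not in the list
        ],
    )
    omim_rows = fetch_associations(conn)     # what an OMIM-style query would see
    hiphive_rows = fetch_associations(conn)  # what a hiPhive-style query would see
    conn.close()
    return omim_rows, hiphive_rows
```

Because both callers go through `fetch_associations`, a future change to the allowed types applies to both at once.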