Synchronize cohorts with the NHGRI GWAS Catalog, when available #394
ens-lgil commented Oct 17, 2024
- During the data import
- During the release process (updating GWAS Catalog entries)
@@ -37,6 +35,26 @@ def get_gwas_info(self,sample):
    if response_data:
        try:
            source_PMID = response_data['publicationInfo']['pubmedId']

            # Create list of cohorts if it exists in the GWAS study
            # This override the Cohorts found previously in the cohort column in the spreadsheet
Can we get a warning/message if the spreadsheet cohorts are replaced? (including previous and new values)
Ah yes, we can add that
It will look like this:
# GCST00XXXX:
1 distinct ancestries
{'source_PMID': '1234567', 'sample_number': 23464, 'ancestry_broad': 'European', 'ancestry_country': 'U.K.'}
/!\ Replacing cohorts list:
- Old set: ACTS, LASA
+ New set: UKB
>> SCORE updated: PGS00XXXX
with the new bits:
/!\ Replacing cohorts list:
- Old set: ACTS, LASA
+ New set: UKB
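For illustration, a minimal sketch of how such a message could be built. The function name `format_cohort_replacement` and its arguments are hypothetical and not taken from the PR; it only reproduces the three-line block shown above and returns nothing when the cohort sets are identical.

```python
def format_cohort_replacement(old_cohorts, new_cohorts):
    """Return the warning block if the cohort sets differ, else None.

    Hypothetical helper: the real script may build this message differently.
    """
    old_set, new_set = set(old_cohorts), set(new_cohorts)
    if old_set == new_set:
        # Nothing is being replaced, so no warning is needed.
        return None
    return "\n".join([
        "/!\\ Replacing cohorts list:",
        "- Old set: " + ", ".join(sorted(old_set)),
        "+ New set: " + ", ".join(sorted(new_set)),
    ])


if __name__ == "__main__":
    message = format_cohort_replacement(["LASA", "ACTS"], ["UKB"])
    if message:
        print(message)
```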
It might be useful to know whether it's just additions or whether things are missing. It's more likely to be correct if it's adding more annotations, but if it's removing an annotation we should be careful.
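One possible way to make that distinction, sketched with plain set differences. The name `classify_cohort_change` and the three labels are assumptions for illustration, not part of the PR.

```python
def classify_cohort_change(old_cohorts, new_cohorts):
    """Classify a cohort update and list what was added or removed.

    Hypothetical helper. Returns (kind, added, removed), where kind is
    "removal" if any old cohort is missing from the new list,
    "addition_only" if cohorts were only added, and "unchanged" otherwise.
    """
    old_set, new_set = set(old_cohorts), set(new_cohorts)
    added = sorted(new_set - old_set)
    removed = sorted(old_set - new_set)
    if removed:
        kind = "removal"  # the case that deserves a closer look
    elif added:
        kind = "addition_only"
    else:
        kind = "unchanged"
    return kind, added, removed


if __name__ == "__main__":
    print(classify_cohort_change(["ACTS", "LASA"], ["UKB"]))
```

With the example from the log above, the old set `ACTS, LASA` replaced by `UKB` is classified as a removal, which is the situation flagged here as needing care.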
If it's using the GCST, shouldn't it match what is in the GWAS Catalog?
At the moment this script overwrites the Sample if sample_number, sample_cases and sample_controls are NULL, i.e. for GCSTs which were not released when the study was imported.
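The overwrite condition described here can be sketched as a simple check. A plain dict stands in for the Sample record, and `sample_is_unpopulated` is a hypothetical name; only the three field names come from the comment above.

```python
# Field names taken from the comment; the dict-based sample is an assumption.
COUNT_FIELDS = ("sample_number", "sample_cases", "sample_controls")


def sample_is_unpopulated(sample):
    """Return True when all three count fields are NULL (None).

    This is the case where the GCST was not yet released at import time,
    so the sample can be overwritten with GWAS Catalog data.
    """
    return all(sample.get(field) is None for field in COUNT_FIELDS)


if __name__ == "__main__":
    pending = {"sample_number": None, "sample_cases": None, "sample_controls": None}
    print(sample_is_unpopulated(pending))
```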
…he one already stored in the database
… from the GWAS study (fetched via the GWAS REST API)
…AS study) is merged with the list of cohorts from the GWAS Catalog REST API
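The last commit message mentions merging cohort lists rather than replacing them. A minimal sketch of such a merge, assuming cohorts are compared by their exact short name and that the existing order should be kept; `merge_cohorts` is a hypothetical name, and the actual matching rules in the PR may differ.

```python
def merge_cohorts(existing, from_gwas):
    """Merge two cohort lists, keeping first-seen order and dropping duplicates.

    Hypothetical helper: names are compared as exact strings.
    """
    merged = []
    seen = set()
    for name in list(existing) + list(from_gwas):
        if name not in seen:
            seen.add(name)
            merged.append(name)
    return merged


if __name__ == "__main__":
    print(merge_cohorts(["ACTS", "LASA"], ["UKB", "LASA"]))
```

Because a merge never drops an existing cohort, it avoids the removal case raised earlier in the review.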