Synchronize cohorts with the NHGRI GWAS Catalog, when available #394

Merged
merged 9 commits into from
Nov 7, 2024

ens-lgil
Member

  • During the data import
  • During the release process (updating GWAS Catalog entries)

@ens-lgil ens-lgil requested a review from fyvon October 17, 2024 11:16
@ens-lgil ens-lgil linked an issue Oct 17, 2024 that may be closed by this pull request
@@ -37,6 +35,26 @@ def get_gwas_info(self,sample):
if response_data:
try:
source_PMID = response_data['publicationInfo']['pubmedId']

# Create list of cohorts if it exists in the GWAS study
# This overrides the cohorts previously found in the cohort column of the spreadsheet
Member

Can we get a warning/message if the spreadsheet cohorts are replaced? (including previous and new values)

Member Author

Ah yes, we can add that

Member Author

It will look like this:

# GCST00XXXX:
	1 distinct ancestries
	{'source_PMID': '1234567', 'sample_number': 23464, 'ancestry_broad': 'European', 'ancestry_country': 'U.K.'}
	/!\ Replacing cohorts list:
	  - Old set: ACTS, LASA
	  + New set: UKB
	>> SCORE updated: PGS00XXXX

with the new bits:

/!\ Replacing cohorts list:
  - Old set: ACTS, LASA
  + New set: UKB
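
A minimal sketch of how such a message could be assembled, assuming the old and new cohorts are available as lists of cohort short names. The function name `format_cohort_replacement` is hypothetical and is not taken from the PR; only the output layout follows the example above.

```python
def format_cohort_replacement(old_cohorts, new_cohorts):
    """Return the warning lines, or an empty list when both sets are identical.

    Hypothetical helper: the names and signature are assumptions for illustration.
    """
    old_set, new_set = set(old_cohorts), set(new_cohorts)
    if old_set == new_set:
        # Nothing is replaced, so no warning is needed
        return []
    return [
        "/!\\ Replacing cohorts list:",
        "  - Old set: " + ", ".join(sorted(old_set)),
        "  + New set: " + ", ".join(sorted(new_set)),
    ]


for line in format_cohort_replacement(["ACTS", "LASA"], ["UKB"]):
    print(line)
```

Returning the lines instead of printing them directly keeps the helper easy to test and lets the caller decide on the indentation inside the per-study report.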

Member

It might be useful to know whether it's just additions or whether things are missing. It's more likely to be correct if it's adding more annotations, but if it's removing an annotation we should be careful.
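
One way to tell the two cases apart is a pair of set differences. This is only a sketch under the assumption that cohorts are compared by their short names; `classify_cohort_change` is a hypothetical name, not something from the PR.

```python
def classify_cohort_change(old_cohorts, new_cohorts):
    """Classify a cohort update and list what was added and what was removed.

    Hypothetical helper: cohorts are assumed to be plain short-name strings.
    """
    old_set, new_set = set(old_cohorts), set(new_cohorts)
    added = sorted(new_set - old_set)      # only in the new (GWAS Catalog) list
    removed = sorted(old_set - new_set)    # only in the old (spreadsheet) list
    if not added and not removed:
        kind = "unchanged"
    elif not removed:
        kind = "additions only"
    else:
        # At least one spreadsheet cohort would be dropped: worth a closer look
        kind = "removals"
    return kind, added, removed


print(classify_cohort_change(["ACTS", "LASA"], ["UKB"]))
print(classify_cohort_change(["UKB"], ["UKB", "LASA"]))
```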

Member Author

If it's using the GCST, it should match what is in the GWAS Catalog?
At the moment this script overwrites the Sample if sample_number, sample_cases and sample_controls are all NULL, i.e. for GCSTs which had not been released when the study was imported.
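
The overwrite condition described here can be sketched as follows. This is an illustration only: it assumes the sample is a plain dict in which NULL is represented as `None`, whereas the real script presumably works on its own sample objects, and `should_overwrite_sample` is a hypothetical name.

```python
# Field names come from the comment above; the dict representation is an assumption
COUNT_FIELDS = ("sample_number", "sample_cases", "sample_controls")


def should_overwrite_sample(sample):
    """Return True only when every count field is missing (None)."""
    return all(sample.get(field) is None for field in COUNT_FIELDS)


# A sample imported before its GCST was released: no counts yet
print(should_overwrite_sample(
    {"sample_number": None, "sample_cases": None, "sample_controls": None}
))
# A sample that already has a count is left alone
print(should_overwrite_sample(
    {"sample_number": 23464, "sample_cases": None, "sample_controls": None}
))
```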

@smlmbrt smlmbrt changed the title Use the cohorts from the NHGRI GWAS Catalog, when available Sunchronize cohorts with the NHGRI GWAS Catalog, when available Oct 25, 2024
@smlmbrt smlmbrt changed the title Sunchronize cohorts with the NHGRI GWAS Catalog, when available Synchronize cohorts with the NHGRI GWAS Catalog, when available Oct 25, 2024
@fyvon fyvon merged commit 6e102aa into PGScatalog:master Nov 7, 2024
Development

Successfully merging this pull request may close these issues.

Import GWAS Catalog Cohort information