feat(gwas catalog sumstats): finemapping #51
Merged
project-defiant
merged 16 commits into
dev
from
szsz-gwas-catalog-sumstat-locus-breaker
Oct 25, 2024
@addramir ignore the docs for now. I will update them as soon as the DAG run is successful.
d949832
to
dcde45e
DSuveges
reviewed
Oct 25, 2024
src/ot_orchestration/dags/config/gwas_catalog_sumstats_susie_clumping.yaml
Outdated
DSuveges
approved these changes
Oct 25, 2024
Thanks for addressing my comment. All looks good and sensible, however I have to admit, I haven't run them. :D
@DSuveges thank you!
Context
We want to perform locus breaker clumping and SuSiE fine-mapping on harmonised summary statistics coming from the GWAS Catalog.
Implementations
This PR implements locus_breaker_clumping and study_index generation for GWAS Catalog summary statistics.
Note
Locus Breaker Clumping performance
The performance of LB clumping was not ideal. The step took ~2h to compute the StudyLocus starting from 69K harmonised summary statistics. See dataproc job.
This is partially the result of the highly distributed dataset: see the first spike in nodes, which represents the first job listing all parquet files in subdirectories.
The number of loci resulting from clumping was ~440K.
Running the code from this branch, we were able to fine-map the 441k loci in 7h.
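To put the quoted figures in perspective, the rough throughput can be worked out as below. This is only back-of-the-envelope arithmetic on the rounded numbers stated above (69K summary statistics, ~2h, 441k loci, 7h); the variable and function names are illustrative and nothing here was measured separately.

```python
# Rounded figures as quoted in the PR description.
SUMSTATS = 69_000        # harmonised summary statistics ("69K")
CLUMPING_HOURS = 2       # locus breaker clumping took "~2h"
LOCI = 441_000           # loci produced by clumping ("441k")
FINEMAPPING_HOURS = 7    # fine-mapping took "7h"


def per_second(count: int, hours: float) -> float:
    """Average number of items processed per second over the given duration."""
    return count / (hours * 3600)


loci_per_sumstat = LOCI / SUMSTATS
clumping_rate = per_second(SUMSTATS, CLUMPING_HOURS)
finemapping_rate = per_second(LOCI, FINEMAPPING_HOURS)

print(f"loci per summary statistics file: {loci_per_sumstat:.1f}")
print(f"clumping: {clumping_rate:.1f} summary statistics files per second")
print(f"fine-mapping: {finemapping_rate:.1f} loci per second")
```

These are averages over the whole run, so they hide effects such as the initial file-listing job mentioned above.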
How the fine-mapping works:
This approach is not ideal due to the number of Google API calls we need to make when running list.objects (knowledge post mortem: see the distribution of the calls in the buckets on 23rd of October). The better solution would be to:
This could be implemented as an enhancement in the future.
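The cost pattern behind the list.objects concern can be illustrated with a toy, in-memory stand-in for a bucket. Everything in this sketch is hypothetical (the ToyBucket class, the path layout, the function names); it does not use the real Google Cloud Storage client, and it is not necessarily the improvement the author had in mind. It only shows that listing once per locus directory makes the number of calls grow with the number of loci, while one listing under a shared prefix followed by client-side grouping does not.

```python
from collections import defaultdict


class ToyBucket:
    """In-memory stand-in for an object store that counts listing calls."""

    def __init__(self, paths):
        self.paths = sorted(paths)
        self.list_calls = 0

    def list_objects(self, prefix):
        # Each call models one listing request (pagination is ignored here;
        # a real object store would return large listings in pages).
        self.list_calls += 1
        return [p for p in self.paths if p.startswith(prefix)]


def list_per_locus(bucket, locus_prefixes):
    """One listing call per locus directory."""
    return {prefix: bucket.list_objects(prefix) for prefix in locus_prefixes}


def list_once_and_group(bucket, root_prefix):
    """A single listing call, grouped by parent directory on the client."""
    grouped = defaultdict(list)
    for path in bucket.list_objects(root_prefix):
        locus_dir = path.rsplit("/", 1)[0] + "/"
        grouped[locus_dir].append(path)
    return dict(grouped)


# Hypothetical layout: 1000 locus directories with two parquet parts each.
prefixes = [f"study_locus/locus_{i}/" for i in range(1000)]
paths = [f"{p}part-{j}.parquet" for p in prefixes for j in range(2)]

per_locus_bucket = ToyBucket(paths)
per_locus_result = list_per_locus(per_locus_bucket, prefixes)

single_bucket = ToyBucket(paths)
grouped_result = list_once_and_group(single_bucket, "study_locus/")

print(per_locus_bucket.list_calls, single_bucket.list_calls)
```

Both strategies return the same mapping from locus directory to files; only the number of listing calls differs.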