Running multiple traits #11
Comments
Hi Sander, Yes, the current version of HDL only supports testing two traits at a time. I do have some code that saves computational resources, but I have not yet cleaned it up and made it ready for users. Thanks for your interest in HDL! Best regards,
Hi, Thanks! Sander
Hi Sander, Thanks for the advice! I will try to tidy up the code for multiple traits and write instructions in the Wiki. Parallelization is certainly needed, but running HDL for multiple traits takes much less time and memory with the (hopefully) smarter code. To be more specific, loading the eigens of the reference panel takes a lot of time and memory in HDL. If we want to estimate rg between 10 traits (45 pairs), we only need to do that loading step once. But if we run a normal HDL 45 times, we repeat that step 45 times, which is a huge waste. It is tricky to pack the code into a package because the need for parallelization differs between steps. So I plan to share the code first, and then we will see whether it can be improved :). Best,
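To illustrate the saving described above, here is a minimal Python sketch of the load-once pattern. It is not the HDL code: `load_reference_panel` and `estimate_rg` are hypothetical placeholders, and the sketch only counts how often the expensive loading step would run for 10 traits.

```python
from itertools import combinations


def load_reference_panel(counter):
    """Hypothetical stand-in for the expensive loading of reference-panel eigens."""
    counter["loads"] += 1
    return {"eigens": "placeholder"}


def estimate_rg(panel, trait_a, trait_b):
    """Hypothetical placeholder; a real analysis would run the HDL estimation here."""
    return None


traits = [f"trait{i}" for i in range(1, 11)]  # 10 traits
pairs = list(combinations(traits, 2))         # every unordered pair of traits

# Naive approach: one independent run per pair, reloading the panel each time.
naive = {"loads": 0}
for a, b in pairs:
    panel = load_reference_panel(naive)
    estimate_rg(panel, a, b)

# Shared approach: load the panel once and reuse it for every pair.
shared = {"loads": 0}
panel = load_reference_panel(shared)
results = {(a, b): estimate_rg(panel, a, b) for a, b in pairs}

print(len(pairs), naive["loads"], shared["loads"])
```

The pairwise loop in the shared approach is the part that could still be parallelized, with the loaded panel passed to each worker.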
Hi Zheng, Thanks for all of your hard work on HDL. Picking up on this topic again - if one isn't so much interested in the genetic correlation matrix (10 traits = 45 pairs as you describe above; which is challenging to implement in ldsc too) and instead "just" wants to compute the correlations between one trait and multiple others (for example, coronary artery disease and HDL cholesterol, LDL cholesterol, systolic blood pressure [...]) is there an easier way to implement this than using an array for the second trait which would also require the reloading of the reference panel eigens repeatedly? I'm thinking something akin to that used in ldsc when you specify --rg CAD,HDL,LDL,SBP and get a summary printed near the bottom of the output file along the lines of:
(n.b. no HDL - LDL, HDL - SBP, LDL - SBP rows; just genetic correlations between the first trait listed and all those after it) This would make it much more tractable to perform high-throughput genetic correlation analyses for a single trait, and is perhaps a little more in line with the analyses an investigator might typically wish to perform than a full matrix operation each time (i.e., carry out a GWAS of trait A and then look for correlations with traits B-Z [estimating 25 genetic correlations], rather than an Ai,i matrix, which would represent 325 genetic correlations for A-Z pairings if continuing our example, 300 of which may not be informative or required to tackle the question at hand). All the best, Steven
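As a quick check of the counts in the example above, the following Python sketch enumerates the one-versus-many pairs and the full set of pairs for 26 traits labelled A to Z. The trait labels are illustrative only, and nothing here calls HDL or ldsc.

```python
from itertools import combinations
from string import ascii_uppercase

traits = list(ascii_uppercase)  # illustrative labels "A" .. "Z"
focal, others = traits[0], traits[1:]

# One-versus-many: the focal trait paired with each of the remaining traits.
one_vs_many = [(focal, t) for t in others]

# Full matrix: every unordered pair of traits.
full_matrix = list(combinations(traits, 2))

# Pairs that do not involve the focal trait at all.
not_needed = [p for p in full_matrix if focal not in p]

print(len(one_vs_many), len(full_matrix), len(not_needed))
```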
Hi,
We wanted to run HDL using multiple traits (19 to be exact). Just checking, based on the instructions and the code: is it correct that one can only test two traits at a time?
Do you have a smart way implemented to test 19 traits?
Thanks
Sander