How to handle multi-genome build in gtc2vcf? #348

Open
rajwanir opened this issue Oct 18, 2024 · 0 comments

For most newer chips (GSA, confluence, and even some of the Omni series), Illumina provides manifests on a single genome build, so you would not see this issue. For older or custom chips (such as Consortium-OncoArray), you are likely to run into it. In my view, lifting the coordinates over to a single genome build at this stage (this could even be hg19) is the most straightforward solution, because it keeps everything downstream simple and clear. Even if you don't lift over at this stage, the markers on other builds will have to be either lifted over or filtered out at imputation or in other downstream work.

For example, here is the distribution of markers per genome build for Consortium-OncoArray:

genomeBuild markers
36 1191
36.2 1115
37 8718
37.1 522039
37.2 84
no build 484
total 533631
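
A tally like the one above could be produced with something along these lines. This is a minimal sketch, not code from the repository: it assumes the manifest is an Illumina CSV with an `[Assay]` section whose header row contains a `GenomeBuild` column, and the function name `count_markers_per_build`, the sample rows, and the "no build" label for empty values are all made up for illustration.

```python
import csv
import io
from collections import Counter


def count_markers_per_build(manifest_file):
    """Count [Assay] rows per GenomeBuild value in an Illumina manifest CSV.

    Hypothetical helper: assumes an [Assay] section followed by a header
    row that includes a 'GenomeBuild' column.
    """
    reader = csv.reader(manifest_file)
    # Skip the heading lines until the [Assay] section marker.
    for row in reader:
        if row and row[0].strip() == "[Assay]":
            break
    else:
        raise ValueError("no [Assay] section found")
    header = next(reader)
    build_idx = header.index("GenomeBuild")  # ValueError if the column is missing
    counts = Counter()
    for row in reader:
        if not row:
            continue
        if row[0].startswith("["):  # next section, e.g. [Controls]
            break
        build = row[build_idx].strip() if len(row) > build_idx else ""
        counts[build or "no build"] += 1  # empty value reported as "no build"
    return counts


# Made-up rows for illustration only; a real manifest has many more columns.
SAMPLE = """Illumina, Inc.
[Heading]
Descriptor File Name,example.bpm
[Assay]
IlmnID,Name,GenomeBuild,Chr,MapInfo
rs1-0_T_F_1,rs1,37.1,1,1000
rs2-0_T_F_2,rs2,37.1,1,2000
rs3-0_T_F_3,rs3,36,2,3000
rs4-0_T_F_4,rs4,,0,0
[Controls]
0001,Staining,Red
"""

if __name__ == "__main__":
    for build, n in sorted(count_markers_per_build(io.StringIO(SAMPLE)).items()):
        print(build, n)
```

For a real manifest, the same function could be called on a file opened with `open(path, newline="")`.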

Workaround:
One option for the gtc-to-bcf workflow is to error out at config validation for such manifests and tell the user that this is a multi-genome manifest, so they should either lift over the manifest or use the gtc-to-ped workflow.
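
The validation idea can be sketched roughly as below. This is an illustration under stated assumptions, not the check that was actually committed: `MultiGenomeManifestError`, `major_build`, and `validate_single_build` are hypothetical names, the input is assumed to be a mapping of genome build label to marker count, and treating minor versions such as 37, 37.1 and 37.2 as the same build is my assumption.

```python
class MultiGenomeManifestError(ValueError):
    """Hypothetical error for manifests that mix major genome builds."""


def major_build(build):
    """Return the major part of a build label, e.g. '37.1' -> '37'."""
    return build.strip().split(".")[0]


def validate_single_build(build_counts):
    """Raise if markers span more than one major genome build.

    build_counts maps a build label to its marker count. Labels that are
    not numeric (such as 'no build') are ignored for this check.
    Returns the single major build, or None if no marker has a build.
    """
    majors = set()
    for build in build_counts:
        major = major_build(build)
        if major.isdigit():
            majors.add(major)
    if len(majors) > 1:
        raise MultiGenomeManifestError(
            "multi-genome manifest (builds: "
            + ", ".join(sorted(majors))
            + "); lift over the manifest to a single build "
            "or use the gtc-to-ped workflow"
        )
    return majors.pop() if majors else None


if __name__ == "__main__":
    # Marker counts reported in this issue for Consortium-OncoArray.
    oncoarray = {
        "36": 1191, "36.2": 1115, "37": 8718,
        "37.1": 522039, "37.2": 84, "no build": 484,
    }
    try:
        validate_single_build(oncoarray)
    except MultiGenomeManifestError as err:
        print("config validation failed:", err)
```

Failing early like this keeps the decision with the user: the workflow does not silently drop or lift over markers on its own.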

@rajwanir rajwanir self-assigned this Oct 18, 2024
rajwanir pushed a commit that referenced this issue Oct 31, 2024
load_config() now throws an exception if illumina_csv_bpm is old (i.e. missing columns or multi-genome build).
Address #348