How to handle multi-genome build in gtc2vcf? #348

rajwanir · 2024-10-18T16:50:34Z

For most newer chips (GSA, confluence, and even some Omni series), Illumina provides manifests on a single genome build, so you wouldn't observe this issue. For older or custom chips (such as Consortium-OncoArray), however, you would likely run into it. In my view, lifting coordinates over to a single genome build at this stage (this could even be hg19) is the most straightforward solution, because it keeps everything downstream simple and clear. Even if you don't lift over at this stage, the affected markers would have to be either lifted over or filtered out at imputation or in other downstream work.

For example, for Consortium-OncoArray, here is the distribution of markers per genome build:

genomeBuild	markers
36	1191
36.2	1115
37	8718
37.1	522039
37.2	84
no build	484
total	533631
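
A tally like the one above can be produced directly from the manifest CSV. The following is a minimal sketch, not code from the workflow: it assumes the manifest has an `[Assay]` section whose header row contains a `GenomeBuild` column, and the function name `count_genome_builds` and the tiny example manifest are made up for illustration.

```python
import csv
import io
from collections import Counter


def count_genome_builds(manifest_text):
    """Tally markers per GenomeBuild value in the [Assay] section (hypothetical helper)."""
    lines = manifest_text.splitlines()
    # Locate the [Assay] section; the row right after it is the column header.
    start = next(i for i, line in enumerate(lines) if line.startswith("[Assay]")) + 1
    # The section ends at the next bracketed header (e.g. [Controls]) or at end of file.
    end = next(
        (i for i in range(start, len(lines)) if lines[i].startswith("[")),
        len(lines),
    )
    reader = csv.DictReader(io.StringIO("\n".join(lines[start:end])))
    # Empty GenomeBuild cells are reported as "no build".
    return Counter(
        (row.get("GenomeBuild") or "").strip() or "no build" for row in reader
    )


# Made-up manifest content, for illustration only.
example = """[Heading]
Descriptor File Name,example.bpm
[Assay]
IlmnID,Name,GenomeBuild,Chr,MapInfo
m1-0_T_F_1,m1,37.1,1,1000
m2-0_T_F_2,m2,37.1,1,2000
m3-0_B_R_3,m3,36,2,3000
m4-0_T_F_4,m4,,0,0
[Controls]
0001,Staining,Red,DNP (High)
"""

builds = count_genome_builds(example)
for build, n in sorted(builds.items()):
    print(build, n, sep="\t")
```

More than one distinct build in the result is the sign that the manifest is multi-genome.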

Workaround:
A possible option for handling such manifests in the gtc-to-bcf workflow is to error out at config validation and let the user know that it is a multi-genome manifest, so they should either lift over the manifest or use the gtc-to-ped workflow.

The load_config() will throw exceptions if illumina_csv_bpm is old (i.e. missing columns or multi genome build). Address #348
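
The kind of check described here could look roughly like the sketch below. This is not the code from the referenced commit: `ManifestConfigError`, `REQUIRED_COLUMNS`, and `validate_manifest` are hypothetical names, the required column set is an assumption, and so is the choice to treat minor versions (37, 37.1, 37.2) as the same build and to ignore markers with no build.

```python
class ManifestConfigError(ValueError):
    """Raised when a manifest cannot be used by the gtc-to-bcf path (hypothetical)."""


# Assumed set of columns the workflow needs; adjust to the real manifest layout.
REQUIRED_COLUMNS = {"Name", "GenomeBuild", "Chr", "MapInfo"}


def major_build(value):
    """Reduce '37.1' to '37' so minor versions count as one build (an assumption)."""
    return value.strip().split(".")[0]


def validate_manifest(header, build_values):
    """Fail early on missing columns or on markers from more than one genome build."""
    missing = REQUIRED_COLUMNS - set(header)
    if missing:
        raise ManifestConfigError(
            "illumina_csv_bpm is missing columns: " + ", ".join(sorted(missing))
        )
    # Markers with an empty build are ignored here (an assumption).
    majors = {major_build(v) for v in build_values if v.strip()}
    if len(majors) > 1:
        raise ManifestConfigError(
            "illumina_csv_bpm is a multi-genome manifest (builds: "
            + ", ".join(sorted(majors))
            + "); lift over the manifest or use the gtc-to-ped workflow"
        )


header = ["IlmnID", "Name", "GenomeBuild", "Chr", "MapInfo"]

# Single major build: passes silently.
validate_manifest(header, ["37", "37.1", "37.2", ""])

# Mixed builds: rejected with an actionable message.
try:
    validate_manifest(header, ["36", "36.2", "37.1"])
except ManifestConfigError as err:
    print(err)
```

Failing at config load keeps the error close to its cause, rather than surfacing later as mismatched coordinates in the BCF.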

rajwanir self-assigned this Oct 18, 2024

rajwanir pushed a commit that referenced this issue Oct 31, 2024

Adds config validation checks for illumina_csv_bpm

271ff30

