bcftools norm --multiallelics -any v1.12 does not split FORMAT/AD correctly #1499

Open
priesgo opened this issue May 30, 2021 · 1 comment

priesgo commented May 30, 2021

I hope this is not me misusing bcftools, but there seems to be a difference in behavior between bcftools norm v1.9 and v1.12 (I did not check intermediate versions). Given the following multiallelic input record:

chr1	13324	.	C	G,T	.	MULTIALLELIC		GT:AD	0:229,1,1	1/2:196,24,1

With bcftools v1.12, bcftools norm --multiallelics -any --old-rec-tag OLD_VARIANT produces:

chr1	13324	.	C	G	.	MULTIALLELIC	OLD_VARIANT=chr1|13324|C|G,|1	GT:AD	0:229,229	1/0:196,196
chr1	13324	.	C	T	.	MULTIALLELIC	OLD_VARIANT=chr1|13324|C|G,|2	GT:AD	0:229,229	0/1:196,196

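Note that in every split record both AD values are copies of the reference allele depth (229,229 and 196,196), so the alternate allele depths are lost. Also, the value stored in INFO/OLD_VARIANT lists only the G alternate in both records (C|G,), with T missing.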

With bcftools v1.9, bcftools norm --multiallelics -any produces:

chr1	13324	.	C	G	.	MULTIALLELIC	.	GT:AD	0:229,1	1/0:196,24
chr1	13324	.	C	T	.	MULTIALLELIC	.	GT:AD	0:229,1	0/1:196,1
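
The v1.9 output above is consistent with treating AD as a per-allele field (one value for the reference plus one per alternate): each split record keeps the reference depth and the depth of its own alternate, and GT is recoded so that only that alternate counts as 1. Below is a minimal Python sketch of that expected behavior. It is not bcftools' actual implementation, and the helper names `split_gt`, `split_ad`, and `split_sample` are hypothetical.

```python
def split_gt(gt, alt_index):
    """Recode a genotype for one alternate: that allele becomes 1, others 0."""
    sep = "|" if "|" in gt else "/"
    recoded = []
    for allele in gt.replace("|", "/").split("/"):
        if allele == ".":
            recoded.append(".")  # keep missing alleles as missing
        else:
            recoded.append("1" if int(allele) == alt_index else "0")
    return sep.join(recoded)


def split_ad(ad, alt_index):
    """Keep the reference depth and the depth of the 1-based alternate alt_index."""
    return [ad[0], ad[alt_index]]


def split_sample(gt, ad, alt_index):
    """Build the GT:AD column of one sample for one split record."""
    depths = ",".join(str(d) for d in split_ad(ad, alt_index))
    return split_gt(gt, alt_index) + ":" + depths


# The two samples from the input record in this report (ALT = G,T).
samples = [("0", [229, 1, 1]), ("1/2", [196, 24, 1])]
for alt_index, alt in enumerate(["G", "T"], start=1):
    columns = [split_sample(gt, ad, alt_index) for gt, ad in samples]
    print(alt, *columns, sep="\t")
```

Run on the two samples from the input record, this prints the same GT:AD columns as the v1.9 output.
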
pd3 added a commit that referenced this issue Jun 9, 2021

pd3 commented Jun 9, 2021

Can you please check whether the commit I just pushed fixes the issue? I could reproduce the problem only partially: the commit fixes the malformed INFO tag, but I was not able to reproduce the incorrect FORMAT/AD values.
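
When re-testing a build, the two symptoms from the report can be checked per output line with a small script. This is only a rough sketch: the function name `check_split_record` is hypothetical, the old-record tag is assumed to have the five pipe-separated fields seen in the report, and the AD test is a heuristic (a site where every sample really has equal reference and alternate depths would be flagged too).

```python
def check_split_record(line, old_tag="OLD_VARIANT"):
    """Return the symptoms from this report found in one split VCF data line."""
    fields = line.rstrip("\n").split("\t")
    info, fmt, samples = fields[7], fields[8].split(":"), fields[9:]
    problems = []

    # Symptom 1: an empty allele in the old-record tag, e.g. "C|G,|1".
    # Assumes the tag looks like chrom|pos|ref|alt1,alt2|index, as in the report.
    for entry in info.split(";"):
        if entry.startswith(old_tag + "="):
            parts = entry.split("=", 1)[1].split("|")
            if len(parts) < 5 or "" in parts[3].split(","):
                problems.append("malformed " + old_tag)

    # Symptom 2 (heuristic): every sample repeats one depth for all AD values.
    if "AD" in fmt:
        i = fmt.index("AD")
        depths = [s.split(":")[i].split(",") for s in samples]
        if depths and all(len(d) > 1 and len(set(d)) == 1 for d in depths):
            problems.append("AD alt equals ref in every sample")

    return problems


# First split record from the v1.12 and v1.9 outputs quoted in the report.
v112 = "\t".join(["chr1", "13324", ".", "C", "G", ".", "MULTIALLELIC",
                  "OLD_VARIANT=chr1|13324|C|G,|1", "GT:AD",
                  "0:229,229", "1/0:196,196"])
v19 = "\t".join(["chr1", "13324", ".", "C", "G", ".", "MULTIALLELIC",
                 ".", "GT:AD", "0:229,1", "1/0:196,24"])

print(check_split_record(v112))  # both symptoms reported
print(check_split_record(v19))   # empty list
```
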
