Introduce an ELSE statement to deal with edge cases #27

Closed
ChristopheLegendre opened this issue Jan 25, 2023 · 3 comments

Comments

@ChristopheLegendre
Collaborator

ChristopheLegendre commented Jan 25, 2023

if [[ ${C} -eq 0 ]]
then
    # No heterozygous calls: nothing to phase, so skip phASER.
    echo "No Hets in VCF, so no phasing to perform; skipping phASER" 1>&2
    echo "Making the expected VCF filename as if phASER had run:"
    # Log the exact command that is about to run (including the redirect).
    echo "zcat \"${VCF_ORIGINAL_INPUT}\" > \"${VCF_ORIGINAL_INPUT/.vcf.gz/.blocs.vcf}\"" 1>&2
    zcat "${VCF_ORIGINAL_INPUT}" > "${VCF_ORIGINAL_INPUT/.vcf.gz/.blocs.vcf}"
    echo "${VCF_ORIGINAL_INPUT/.vcf.gz/.blocs.vcf}" 2>&1
    echo "Exiting $0 without having run phASER. ev = 0" 1>&2
    exit 0
fi

Example of edge case:

#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	NORMAL	TUMOR
chr14	50978766	.	G	T	.	PASS	SOMATIC;QSS=60;TQSS=1;NT=ref;QSS_NT=60;TQSS_NT=1;SGT=GG->GT;DP=115;MQ=59.97;MQ0=0;ReadPosRankSum=1.22;SNVSB=0;SomaticEVS=7.38	GT:DP:FDP:SDP:SUBDP:AU:CU:GU:TU:AR:AD	0/0:21:0:0:0:0,0:0,0:21,21:0,0:0:21,0	0/1:94:6:0:0:0,0:0,0:81,87:7,7:0.0795:81,7
chr14	50978767	.	T	TG	.	PASS	SOMATIC;QSI=47;TQSI=1;NT=ref;QSI_NT=47;TQSI_NT=1;SGT=ref->het;MQ=59.96;MQ0=0;RU=G;RC=0;IC=1;IHP=18;SomaticEVS=6.74	GT:DP:DP2:TAR:TIR:TOR:DP50:FDP50:SUBDP50:BCN50:AR:AD	0/0:19:19:20,20:0,0:0,0:25.1:0.58:0:0:0:20,0	0/1:90:90:71,73:14,14:9,7:100.19:4.93:0:0.04:0.1647:71,14

This is an SNV immediately followed by an indel.
phASER excludes the indel, so only the SNV remains, and it is no longer part of a block. phASER therefore fails.

We should also think about any other edge cases where phASER could fail because a single orphan line is left once the indels are excluded.

@PedalheadPHX
Member

Do we have a picture of the actual event? Is the indel in phase or not? If we are not able to phase indels, shouldn't the preprocessing step drop all indel lines before looking for potential block variants (two consecutive VCF lines one bp apart) to send to phASER? In short: if indels are a known phASER limitation, why are they not prefiltered?
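
The check described above could be sketched roughly as follows. This is only an illustration, not the pipeline's actual preprocessing: the function name `count_adjacent_snvs` is hypothetical, "block candidate" is taken to mean two consecutive single-base records on the same chromosome exactly one bp apart, and multi-allelic records are skipped for simplicity.

```shell
# Hypothetical helper: count pairs of consecutive SNV records that sit
# exactly 1 bp apart on the same chromosome, ignoring everything else.
# Reads an uncompressed VCF on stdin and prints the number of pairs.
count_adjacent_snvs() {
  awk -F'\t' '
    /^#/ { next }                                   # skip header lines
    length($4) != 1 || length($5) != 1 { next }     # skip indels and other non-simple-SNV records
    $1 == prev_chrom && $2 == prev_pos + 1 { pairs++ }
    { prev_chrom = $1; prev_pos = $2 }              # remember the last SNV seen
    END { print pairs + 0 }                         # print 0 when no pair was found
  '
}
```

For the two example records in this issue, the indel at 50978767 is skipped, so the SNV at 50978766 has no neighbour and the count is 0, meaning there would be nothing to send to phASER.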

@ChristopheLegendre
Collaborator Author

We do not phase indels.
phASER handles indels by excluding them itself, which is why they were not prefiltered: letting phASER do the work saved us from adding more code and creating more intermediate files.

This edge case shows that we cannot let phASER do the work, because phASER does not correctly handle the orphan line left behind when two consecutive lines are an SNV followed by an indel.

The solution is to prefilter the indels out. We have been working on it and have implemented it.
Testing of the modifications is in progress.
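
For illustration, an indel prefilter could look something like the sketch below. It is not the code that was actually implemented: the function name `drop_indels` is hypothetical, and the rule used (keep only records whose REF and every ALT allele are a single character) is a simplifying assumption that also drops MNPs and symbolic alleles. `bcftools view --exclude-types indels` would be an alternative if bcftools is available in the environment.

```shell
# Hypothetical helper: print a VCF with all indel records removed.
# Reads an uncompressed VCF on stdin, writes the filtered VCF to stdout.
drop_indels() {
  awk -F'\t' '
    /^#/ { print; next }                 # keep every header line
    {
      if (length($4) != 1) next          # REF longer than one base: not an SNV
      n = split($5, alts, ",")
      for (i = 1; i <= n; i++)
        if (length(alts[i]) != 1) next   # any multi-character ALT allele: drop the record
      print
    }
  '
}
```

It could be used as `zcat "${VCF_ORIGINAL_INPUT}" | drop_indels > snvs_only.vcf`, where `snvs_only.vcf` is a made-up intermediate filename.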

@ChristopheLegendre
Collaborator Author

Indel prefiltering has been implemented.


2 participants