Amplicon variant calling - empty file #191

Open
SheepwormJM opened this issue Oct 9, 2024 · 0 comments

SheepwormJM commented Oct 9, 2024

Hi,

I am getting an empty output file using vardict-java in amplicon mode, and I am not sure why.

I have amplicons which I have already cleaned (removed adapters, demultiplexed by inline barcodes and removed primers) and run through DADA2 to identify the 'correct' ASVs. I then used PEAR on the primer-free reads to error-correct them and produce consensus reads, and kept only those whose sequences exactly match the ASV haplotypes output by DADA2.

Then, I've aligned to my reference using bwa mem and produced a sorted bam file. This definitely has aligned reads in it.

This is my vardict-java command (I installed version 1.8.3 with conda: `conda install -c conda-forge -c bioconda vardict-java`):

```
REFERENCE_GENOME=Mygenome.fa
SAMPLE_NAME=Mysample_F1R1
BAM_FILE=Mysample.sorted.bam
MAPQ=30
BASEQ=30

/users/jmi45g/project0005/conda/envs/vardictjava/bin/vardict-java -G ${REFERENCE_GENOME} -N ${SAMPLE_NAME} -b ${BAM_FILE} -Q ${MAPQ} -q ${BASEQ} -a 10:0.95 -F 0 -h -c 1 -S 2 -E 3 -s 7 -e 8 -g 4 LOCI.bed
```

I've tried removing the minimum base quality requirement (-q) in case the base qualities were missing or abnormal after PEAR.

I've also tried altering the amplicon option based on this information from the manual:

"-a|--amplicon int:float Indicate it is amplicon based calling. Reads do not map to the amplicon will be skipped. A read pair is considered belonging the amplicon if the edges are less than int bp to the amplicon, and overlap fraction is at least float. Default: 10:0.95"

If I make the 'int bp' value larger (e.g. 40 or 100) then it says it is too big, presumably because my bed file columns 7 and 8 (amplicon coordinates without primers) and columns 2 and 3 (full amplicon coordinates) then no longer match.
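To make the relationship between the two coordinate pairs concrete, here is a rough sketch that checks each BED line and prints the gap on either side (roughly the primer lengths). It assumes a tab-separated file with the layout described above (columns 2 and 3 for the full amplicon, 4 for the name, 7 and 8 for the amplicon without primers); the function name `check_amplicon_bed` is made up for illustration and is not part of vardict-java.

```shell
# Sketch only: sanity-check an 8-column, tab-separated amplicon BED file.
# Assumed layout: chr, start, end, name, score, strand, insert_start, insert_end.
check_amplicon_bed() {
    awk 'BEGIN { FS = "\t"; OFS = "\t" }
    NF < 8 { print "line " NR ": expected 8 columns, found " NF; bad = 1; next }
    {
        left  = $7 - $2   # bases between amplicon start and insert start
        right = $3 - $8   # bases between insert end and amplicon end
        # The insert (cols 7-8) should sit inside the full amplicon (cols 2-3).
        status = ($2 <= $7 && $7 < $8 && $8 <= $3) ? "ok" : "INCONSISTENT"
        if (status != "ok") bad = 1
        print $4, "left=" left, "right=" right, status
    }
    END { exit bad + 0 }' "$1"
}
```

Running `check_amplicon_bed LOCI.bed` prints one line per amplicon and returns a non-zero exit status if any line has fewer than 8 columns or has its insert outside the full amplicon. The `left` and `right` values can then be compared with the 'int bp' value passed to `-a`.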

However, if I make it similar to the primer length I still get an empty file - just the header line.

If anyone knows what I am doing wrong then any advice would be much appreciated! I realise that vardict can work with amplicon reads which still contain the primers, but I do not know whether having removed them could have caused the issue. Presumably the read edges don't need to match exactly, to allow for indels etc., given that a read would be accepted if it started within 10 bp of the start coordinate (column 2/3), assuming I've understood correctly...
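To see how far the primer-trimmed reads actually start and end from the amplicon edges, something like the sketch below could be used. It only does the arithmetic on SAM records (position plus the reference-consuming CIGAR operations) and does not reproduce vardict's own amplicon test. The function name `read_edge_offsets` is hypothetical, and the amplicon start and end are assumed to be given as 1-based inclusive positions, so BED-style 0-based starts would need adjusting first.

```shell
# Sketch only: for each SAM record on stdin, print how far the aligned start
# and end lie from the given amplicon edges (1-based inclusive; an assumption).
read_edge_offsets() {
    awk -v s="$1" -v e="$2" 'BEGIN { FS = "\t"; OFS = "\t" }
    /^@/ { next }                      # skip SAM header lines
    $6 == "*" { next }                 # skip records without a CIGAR (unmapped)
    {
        cigar = $6; reflen = 0
        # Sum the CIGAR operations that consume reference bases (M, D, N, =, X).
        while (match(cigar, /^[0-9]+[MIDNSHP=X]/)) {
            op = substr(cigar, RLENGTH, 1)
            n  = substr(cigar, 1, RLENGTH - 1) + 0
            if (op ~ /[MDN=X]/) reflen += n
            cigar = substr(cigar, RLENGTH + 1)
        }
        rstart = $4; rend = $4 + reflen - 1
        print $1, rstart, rend, "start_offset=" (rstart - s), "end_offset=" (rend - e)
    }'
}
```

For example, `samtools view Mysample.sorted.bam | read_edge_offsets 121 280` (coordinates made up here) would list each read's aligned start and end alongside its offsets from those two positions, which can then be compared with the 'int bp' value given to `-a`.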

Thanks in advance,

Jenni
