Incorrect assignment of flag 0x2 (read mapped in proper pair) #61

Open
sam-israel opened this issue Oct 18, 2020 · 1 comment

Comments

@sam-israel

In the BAM file produced by a TopHat run (v2.1.1, Bowtie2 version 2.3.4.3), some reads are flagged as "read mapped in proper pair" (their FLAG includes the 0x2 bit); however, their YT tag has the value YT:Z:UU, which indicates that they were not part of a pair.

Here is an example of two such reads from the mapped file:

A01056:33:HF3NFDSXY:1:2516:13657:30718 435 1 91387362 0 117M 21 8218147 0 CCTGTGGTAACTTTTCTGACACCTCCTGCTTAAAACCCAAAAGGTCAGAAGGATCGTGAGGCCCCGCTTTCACGGTCTGTATTCGTACTGAAAATCAAGATCAAGCGAGCTTTTGCC :FF:F:FFFF:FFFFFFFFFFFFFF:FF,FFF,FFFFFF:FFF:FFFFF:FF:FF:FFFFFFF:FFFFFFFFF:FFFFFFFFF,FFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFF AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:117 YT:Z:UU NH:i:20 CC:Z:= CP:i:91387362 XS:A:- HI:i:2

A01056:33:HF3NFDSXY:1:2516:13657:30718 371 21 8218147 0 112M 1 91387362 0 GGGCAAAAGCTCGCTTGATCTTGATTTTCAGTACGAATACAGACCGTGAAAGCGGGGCCTCACGATCCTTCTGACCTTTTGGGTTTTAAGCAGGAGGTGTCAGAAAAGTTAC :F:FFFFFFFFFFFFFFFFFFF:FFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFF,FFFFFFF:FFFF::FFFFFFFFFF:FFFFFFFFFFFFFFFFFFF AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:112 YT:Z:UU NH:i:20 CC:Z:GL000220.1 CP:i:161594 XS:A:+ HI:i:2

The read is aligned to chromosome 1 while its mate is aligned to chromosome 21, so the two cannot form a proper pair. The unpaired setting (YT:Z:UU) is therefore the correct one, and the 0x2 flag is wrong.
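To make the inconsistency explicit, the FLAG values of the two records above (435 and 371) can be decoded bit by bit. The following is a minimal Python sketch, not part of TopHat or Bowtie2; the helper names `decode_flag` and `is_suspicious` are hypothetical, the bit names follow the SAM specification, and the example record is shortened (SEQ and QUAL replaced by `*`, only the YT tag kept).

```python
# Names of the SAM FLAG bits, as defined in the SAM specification.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


def is_suspicious(fields):
    """Return True if a record claims 'proper pair' (0x2) although the mate
    is on a different reference or the record carries YT:Z:UU.

    `fields` is the list of columns of one SAM record.
    """
    flag = int(fields[1])
    rname, rnext = fields[2], fields[6]
    tags = fields[11:]
    proper = bool(flag & 0x2)
    # RNEXT of "=" means same reference as RNAME; "*" means unavailable.
    different_reference = rnext not in ("=", "*", rname)
    unpaired_tag = "YT:Z:UU" in tags
    return proper and (different_reference or unpaired_tag)


# First record from the issue, shortened: SEQ and QUAL are replaced by "*"
# and only the YT tag is kept.
example_fields = [
    "A01056:33:HF3NFDSXY:1:2516:13657:30718", "435", "1", "91387362", "0",
    "117M", "21", "8218147", "0", "*", "*", "YT:Z:UU",
]

print(decode_flag(435))
print(decode_flag(371))
print(is_suspicious(example_fields))
```

Both FLAG values have the 0x1 (paired) and 0x2 (proper pair) bits set as well as 0x100 (secondary alignment), while the records carry YT:Z:UU and place the mate on a different chromosome.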

The command was

tophat --mate-inner-dist -139 --mate-std-dev 50 -o align/Sample10 -G /.../Homo_sapiens/Ensembl/GRCh38/Annotation/Genes/genes.gtf -N 10 --read-gap-length 5 --read-edit-dist 15 --segment-length 20 --read-realign-edit-dist 3 --no-coverage-search --library-type fr-firststrand -p 32 /.../Homo_sapiens/Ensembl/GRCh38/Sequence/Bowtie2Index/genome Sample10_R1_clean_pe.fastq.gz Sample10_R2_clean_pe.fastq.gz,processed/Sample10_R1_clean_se.fastq.gz,processed/Sample10_R2_clean_se.fastq.gz

Is there any information about this bug?

Does this seem to be a bowtie/tophat error?

@sam-israel
Author

In the majority of cases for this file, however, the error is not in the SAM FLAG but in the YT:Z:UU tag.

In this run, TopHat received both paired-end (PE) and single-end (SR) reads; about 98% were PE. Despite that, in a subsample of the file, about 98% of the alignments carry the YT:Z:UU tag.
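A tally like the one described above could be reproduced roughly as follows. This is only a sketch under assumptions: the function name `tally_pairing` is hypothetical, the input is expected to be tab-separated SAM text, and the demo lines are made-up minimal records rather than data from this run.

```python
from collections import Counter


def tally_pairing(sam_lines):
    """Count alignments by (paired bit 0x1, proper-pair bit 0x2, YT tag value)."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        # Optional tags start at column 12; take the value after "YT:Z:".
        yt = next((t[5:] for t in fields[11:] if t.startswith("YT:Z:")), "missing")
        counts[(bool(flag & 0x1), bool(flag & 0x2), yt)] += 1
    return counts


# Made-up minimal records, for illustration only.
demo_lines = [
    "@HD\tVN:1.0",
    "\t".join(["r1", "435", "1", "100", "0", "10M", "21", "200", "0", "*", "*", "YT:Z:UU"]),
    "\t".join(["r2", "99", "1", "100", "50", "10M", "=", "300", "210", "*", "*", "YT:Z:CP"]),
    "\t".join(["r3", "0", "1", "100", "50", "10M", "*", "0", "0", "*", "*"]),
]

for key, n in sorted(tally_pairing(demo_lines).items(), key=str):
    print(key, n)
```

On real data, the lines could come from the text output of `samtools view` on the BAM file, for example by passing `sys.stdin` to the function. Rows where the paired bit is set but the YT value is UU are the inconsistent ones.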
