Martin Asser Hansen edited this page Oct 2, 2015 · 6 revisions

Biopiece: find_SNPs

Description

find_SNPs locates single nucleotide polymorphisms (SNPs) and deletion/insertion polymorphisms (DIPs) in SAM type records in the stream. It parses the ALIGN field of each record and, for each event found, emits a record like this:

REC_TYPE: SNP
S_ID: gi|48994873|gb|U00096.2|
POS: 993405
EVENT: G>C
SNP_COUNT: 1
TYPE: MISMATCH
---

The position POS is 0-based and corresponds to the exact position in the subject sequence.
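
As a rough illustration of the ALIGN parsing described above, here is a minimal Python sketch. It is not the find_SNPs source. It assumes the ALIGN value is a comma-separated list of offset:reference>read items, as in the value 97:C>A,99:G>C shown in the example on this page. The function name parse_align is hypothetical, and DIP events are not handled because their notation is not shown here.

```python
def parse_align(align):
    """Split an ALIGN value such as '97:C>A,99:G>C' into events.

    Assumption: each item has the form offset:reference>read.
    Returns a list of (offset, reference_base, read_base) tuples.
    """
    events = []
    for item in align.split(","):
        offset, change = item.split(":", 1)
        ref, read = change.split(">", 1)
        events.append((int(offset), ref, read))
    return events


# One SNP record would be emitted per parsed event.
for offset, ref, read in parse_align("97:C>A,99:G>C"):
    print(f"EVENT: {ref}>{read} (offset {offset} in the alignment)")
```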

Usage

... | find_SNPs [options]

Options

[-?          | --help]               #  Print full usage description.
[-I <file!>  | --stream_in=<file!>]  #  Read input from stream file    -  Default=STDIN
[-O <file>   | --stream_out=<file>]  #  Write output to stream file    -  Default=STDOUT
[-v          | --verbose]            #  Verbose output.

Examples

Consider the following SAM entry in the file test.sam:

@SQ     SN:gi|48994873|gb|U00096.2|     LN:4639675
ID00081401      16      gi|48994873|gb|U00096.2|        3405    37      100M    *       0       0
CGCACGGGCGACATCTGGCAGGCTTCATTCACGCCTGCTATTCCCGTCAGCCTGAGCTTGCCGCGAAGCTGATGAAAGATGTTATCGCTGAACCCTAACC  *       XT:A:U  NM:i:2
X0:i:1  X1:i:0  XM:i:XO:i:0   XG:i:0  MD:Z:97C1G0
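
The mismatches in this entry can be read off the MD:Z tag (97C1G0): 97 matching bases, a reference C, 1 matching base, a reference G, then 0 matching bases. The Python sketch below walks an MD value in that way. It is only an illustration and makes no claim about how read_sam or find_SNPs work internally. The function names are hypothetical, and combining the result with SEQ is valid only under the assumption that the CIGAR is a plain match run such as 100M (no indels or clipping).

```python
import re


def md_mismatches(md):
    """Return (offset, reference_base) pairs for the mismatches in an MD:Z value.

    Offsets are 0-based and counted along the reference from the start of
    the alignment.
    """
    events = []
    offset = 0
    for run, deletion, base in re.findall(r"(\d+)|\^([A-Za-z]+)|([A-Za-z])", md):
        if run:
            offset += int(run)        # run of matching bases
        elif deletion:
            offset += len(deletion)   # deleted reference bases, not reported here
        else:
            events.append((offset, base))
            offset += 1
    return events


def align_from_md(md, seq):
    """Build an ALIGN-like string; assumes a CIGAR with matches only."""
    return ",".join(f"{off}:{ref}>{seq[off]}" for off, ref in md_mismatches(md))


print(md_mismatches("97C1G0"))  # [(97, 'C'), (99, 'G')]
```

Applied to the MD value and SEQ of the entry above, align_from_md gives offsets 97 and 99 with read bases A and C, which agrees with the ALIGN value 97:C>A,99:G>C in the output shown further down.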

To locate SNPs in test.sam, pipe the output of read_sam into find_SNPs:

read_sam -i test.sam | find_SNPs

REC_TYPE: SAM
Q_ID: ID00081401
STRAND: -
S_ID: gi|48994873|gb|U00096.2|
S_BEG: 3405
MAPQ: 37
CIGAR: 100M
SEQ: CGCACGGGCGACATCTGGCAGGCTTCATTCACGCCTGCTATTCCCGTCAGCCTGAGCTTGCCGCGAAGCTGATGAAAGATGTTATCGCTGAACCCTAACC
ALIGN: 97:C>A,99:G>C
---
REC_TYPE: SNP
S_ID: gi|48994873|gb|U00096.2|
POS: 973405
EVENT: C>A
SNP_COUNT: 1
TYPE: MISMATCH
---
REC_TYPE: SNP
S_ID: gi|48994873|gb|U00096.2|
POS: 993405
EVENT: G>C
SNP_COUNT: 1
TYPE: MISMATCH
---
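
For post-processing outside the Biopieces pipeline, for instance on a stream file written with -O, the records can be read with a few lines of Python. This is a sketch only. It assumes the layout seen in the output above, with one "KEY: value" pair per line and each record terminated by a line containing ---. The function name parse_records is hypothetical.

```python
def parse_records(text):
    """Parse 'KEY: value' lines into dicts, one per '---'-terminated record."""
    records, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if line == "---":
            if current:
                records.append(current)
            current = {}
        elif line:
            # Split on the first ': ' only, so values may contain ':' or '|'.
            key, _, value = line.partition(": ")
            current[key] = value
    if current:  # tolerate a final record without a trailing '---'
        records.append(current)
    return records


sample = """REC_TYPE: SAM
Q_ID: ID00081401
ALIGN: 97:C>A,99:G>C
---
REC_TYPE: SNP
EVENT: C>A
TYPE: MISMATCH
---
REC_TYPE: SNP
EVENT: G>C
TYPE: MISMATCH
---
"""

# Keep only the SNP records and list their events.
snp_events = [r["EVENT"] for r in parse_records(sample) if r["REC_TYPE"] == "SNP"]
print(snp_events)  # ['C>A', 'G>C']
```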

See also

read_sam

Author

Martin Asser Hansen - Copyright (C) - All rights reserved.

mail@maasha.dk

September 2011

License

GNU General Public License version 2

http://www.gnu.org/copyleft/gpl.html

Help

find_SNPs is part of the Biopieces framework.

http://www.biopieces.org
