Martin Asser Hansen edited this page Oct 2, 2015 · 5 revisions

Biopiece: align_pair_seq

Description

align_pair_seq creates a pairwise alignment of the first two sequences in the stream.

The algorithm locates all Maximally Extended Matches (MEMs) between the two sequences within a given search space, seeded by matches of a given k-mer size. Each MEM is scored by its length and its distance to the closest diagonal of the search space, and the highest-scoring MEM is added to the alignment chain. The search spaces to the left and to the right of that MEM are then searched recursively in the same way.
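The recursion described above can be sketched in Python as follows. This is an illustration only, not the align_pair_seq implementation: the function names (find_mems, score, mem_chain), the fixed k-mer size, and the exact scoring formula (MEM length minus the offset from the nearer corner diagonal) are all assumptions made for this example.

```python
from collections import defaultdict


def find_mems(q, s, q_lo, q_hi, s_lo, s_hi, k):
    """Return the set of MEMs (q_beg, s_beg, length) inside the half-open
    search space q[q_lo:q_hi] x s[s_lo:s_hi], seeded by exact k-mer matches."""
    # Index every k-mer of the query window by its start position.
    index = defaultdict(list)
    for i in range(q_lo, q_hi - k + 1):
        index[q[i:i + k]].append(i)

    mems = set()
    for j in range(s_lo, s_hi - k + 1):
        for i in index.get(s[j:j + k], ()):
            # Extend the seed to the left while residues match.
            qb, sb = i, j
            while qb > q_lo and sb > s_lo and q[qb - 1] == s[sb - 1]:
                qb -= 1
                sb -= 1
            # Extend the seed to the right while residues match.
            qe, se = i + k, j + k
            while qe < q_hi and se < s_hi and q[qe] == s[se]:
                qe += 1
                se += 1
            mems.add((qb, sb, qe - qb))
    return mems


def score(mem, q_lo, q_hi, s_lo, s_hi):
    """Assumed scoring: length minus distance to the closest corner diagonal."""
    qb, sb, n = mem
    # Offset from the diagonal starting in the top-left corner.
    d_begin = abs((qb - q_lo) - (sb - s_lo))
    # Offset from the diagonal ending in the bottom-right corner.
    d_end = abs((q_hi - (qb + n)) - (s_hi - (sb + n)))
    return n - min(d_begin, d_end)


def _recurse(q, s, q_lo, q_hi, s_lo, s_hi, k, chain):
    # Stop when the search space cannot hold a single k-mer seed.
    if q_hi - q_lo < k or s_hi - s_lo < k:
        return
    mems = find_mems(q, s, q_lo, q_hi, s_lo, s_hi, k)
    if not mems:
        return
    # Pick the highest scoring MEM; ties are broken by position so the
    # result is deterministic.
    best = max(mems, key=lambda m: (score(m, q_lo, q_hi, s_lo, s_hi), -m[0], -m[1]))
    chain.append(best)
    qb, sb, n = best
    _recurse(q, s, q_lo, qb, s_lo, sb, k, chain)          # left search space
    _recurse(q, s, qb + n, q_hi, sb + n, s_hi, k, chain)  # right search space


def mem_chain(q, s, k=4):
    """Return the chain of MEMs for two sequences, sorted by position."""
    chain = []
    _recurse(q, s, 0, len(q), 0, len(s), k, chain)
    return sorted(chain)


if __name__ == "__main__":
    print(mem_chain("TTGACCATGGAAGC", "TTGACCTATGGAAGC", k=4))
    # prints [(0, 0, 6), (6, 7, 8)]
```

In this toy input the second sequence carries one inserted T. The longer match (length 8) is picked first, and the recursion into the left search space then finds the shorter one (length 6). The unmatched residue between the two MEMs is where a gap would be placed in the final alignment.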

Usage

... | align_pair_seq [options]

Options

[-?         | --help]               #  Print full usage description.
[-I <file!> | --stream_in=<file!>]  #  Read input from stream file -  Default=STDIN
[-O <file>  | --stream_out=<file>]  #  Write output to stream file -  Default=STDOUT
[-v         | --verbose]            #  Verbose output.

Examples

Consider the following FASTA entries in the file test.fna:

>test1
tgcatgctagctatagccgtttgtacgatggctagccagcag
>test2
tgctacctagctagcatacgtacgatgatgctaggtagctgg

To create a pairwise alignment of these sequences, run:

read_fasta -i test.fna | align_pair_seq | write_align -x

                      .         .         .           .   
test1       TGCaTg-CTAGCTAtaGCcgTttGTACGATG--GCTAGccAGCa-G
            ||| |  |||||||  ||  |  ||||||||  |||||  |||  |
test2       TGC-TacCTAGCTA--GCa-TacGTACGATGatGCTAGgtAGCtgG
                      .            .         .         .  

See also

read_fasta

align_seq

write_align

write_fasta

Author

Martin Asser Hansen - Copyright (C) - All rights reserved.

mail@maasha.dk

June 2009

License

GNU General Public License version 2

http://www.gnu.org/copyleft/gpl.html

Help

align_pair_seq is part of the Biopieces framework.

http://www.biopieces.org
