bacteriaanalysissystemenglish

gaou edited this page Oct 26, 2020 · 18 revisions

Overview

BAS is a collection of Perl modules that facilitate the development of Perl scripts for bioinformatics applications. This document is a tutorial for BAS. To use BAS, all you have to do is change a few settings in its configuration file.

How to change the settings

Let's walk through an example.

(1) Open the configuration file [BAS.conf]

        $ emacs BAS.conf 
  • The following is the contents of BAS.conf
  ###############################################
  # Bacteria Analysis System configuration file>#
  ###############################################
  #$Id: BAS.conf,v 1.6 2001/09/09 11:53:20 s98982km Exp $
  #scripted by Koya Mori(mory@g-language.org)
  #This is a configuration of Bacteria Analysis System
   
  package G::System::BAS_conf; 
   
  use strict("var"); 
   
  sub BAS{ 
    my $methods; 
   
    $methods= <<'CONF'; 
  
  ##################################### 
  #  G instances                      # 
  ##################################### 
  
  gb1 < /pub/dnadb/ncbi/genbank/genomes/bacteria/Ecoli/ecoli.gbk 
     
  ##################################### 
  #  CAI                              # 
  ##################################### 
    
  >cai            N
    
  G-instance      $gb
  
  -output                 #[f or stdout] default:"stdout"
  
  -w_output               #[f or stdout] default:"stdout"
  
  -w_filename 
     . 
     . 
     . 
    (omitted) 
     . 
     . 
     .

BAS provides 10 classes and 24 methods:

  • CAI
    • cai
  • Codon
    • codon_counter
    • amino_counter
    • codon_usage
  • Consensus
    • base_counter
    • base_information_content
    • base_z_value
    • base_entropy
    • base_relative_entropy
    • base_individual_information_matrix
  • GCskew
    • view_cds
    • find_ori_ter
    • gcskew
    • genomicskew
    • cum_gcskew
    • gcwin
  • Free Energy
    • foreach_RNAfold
  • Markov
    • over_lapping_finder
  • Over Lapping Gene
    • over_lapping_finder
  • Pat Search
    • palindrome
  • Tandem Repeat
    • foreach_tandem
    • graphical_LTR_search
  • Util
    • seq2gif
    • genome_map

(2) Choose the data to analyze

The default data set is Escherichia_coli_K12. To analyze a different genome, change the path assigned to [gb1]. You can also add a new variable (e.g. [mgen]).

  • Before
    ##################################### 
    #  G instances                      # 
    #####################################   
    gb1 < /pub/dnadb/ncbi/genbank/genomes/bacteria/Ecoli/ecoli.gbk 
  • After
    ##################################### 
    #  G instances                      #
    ##################################### 
    gb1 < /pub/dnadb/ncbi/genbank/genomes/bacteria/Ecoli/ecoli.gbk 
    mgen < /pub/dnadb/ncbi/genbank/genomes/bacteria/Mgen/mgen.gbk    # added 
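
To make the [name < path] convention concrete, here is a minimal Python sketch that reads the G instances section into a dictionary. It is only an illustration: BAS's own parser is not shown in this tutorial, and the function name `parse_g_instances` and the regular expression are assumptions made for this example.

```python
import re

# Hypothetical helper: BAS's own parser is not shown in this tutorial.
# Matches lines of the form "name < /path/to/file.gbk".
INSTANCE_LINE = re.compile(r"^\s*(\w+)\s*<\s*(\S+)")


def parse_g_instances(text):
    """Return {variable name: GenBank file path} for 'name < path' lines."""
    instances = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # drop comments and banner lines
        match = INSTANCE_LINE.match(line)
        if match:
            instances[match.group(1)] = match.group(2)
    return instances


conf = """
#####################################
#  G instances                      #
#####################################
gb1 < /pub/dnadb/ncbi/genbank/genomes/bacteria/Ecoli/ecoli.gbk
mgen < /pub/dnadb/ncbi/genbank/genomes/bacteria/Mgen/mgen.gbk    # add
"""

instances = parse_g_instances(conf)
for name, path in instances.items():
    print(name, "->", path)
```

Each variable name on the left of [<] is what you later write at [G-instance] to select that data set.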

(3) Choose a method

For this example we use [codon_usage] (Codon class).

  • How to use [codon_usage]

1. First, change [N] to [Y] (this switches the method on) and select the data to analyze at [G-instance]. To write the results to a file, set the output option to [f]; the file is saved in the [mgen] directory. To output a graph, set the output option to [g]; the graph is saved in [mgen/graph]. The default option [show] displays all results automatically, including the graph and the standard output.

  -output     f      (write the analysis data to a file)
              g      (output the analysis data as a graph)
              show   (show all analysis data automatically)
  -filename   free   (any name; sets the file name when the [-output] option is used)
  • Before
  >codon_usage    N 
  G-instance      $gb 
  -CDSid 
  -output         show    #[f or g or show] default:"show" 
  -filename 
  • After
  >codon_usage    Y       # switch on 
  G-instance      gb1     # choose data 
  -CDSid 
  -output         g       #[f or g or show] default:"show" 
  -filename 
  2. Save the file.

  3. Run BAS.

          $ ./BAS 
  • The following is the standard output

               __/__/__/__/__/__/__/__/__/__/__/__/__/ 
    
                       G-language System 
    
                Version: 1.0.0 gamma 
    
                Copyright (C) 2001 G-language Project 
                Institute of Advanced Biosciences, 
                Keio University, JAPAN 
    
                   http://www.g-language.org/ 
    
               __/__/__/__/__/__/__/__/__/__/__/__/__/ 
    
    Length of Sequence :    816394 
             A Content :    249211 (30.53%) 
             T Content :    240560 (29.47%) 
             G Content :    163703 (20.05%) 
             C Content :    162920 (19.96%) 
                Others :         0 (0.00%) 
            AT Content :    59.99% 
            GC Content :    40.01% 
                             . 
                             . 
                         (omitted) 
                             . 
                             . 
    
  4. View the graph.
  $ cd mgen/graph 
  $ gimp codon_table.gif 
  • To write the analysis data to a file instead:
  • Change the output option to [f].
  • Save the file.
  • Run BAS.
  • Open the output file.
  $ cd mgen
  $ emacs codon_usage.csv 
  • The following is the contents of the output file
  /,taa,348,0.725 
  /,tag,132,0.275 
  A,gca,3759,0.386 
  A,gcc,730,0.075 
  A,gcg,462,0.047 
  A,gct,4799,0.492 
  C,tgc,295,0.203 
  C,tgt,1160,0.797 
  D,gac,1196,0.139 
  D,gat,7388,0.861 
      . 
      . 
      . 
   (omitted) 
      . 
      . 
      . 
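
The rows above can be read back with a few lines of Python. This is a sketch under stated assumptions: the column meanings (amino acid, codon, count, fraction among codons for the same amino acid) and the reading of [/] as the stop-codon group are inferred from the sample rows, not documented here. The check recomputes the fourth column from the counts.

```python
import csv
import io
from collections import defaultdict

# Rows copied from the codon_usage.csv excerpt above.
sample = """\
/,taa,348,0.725
/,tag,132,0.275
A,gca,3759,0.386
A,gcc,730,0.075
A,gcg,462,0.047
A,gct,4799,0.492
C,tgc,295,0.203
C,tgt,1160,0.797
D,gac,1196,0.139
D,gat,7388,0.861
"""

# Assumed column order: amino acid, codon, count, ratio.
rows = [
    (aa, codon, int(count), float(ratio))
    for aa, codon, count, ratio in csv.reader(io.StringIO(sample))
]

# Total codon count per amino acid group.
totals = defaultdict(int)
for aa, _codon, count, _ratio in rows:
    totals[aa] += count

# Recompute each codon's share of its group, rounded to three decimals.
recomputed = {
    codon: round(count / totals[aa], 3) for aa, codon, count, _ratio in rows
}

for aa, codon, count, ratio in rows:
    print(aa, codon, count, ratio, recomputed[codon])
```

For these ten rows the recomputed shares agree with the fourth column, which supports the assumed interpretation. On a truncated excerpt that splits an amino acid group, the totals would be incomplete and the check would not apply.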

Notes

[Essential] and [option] parameters differ. An [option] has [-] before its name (e.g. -CDSid); an [Essential] parameter does not. You must supply a value for every [Essential] parameter, whereas an [option] may be left empty.

  >codon_usage    Y 
  G-instance      gb1     ## Essential ## 
  -CDSid                  # option # 
  -output         f       #[f or g or show] default:"show"   # option # 
  -filename               # option # 
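
The distinction between [Essential] and [option] parameters can be sketched in Python as follows. This is a hypothetical reader for one method block, not BAS's actual implementation: the names `parse_method_block` and `missing_essentials` are invented for this example, and the assumption that everything after [#] is a comment is inferred from the listings above.

```python
def parse_method_block(text):
    """Split one method block into its switch, essentials, and options."""
    method = {"name": None, "enabled": False, "essential": {}, "options": {}}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # assumed: '#' starts a comment
        if not line:
            continue
        fields = line.split(None, 1)
        key = fields[0]
        value = fields[1].strip() if len(fields) > 1 else None
        if key.startswith(">"):
            # Header line: method name and its Y/N switch.
            method["name"] = key[1:]
            method["enabled"] = value == "Y"
        elif key.startswith("-"):
            # Leading '-' marks an option; it may be left empty.
            method["options"][key] = value
        else:
            # No leading '-': an essential parameter that needs a value.
            method["essential"][key] = value
    return method


def missing_essentials(method):
    """Return the essential parameters that have no value."""
    return [key for key, value in method["essential"].items() if value is None]


block = """\
>codon_usage    Y       # switch on
G-instance      gb1     # choose data
-CDSid
-output         g       #[f or g or show] default:"show"
-filename
"""

method = parse_method_block(block)
print(method)
print("missing essentials:", missing_essentials(method))
```

A block whose [G-instance] line has no value would be reported by `missing_essentials`, while empty options such as -CDSid and -filename are accepted.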

Written 10 September 2001 - Ryo Hattori
