Skip to content

Latest commit

 

History

History
136 lines (102 loc) · 6.32 KB

5.run_smetana_global_metrics.md

File metadata and controls

136 lines (102 loc) · 6.32 KB

🌐 5. Use SMETANA global algorithm to generate MIP & MRO metrics for community of GEMs

Refer to the SMETANA documentation for more details regarding usage.

$ smetana -h

usage: smetana [-h] [-c COMMUNITIES.TSV] [-o OUTPUT] [--flavor FLAVOR]
               [-m MEDIA] [--mediadb MEDIADB]
               [-g | -d | -a ABIOTIC | -b BIOTIC] [-p P] [-n N] [-v] [-z]
               [--solver SOLVER] [--molweight] [--exclude EXCLUDE]
               [--no-coupling]
               MODELS [MODELS ...]

Calculate SMETANA scores for one or multiple microbial communities.

positional arguments:
  MODELS                
                        Multiple single-species models (one or more files).
                        
                        You can use wild-cards, for example: models/*.xml, and optionally protect with quotes to avoid automatic bash
                        expansion (this will be faster for long lists): "models/*.xml". 

optional arguments:
  -h, --help            show this help message and exit
  -c COMMUNITIES.TSV, --communities COMMUNITIES.TSV
                        
                        Run SMETANA for multiple (sub)communities.
                        The communities must be specified in a two-column tab-separated file with community and organism identifiers.
                        The organism identifiers should match the file names in the SBML files (without extension).
                        
                        Example:
                            community1	organism1
                            community1	organism2
                            community2	organism1
                            community2	organism3
                        
  -o OUTPUT, --output OUTPUT
                        Prefix for output file(s).
  --flavor FLAVOR       Expected SBML flavor of the input files (cobra or fbc2).
  -m MEDIA, --media MEDIA
                        Run SMETANA for given media (comma-separated).
  --mediadb MEDIADB     Media database file
  -g, --global          Run global analysis with MIP/MRO (faster).
  -d, --detailed        Run detailed SMETANA analysis (slower).
  -a ABIOTIC, --abiotic ABIOTIC
                        Test abiotic perturbations with given list of compounds.
  -b BIOTIC, --biotic BIOTIC
                        Test biotic perturbations with given list of species.
  -p P                  Number of components to perturb simultaneously (default: 1).
  -n N                  
                        Number of random perturbation experiments per community (default: 1).
                        Selecting n = 0 will test all single species/compound perturbations exactly once.
  -v, --verbose         Switch to verbose mode
  -z, --zeros           Include entries with zero score.
  --solver SOLVER       Change default solver (current options: 'gurobi', 'cplex').
  --molweight           Use molecular weight minimization (recomended).
  --exclude EXCLUDE     List of compounds to exclude from calculations (e.g.: inorganic compounds).
  --no-coupling         Don't compute species coupling scores.

🗺️ Global algorithm

Refer to the methods sections of the SMETANA paper for details regarding the calculation of global scores:

  1. The metabolic interaction potential (MIP) measures the propensity of a given community to exchange metabolites, and is defined as the maximum number of essential nutritional components that a community can provide for itself through interspecies metabolic exchanges.
  2. The metabolic resource overlap (MRO) measures the degree of metabolic competition in a community, and is defined as the maximum possible overlap between the minimal nutritional requirements of all member species.

🍭 Flavors

If you used the --cobra flag in CarveMe, use the --flavor ucsd in SMETANA run to calculate global parameters MIP and MRO (metabolic resource overlap). For more details see this issue.

🤔 Your turn

Assuming that the $COMM folder contains your models, try running a few simulations using your community of GEMs

$ for i in {1..3}; do 
echo "Running simulation $i out of 3 ... "; 
smetana --flavor ucsd -o sim_${i} -v -g $ROOT/models/$COMM/*.xml;
done

Use the paste command to view generated output

$ paste sim_1_global.tsv 
community	medium	size	mip	mro
all	complete	5	4	0.7866666666666667

If you successfully generated detailed predictions for your community, summarize them into a single file. First create emtpy file with headers

$ echo -e "community\tmedium\tsize\tmip\tmro" >> ${COMM}_5_1_global.tsv

Loop through generated results to populate summary file named ${COMM}_5_1_global.tsv

$ while read file;do 
cat $file|grep -v comm|sed "s/all/$COMM/g";
done< <(find . -name "sim*global.tsv"|grep -v simulations) >> ${COMM}_5_1_global.tsv

Copy into appropriate subdirectory $ROOT/simulations/global/$COMM for the next script to read and plot

$ cp ${COMM}_5_1_global.tsv $ROOT/data/cooccurrence/simulation/mip_mro/${COMM}

Do not worry if you run out of time or are unable to generate detailed interactions for any reason, you already have pre-computed results that you may use for visualization and discussion.

🥑 Pre-computed simulations

Unless you submit jobs in parallel on the cluster it will not be feasible to generate 100's of simulation results within the timeframe of this course. The following code demonstrates how the detailed interactions were precomputed for the different datasets. ** You do not need to generate results for each community, as all results are pre-computed in their respective directories. **

The following shows how all global metrics were pre-computed, assuming your current folder contains your community's GEMs:

$ for i in {1..100}; do 
 echo "Running simulation $i out of 100 ... "; 
 smetana --flavor ucsd -o sim_${i} -v -g $ROOT/models/$COMM/*.xml;
done

🤌 Discussion questions

  • How does the SMETANA global algorithm work?
  • What are the underlying assumptions?
  • What does each metric measure and how is it calculated?
    • MIP
    • MRO
  • Does choice of media affect global score calculation?

Move on to exercise 6