Refer to the SMETANA documentation for more details regarding usage.
$ smetana -h
usage: smetana [-h] [-c COMMUNITIES.TSV] [-o OUTPUT] [--flavor FLAVOR]
[-m MEDIA] [--mediadb MEDIADB]
[-g | -d | -a ABIOTIC | -b BIOTIC] [-p P] [-n N] [-v] [-z]
[--solver SOLVER] [--molweight] [--exclude EXCLUDE]
[--no-coupling]
MODELS [MODELS ...]
Calculate SMETANA scores for one or multiple microbial communities.
positional arguments:
MODELS
Multiple single-species models (one or more files).
You can use wild-cards, for example: models/*.xml, and optionally protect with quotes to avoid automatic bash
expansion (this will be faster for long lists): "models/*.xml".
optional arguments:
-h, --help show this help message and exit
-c COMMUNITIES.TSV, --communities COMMUNITIES.TSV
Run SMETANA for multiple (sub)communities.
The communities must be specified in a two-column tab-separated file with community and organism identifiers.
The organism identifiers should match the file names in the SBML files (without extension).
Example:
community1 organism1
community1 organism2
community2 organism1
community2 organism3
-o OUTPUT, --output OUTPUT
Prefix for output file(s).
--flavor FLAVOR Expected SBML flavor of the input files (cobra or fbc2).
-m MEDIA, --media MEDIA
Run SMETANA for given media (comma-separated).
--mediadb MEDIADB Media database file
-g, --global Run global analysis with MIP/MRO (faster).
-d, --detailed Run detailed SMETANA analysis (slower).
-a ABIOTIC, --abiotic ABIOTIC
Test abiotic perturbations with given list of compounds.
-b BIOTIC, --biotic BIOTIC
Test biotic perturbations with given list of species.
-p P Number of components to perturb simultaneously (default: 1).
-n N
Number of random perturbation experiments per community (default: 1).
Selecting n = 0 will test all single species/compound perturbations exactly once.
-v, --verbose Switch to verbose mode
-z, --zeros Include entries with zero score.
--solver SOLVER Change default solver (current options: 'gurobi', 'cplex').
--molweight Use molecular weight minimization (recomended).
--exclude EXCLUDE List of compounds to exclude from calculations (e.g.: inorganic compounds).
--no-coupling Don't compute species coupling scores.
Refer to the methods sections of the SMETANA paper for details regarding the calculation of global scores:
- The metabolic interaction potential (MIP) measures the propensity of a given community to exchange metabolites, and is defined as the maximum number of essential nutritional components that a community can provide for itself through interspecies metabolic exchanges.
- The metabolic resource overlap (MRO) measures the degree of metabolic competition in a community, and is defined as the maximum possible overlap between the minimal nutritional requirements of all member species.
If you used the --cobra
flag in CarveMe, use the --flavor ucsd
in SMETANA run to calculate global parameters MIP and MRO (metabolic resource overlap). For more details see this issue.
Assuming that the $COMM
folder contains your models, try running a few simulations using your community of GEMs
$ for i in {1..3}; do
echo "Running simulation $i out of 3 ... ";
smetana --flavor ucsd -o sim_${i} -v -g $ROOT/models/$COMM/*.xml;
done
Use the paste
command to view generated output
$ paste sim_1_global.tsv
community medium size mip mro
all complete 5 4 0.7866666666666667
If you successfully generated detailed predictions for your community, summarize them into a single file. First create emtpy file with headers
$ echo -e "community\tmedium\tsize\tmip\tmro" >> ${COMM}_5_1_global.tsv
Loop through generated results to populate summary file named ${COMM}_5_1_global.tsv
$ while read file;do
cat $file|grep -v comm|sed "s/all/$COMM/g";
done< <(find . -name "sim*global.tsv"|grep -v simulations) >> ${COMM}_5_1_global.tsv
Copy into appropriate subdirectory $ROOT/simulations/global/$COMM
for the next script to read and plot
$ cp ${COMM}_5_1_global.tsv $ROOT/data/cooccurrence/simulation/mip_mro/${COMM}
Do not worry if you run out of time or are unable to generate detailed interactions for any reason, you already have pre-computed results that you may use for visualization and discussion.
Unless you submit jobs in parallel on the cluster it will not be feasible to generate 100's of simulation results within the timeframe of this course. The following code demonstrates how the detailed interactions were precomputed for the different datasets. ** You do not need to generate results for each community, as all results are pre-computed in their respective directories. **
The following shows how all global metrics were pre-computed, assuming your current folder contains your community's GEMs:
$ for i in {1..100}; do
echo "Running simulation $i out of 100 ... ";
smetana --flavor ucsd -o sim_${i} -v -g $ROOT/models/$COMM/*.xml;
done
- How does the SMETANA global algorithm work?
- What are the underlying assumptions?
- What does each metric measure and how is it calculated?
- MIP
- MRO
- Does choice of media affect global score calculation?