Getting Started with CCMgen and CCMpredPy

Susann Vorberg edited this page Jun 4, 2018 · 8 revisions

Preparing Data

An example alignment taken from the PSICOV supplementary data is provided in extra/examples/1atzA.fasta. In general, you can run CCMpred / CCMgen on any FASTA-formatted multiple sequence alignment, although results improve as the number of sequences per alignment column grows.

Copy extra/examples/1atzA.fasta to data/1atzA.fasta; the data/ directory will serve as our working directory.
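Since the quality of the results depends on how many sequences the alignment contains, it can help to check the size of the working copy before running anything. The following is a minimal sketch using only the Python standard library; the helper names read_fasta_alignment and alignment_shape are hypothetical and are not part of CCMpredPy.

```python
from pathlib import Path


def read_fasta_alignment(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def alignment_shape(records):
    """Return (number of sequences, number of columns).

    Raises ValueError if the rows do not all have the same length,
    which would mean the file is not a valid alignment.
    """
    lengths = {len(seq) for _, seq in records}
    if len(lengths) > 1:
        raise ValueError(f"sequences have unequal lengths: {sorted(lengths)}")
    return len(records), (lengths.pop() if lengths else 0)


if __name__ == "__main__":
    path = Path("data/1atzA.fasta")  # working copy created in the step above
    if path.exists():
        n_seq, n_col = alignment_shape(read_fasta_alignment(path.read_text()))
        print(f"{n_seq} sequences, {n_col} columns")
```

The check on sequence lengths catches files that are plain FASTA rather than aligned FASTA before they reach the optimizer.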

Example Usage via Command Line

Print available command line options:

python PATH/TO/LOCAL/INSTALL/ccmpred/scripts/run_ccmpred.py -h

or

ccmpred -h

By default (--ofn-pll), CCMpredPy maximizes the pseudo-likelihood to obtain couplings. Results differ slightly from the C implementation of CCMpred because of the following modifications:

  • the regularization prior on the single potentials is centered at the ML estimate of the single potentials, v*
  • the single potentials are initialized at v*
  • the regularization strength is set to lambda_v = 10 to achieve results comparable to the C implementation of CCMpred
  • the conjugate gradient optimizer is slightly modified compared to libconjugrad, which is used in CCMpred
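To illustrate what a prior centered at v* means, here is a minimal sketch for a single alignment column. It assumes, without confirmation from the CCMpredPy source, that v* is a log-frequency estimate shifted so that the potentials of a column sum to zero, and that the prior takes the form of a squared (L2) penalty. The function names, the pseudocount and the 20-letter alphabet are illustrative choices, not the package's API.

```python
import math


def ml_single_potentials(column, alphabet="ACDEFGHIKLMNPQRSTVWY", pseudocount=1.0):
    """Log-frequency estimate v* for one column, shifted to sum to zero.

    Assumption: a pseudocount keeps unobserved amino acids finite.
    """
    counts = {a: pseudocount for a in alphabet}
    for residue in column:
        if residue in counts:  # gaps and unknown symbols are ignored
            counts[residue] += 1.0
    total = sum(counts.values())
    log_freq = {a: math.log(c / total) for a, c in counts.items()}
    mean_log = sum(log_freq.values()) / len(alphabet)
    return {a: log_freq[a] - mean_log for a in alphabet}


def single_potential_penalty(v, v_star, lambda_v=10.0):
    """Squared penalty lambda_v * sum_a (v_a - v*_a)^2, centered at v*."""
    return lambda_v * sum((v[a] - v_star[a]) ** 2 for a in v_star)


if __name__ == "__main__":
    v_star = ml_single_potentials("AAAC-")
    # Starting at v* means the penalty is zero at initialization.
    print(single_potential_penalty(v_star, v_star))
```

Under these assumptions, initializing the single potentials at v* starts the optimization at the point where the single-potential penalty vanishes, whereas a prior centered at zero would pull the potentials away from the observed frequencies.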

The following command prints the optimization progress to stdout and produces the file ./example/1mkcA00.frobenius.mat:

ccmpred ./example/1mkcA00.aln ./example/1mkcA00.mat

The optimization progress can be visualized as an interactive plotly graph by additionally specifying the --plot_opt_progress flag. The HTML file containing the graph is updated during optimization and is written to ./example/1mkcA00.opt_progress.html.

ccmpred --plot_opt_progress ./example/1mkcA00.aln ./example/1mkcA00.mat

Bias correction can be switched on with the flags --apc and --entropy-correction. Using these two additional flags generates three contact map files: ./example/1mkcA00.frobenius.mat, ./example/1mkcA00.frobenius.apc.mat and ./example/1mkcA00.frobenius.ec.mat:

ccmpred --apc --entropy-correction ./example/1mkcA00.aln ./example/1mkcA00.mat
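For readers unfamiliar with the average product correction (APC), the sketch below shows the standard formula on a small symmetric score matrix: each score is reduced by the product of its row and column means divided by the overall mean. This is a plain-Python illustration, not the code path CCMpredPy uses; in particular, computing the means over off-diagonal entries only is an assumption, and the function name is hypothetical.

```python
def average_product_correction(scores):
    """Return a new matrix with s_ij - mean_i * mean_j / mean_all.

    Means are taken over off-diagonal entries (an assumption; some
    implementations include the diagonal). The diagonal is set to zero.
    """
    n = len(scores)
    if n < 2:
        return [row[:] for row in scores]
    row_mean = [
        sum(scores[i][j] for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]
    overall = sum(row_mean) / n
    if overall == 0:
        # No signal to correct for; return an unchanged copy.
        return [row[:] for row in scores]
    corrected = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                corrected[i][j] = scores[i][j] - row_mean[i] * row_mean[j] / overall
    return corrected


if __name__ == "__main__":
    raw = [[0.0, 1.0, 2.0],
           [1.0, 0.0, 3.0],
           [2.0, 3.0, 0.0]]
    for row in average_product_correction(raw):
        print(row)
```

Positions with high average scores across the whole matrix are penalized most, which is why the correction reduces background bias.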

Contact maps can be visualized using the script plot_contact_map.py. If a PDB file is specified (numbering of amino acids must start at 1!), the distance matrix is plotted in the lower right triangle. If an alignment file is specified, the percentage of gaps and the entropy are plotted as a subplot.

plot_contact_map --mat-file ./example/1mkcA00.frobenius.mat --alignment-file ./example/1mkcA00.aln --pdb-file ./example/1mkcA00.pdb --plot-out ./example/ --seq-sep 4 --contact-threshold 8 --apc
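The two per-column quantities shown in the subplot, the percentage of gaps and the entropy, can be sketched as follows. This is an illustration only: whether the plotting script excludes gaps from the entropy or uses the natural logarithm is an assumption, and the function names are hypothetical.

```python
import math
from collections import Counter


def column_gap_percentage(sequences, gap="-"):
    """Percentage of gap characters in each alignment column."""
    n = len(sequences)
    return [
        100.0 * sum(seq[i] == gap for seq in sequences) / n
        for i in range(len(sequences[0]))
    ]


def column_entropy(sequences, gap="-"):
    """Shannon entropy (in nats) of the non-gap residues in each column."""
    entropies = []
    for i in range(len(sequences[0])):
        counts = Counter(seq[i] for seq in sequences if seq[i] != gap)
        total = sum(counts.values())
        if total == 0:
            entropies.append(0.0)  # column consists of gaps only
            continue
        entropies.append(
            -sum((c / total) * math.log(c / total) for c in counts.values())
        )
    return entropies


if __name__ == "__main__":
    toy_alignment = ["AC-", "AD-", "A-G", "AC-"]  # toy data, not from 1mkcA00
    print(column_gap_percentage(toy_alignment))
    print(column_entropy(toy_alignment))
```

Columns with many gaps or very low entropy carry little covariation signal, so these profiles help judge which regions of a contact map are reliable.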

License

GNU Affero General Public License, Version 3.0