Susann Vorberg edited this page Jun 5, 2018 · 5 revisions

Welcome to the Wiki for CCMgen and CCMpredPy

CCMgen is a Python toolkit for sampling protein-like sequences from a second-order Markov Random Field model. Because the model includes second-order (pairwise) interactions, the sampled protein sequences are more realistic than those sampled from, e.g., a model that uses only a PSSM representation.
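To make the difference between a PSSM and a second-order model concrete, here is a minimal, self-contained sketch of Gibbs sampling from a toy pairwise model. It is not CCMgen's implementation: the parameter layout (`v` for single-position fields, `w` for pairwise couplings), the random toy parameters, and all function names are assumptions made purely for illustration. Setting every coupling in `w` to zero reduces the model to independent per-position distributions, i.e. a PSSM-like model.

```python
import math
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
Q = len(ALPHABET)


def random_model(length, rng, scale=0.5):
    """Build toy parameters (hypothetical layout, not CCMgen's file format).

    v[i][a]         : single-position field for state a at position i
    w[(i, j)][a][b] : coupling between state a at i and state b at j, i < j
    """
    v = [[rng.gauss(0.0, scale) for _ in range(Q)] for _ in range(length)]
    w = {
        (i, j): [[rng.gauss(0.0, scale) for _ in range(Q)] for _ in range(Q)]
        for i in range(length)
        for j in range(i + 1, length)
    }
    return v, w


def conditional(i, seq, v, w):
    """Return p(x_i = a | all other positions) for every state a."""
    logits = []
    for a in range(Q):
        s = v[i][a]
        for j, b in enumerate(seq):
            if j < i:
                s += w[(j, i)][b][a]
            elif j > i:
                s += w[(i, j)][a][b]
        logits.append(s)
    m = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(x - m) for x in logits]
    total = sum(weights)
    return [x / total for x in weights]


def gibbs_sample(v, w, rng, sweeps=50):
    """Start from a random sequence and resample each position in turn."""
    length = len(v)
    seq = [rng.randrange(Q) for _ in range(length)]
    for _ in range(sweeps):
        for i in range(length):
            probs = conditional(i, seq, v, w)
            seq[i] = rng.choices(range(Q), weights=probs)[0]
    return "".join(ALPHABET[a] for a in seq)
```

For example, `rng = random.Random(0); v, w = random_model(8, rng); print(gibbs_sample(v, w, rng))` prints one sampled sequence of length 8. With real parameters, `v` and `w` would come from a model learned on a multiple sequence alignment rather than from random draws.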

CCMgen is accompanied by CCMpredPy, a fast Python implementation of an evolutionary coupling method for learning a Markov Random Field model from the multiple sequence alignment of a protein family. The model can be learned by either state-of-the-art pseudo-likelihood maximization (less accurate) or persistent contrastive divergence (recommended for use with CCMgen). The coupling potentials encoded by the learned Markov Random Field model can be used with CCMgen to generate new sequences.

For license information and installation instructions, please see the README file in the GitHub repository.

Once you have installed everything, you can refer to the getting started guide for a short tutorial on how to use CCMgen and CCMpredPy with the example data provided in the repository.

For detailed information on both tools, including command-line options, refer to the documentation in this Wiki: