Releases · valdeanda/mebs

Data to customize Pfam searches using MEBS

The compressed directory contains:

my_Pfam.pfam.hmm: Pfam-A database from 29/8/18, obtained from ftp://ftp.ebi.ac.uk/pub/databases/Pfam/current_release/
entropies.tab: random entropy values that the script mebs.pl needs in order to run. They are not used to compute any score in the custom option.
pfam2kegg.tab: example mapping file

Steps to follow

To compute the metabolic completeness of your genomic or metagenomic sample with custom Pfams, do the following:

Download this file and place it under the cycles/pfam directory.

wget https://github.com/valdeanda/mebs/releases/download/custom_pfam/custom_pfam.tar.gz

Decompress the file

tar -xvzf custom_pfam.tar.gz
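Before going further, it can help to confirm that the three files listed above were actually extracted. The following is a minimal sketch, not part of MEBS: `check_custom_pfam` is a hypothetical helper name, and it assumes the archive unpacks into a `pfam/` directory, as the `less pfam/pfam2kegg.tab` command in the next step suggests.

```shell
# Hypothetical helper (not part of MEBS): report which of the three expected
# files are missing or empty in the given directory.
check_custom_pfam() {
    dir="$1"
    missing=0
    for f in my_Pfam.pfam.hmm entropies.tab pfam2kegg.tab; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f" >&2
            missing=1
        fi
    done
    # Returns 0 when all files are present, 1 otherwise
    return "$missing"
}
```

For example, `check_custom_pfam pfam && echo "all files present"` prints the message only when all three files exist and are non-empty.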

The mapping file pfam2kegg.tab contains the set of Pfam marker genes described in Peura et al. 2015 (https://www.nature.com/articles/srep12102). You can modify this list and add the Pfams you are interested in.

less pfam/pfam2kegg.tab

PFAM    KO      PATHWAY PATHWAY NAME
PF03598         1       WL
PF00101         2       Calvin Cycle
PF06240         3       CO oxidation 
PF14710         4       Denitrification
PF02665         4       Denitrification
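After editing the mapping file by hand, a quick sanity check can catch mistyped accessions before a long run. The sketch below is not part of MEBS: `check_mapping` is a hypothetical helper, and it only assumes that the first whitespace-separated field of every row after the header is a Pfam accession (`PF` followed by five digits), as in the example above.

```shell
# Hypothetical helper (not part of MEBS): flag rows whose first field is not
# a Pfam accession of the form PF followed by five digits.
check_mapping() {
    awk '
        NR > 1 && NF > 0 && $1 !~ /^PF[0-9][0-9][0-9][0-9][0-9]$/ {
            print "line " NR ": unexpected accession: " $1
            bad = 1
        }
        END { exit bad + 0 }   # non-zero exit status if any row was flagged
    ' "$1"
}
```

Running `check_mapping pfam/pfam2kegg.tab` prints nothing and exits with status 0 when every accession looks well formed.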

Modify the config file to add the path of the new Pfam database, as in the following example (last row). You don't need to specify the number of input genes or genomes for the Pfam database.

less config/config.txt

Cycle   Path    Comple  Input Genes     Input Genomes   Domains AUC     Score(FDR0.1)   Score(FDR0.01)  Score(FDR0.001) Score(FDR0.0001)
sulfur  cycles/sulfur/  cycles/sulfur/pfam2kegg.tab     152     161     112     0.985   4.156   8.049   10.816  12.285
carbon  cycles/carbon/  cycles/carbon/pfam2kegg.tab     135     90      119     0.988   9.735   18.744  34.26   34.908
oxygen  cycles/oxygen/          50      53      55      0.983   5.098   7.288   8.155   8.247
iron    cycles/iron/    cycles/iron/pfam2kegg.tab       36      34      112     0.863   7.412   9.571   10.241  10.322
nitrogen        cycles/nitrogen/        cycles/nitrogen/pfam2kegg.tab   267     144     176     0.791   15.974  17.7    18.785  19.03
pfam    cycles/pfam/    cycles/pfam/pfam2kegg.tab
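The last row can also be appended from the command line. This is a rough sketch under two assumptions that the listing above does not confirm: that config.txt is tab-separated, and that only the first three columns are needed for the pfam entry. `add_pfam_cycle` is a hypothetical helper name, not part of MEBS.

```shell
# Hypothetical helper (not part of MEBS): append the pfam row to a config
# file unless a row starting with "pfam" already exists.
# Assumption: columns in config.txt are separated by tabs.
add_pfam_cycle() {
    config="$1"
    if grep -q '^pfam[[:space:]]' "$config"; then
        echo "pfam entry already present in $config"
    else
        # Make sure the file ends with a newline so the new row is not
        # glued onto the previous one
        if [ -n "$(tail -c1 "$config")" ]; then
            echo >> "$config"
        fi
        printf 'pfam\tcycles/pfam/\tcycles/pfam/pfam2kegg.tab\n' >> "$config"
    fi
}
```

Calling `add_pfam_cycle config/config.txt` twice leaves a single pfam row, so the helper is safe to re-run.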

Run mebs.pl as usual, but keep in mind that you are now running hmmsearch against the entire Pfam database, so depending on the number and size of your samples this step might take a while. However, once your sample has been scanned against the entire Pfam database, you can modify the mapping file as many times as you want, and the completeness results will be recomputed in seconds.

 perl mebs.pl -input gen_test/ -type genomic -comp > test.pfam.tab

