PomBase code for processing domains

This program processes the match_complete.xml.gz file from InterPro and also runs TMHMM, then writes the combined domain information to a JSON file.

The latest InterPro file is available from: https://ftp.ebi.ac.uk/pub/databases/interpro/current_release/

UniProt IDs for S. pombe proteins are queried from PostgreSQL. Those IDs are used to filter the InterPro file, so only matches for pombe proteins are kept.
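The filtering step can be pictured with a minimal sketch. It is not the code in src: the `ProteinMatch` struct, the `filter_by_uniprot_ids` function and the IDs in the usage note are hypothetical names chosen for illustration. It assumes the IDs from PostgreSQL have already been collected into a `HashSet`.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for one match read from the InterPro XML.
/// The real program's types are likely to differ.
#[derive(Debug, Clone, PartialEq)]
pub struct ProteinMatch {
    pub uniprot_id: String,
    pub interpro_id: String,
}

/// Keep only the matches whose UniProt ID is in `wanted`.
/// `wanted` plays the role of the ID set queried from PostgreSQL.
pub fn filter_by_uniprot_ids(
    matches: impl IntoIterator<Item = ProteinMatch>,
    wanted: &HashSet<String>,
) -> Vec<ProteinMatch> {
    matches
        .into_iter()
        .filter(|m| wanted.contains(&m.uniprot_id))
        .collect()
}
```

Calling `filter_by_uniprot_ids` with a set containing only the made-up ID "ID_A" drops every match whose `uniprot_id` is anything else. A hash set keeps each lookup cheap, which matters because the InterPro file covers far more proteins than the pombe proteome.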

Protein sequences are queried from PostgreSQL and are passed to TMHMM. We run TMHMM in a separate thread while the InterPro XML is parsed and processed.
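The overlap between the TMHMM run and the XML parsing can be sketched with `std::thread`. This is an illustration under stated assumptions, not the program's actual implementation: `run_tmhmm_stub` and `parse_interpro_stub` are hypothetical placeholders that stand in for the external TMHMM call and the real XML parser.

```rust
use std::thread;

/// Placeholder for the TMHMM run. The real program calls the external
/// TMHMM executable; this stub only returns each sequence's length.
pub fn run_tmhmm_stub(sequences: Vec<(String, String)>) -> Vec<(String, usize)> {
    sequences
        .into_iter()
        .map(|(id, seq)| (id, seq.len()))
        .collect()
}

/// Placeholder for InterPro XML parsing: counts lines that open a
/// <protein> element.
pub fn parse_interpro_stub(lines: &[&str]) -> usize {
    lines.iter().filter(|l| l.contains("<protein")).count()
}

/// Start the TMHMM work on a separate thread, parse the XML on the
/// current thread, then wait for the TMHMM results.
pub fn process_concurrently(
    sequences: Vec<(String, String)>,
    xml_lines: &[&str],
) -> (Vec<(String, usize)>, usize) {
    // The sequences are moved into the worker thread.
    let tmhmm_handle = thread::spawn(move || run_tmhmm_stub(sequences));

    // Parsing proceeds while the worker thread is running.
    let protein_count = parse_interpro_stub(xml_lines);

    // Block until the worker finishes, then combine both results.
    let tmhmm_results = tmhmm_handle.join().expect("TMHMM thread panicked");
    (tmhmm_results, protein_count)
}
```

The two tasks share no data, so the worker thread can own its input outright and no locking is needed. The only synchronisation point is the final `join`.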

Running

Run with:

PATH=$PATH_TO_TMHMM_EXE:$PATH /var/pomcur/bin/pombase-interpro \
    -p "postgres://<username>:<password>@localhost/<dbname>" \
    -i <(gzip -d < match_complete.xml.gz) -o pombe_domain_results.json
