phastDNA

Getting started

phastDNA can be run as a CLI program or as a GUI program. Both versions have exactly the same computational capabilities. The GUI app is a user-friendly version intended for users who are not bioinformatics experts or who are unfamiliar with the terminal.

System Requirements

Software requirements depend on how you intend to run the app.

Running phastDNA from CLI only:

Python 3.8 or newer
UNIX-based operating system

Full phastDNA (CLI + GUI):

Python 3.8 or newer
UNIX-based operating system
Modern web browser (Chromium-based browsers are preferred, e.g. Google Chrome)
Internet connection
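
The software requirements above can be checked before installation with a short script. This is a minimal sketch, not part of phastDNA: the function name `check_software` is hypothetical, and "UNIX-based" is approximated here as "anything that is not Windows", which is an assumption.

```python
import platform
import sys

MIN_PYTHON = (3, 8)  # minimum Python version listed in the requirements above


def check_software(version_info=sys.version_info, system=None):
    """Return a list of human-readable problems; an empty list means OK."""
    system = platform.system() if system is None else system
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        found = ".".join(str(part) for part in version_info[:2])
        problems.append(f"Python 3.8 or newer is required, found {found}")
    # Assumption: treat every non-Windows platform as UNIX-based.
    if system == "Windows":
        problems.append("A UNIX-based operating system is required")
    return problems
```

Calling `check_software()` with no arguments inspects the running interpreter and the current platform.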

Hardware requirements:

Depending on the use case, the hardware requirements vary.

Prediction

multi-threaded processor - the more threads, the faster phastDNA should run
at least 4 GB RAM - this is a safe number for the Edwards et al. dataset. Larger datasets, or prediction with models trained on longer k-mers or on more samples, will likely need more RAM

Training

multi-threaded processor - the more threads, the faster phastDNA should run
at least 10 GB RAM - this is a safe number, since 10 GB was enough for most runs on the Edwards et al. dataset. RAM usage also depends on the settings: larger minimum and maximum k-mer sizes and a larger number of input sequences both increase memory allocation.
10 GB of disk space, under the same conditions as the memory requirement. An SSD is recommended, since phastDNA performs many I/O operations. An HDD will work but will be significantly slower.
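
The hardware figures above can be compared against the current machine. The sketch below is an illustration only: `hardware_report` and `total_ram_bytes` are hypothetical helpers, 1 GB is taken as 10^9 bytes, and total RAM is read through POSIX `sysconf`, which may be unavailable on some systems (the RAM check is then reported as `None`). No disk requirement is stated for prediction, so the disk check only applies to training.

```python
import os
import shutil

GB = 10 ** 9  # assumption: decimal gigabytes


def total_ram_bytes():
    """Total physical memory via POSIX sysconf, or None if unavailable."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        return None


def hardware_report(path=".", task="training"):
    """Compare this machine with the figures quoted above for the given task."""
    # 10 GB RAM and 10 GB disk for training, 4 GB RAM for prediction.
    need_ram_gb = 10 if task == "training" else 4
    need_disk_gb = 10 if task == "training" else 0
    ram = total_ram_bytes()
    free = shutil.disk_usage(path).free
    return {
        "threads": os.cpu_count(),
        "ram_ok": None if ram is None else ram >= need_ram_gb * GB,
        "disk_ok": free >= need_disk_gb * GB,
    }
```

These are the baseline figures for the Edwards et al. dataset only. Larger k-mer sizes or more input sequences may need more than this check assumes.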

Installation

Clone this repository or download the latest version from the Releases section:

git clone https://github.com/phenolophthaleinum/phastDNA.git
cd phastDNA

Create a virtual environment:

python -m venv venv && source venv/bin/activate

Install dependencies:

pip install -r requirements.txt

Optional: build fastDNA from source

Note: This step is required only if the included binaries do not work.

Warning: Only the fastDNA source included in this repository is compatible with phastDNA. If you want to make changes, modify the included code, not the code from the original fastDNA repository.

View fastDNA README

Optional: download Edwards et al. dataset for training

Currently, this is the only dataset compatible with training. Standardising the use of other datasets is the highest priority. The original link to the dataset is no longer available, but a copy is hosted by the AMU Computational Biology Department:

wget http://combio.pl/files/edwards2016.zip
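
Where `wget` is not available, the same download can be scripted with the Python standard library. This is a sketch under assumptions: the helper names `extract_zip` and `download_dataset` are hypothetical, and the internal layout of the archive is not described here, so the code simply extracts whatever the archive contains.

```python
import tempfile
import urllib.request
import zipfile
from pathlib import Path

DATASET_URL = "http://combio.pl/files/edwards2016.zip"  # URL given above


def extract_zip(zip_path, dest):
    """Extract zip_path into dest and return the sorted member names."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return sorted(archive.namelist())


def download_dataset(dest, url=DATASET_URL):
    """Download the archive to a temporary file, then extract it into dest."""
    with tempfile.TemporaryDirectory() as tmp:
        zip_path = Path(tmp) / "dataset.zip"
        urllib.request.urlretrieve(url, zip_path)
        return extract_zip(zip_path, dest)
```

For example, `download_dataset("edwards2016")` would place the extracted files in an `edwards2016` directory, a name chosen here only for illustration.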

Usage

CLI version

Only minimal use cases are shown here. For a full description of the available options, run:

python phastdna.py -h

Prediction

python phastdna.py -O output_dir/ -C path_to_classifier/ -v path_to_virus_fastas/

Training

python phastdna.py -O output_dir/ -H path_to_host_dataset/ -V path_to_virus_dataset/
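
When phastDNA is driven from another script, the two invocations above can be assembled programmatically. The sketch below uses only the flags shown in this section (`-O`, `-C`, `-v`, `-H`, `-V`). The wrapper functions `build_command` and `run` are hypothetical and not part of phastDNA, and the working directory is assumed to be the cloned repository.

```python
import subprocess
import sys


def build_command(output_dir, *, classifier=None, viruses=None,
                  host_dataset=None, virus_dataset=None):
    """Build the argument list for a prediction or a training run."""
    cmd = [sys.executable, "phastdna.py", "-O", str(output_dir)]
    if classifier is not None and viruses is not None:
        # Prediction: trained classifier plus virus FASTA files.
        cmd += ["-C", str(classifier), "-v", str(viruses)]
    elif host_dataset is not None and virus_dataset is not None:
        # Training: host dataset plus virus dataset.
        cmd += ["-H", str(host_dataset), "-V", str(virus_dataset)]
    else:
        raise ValueError("give classifier+viruses or host_dataset+virus_dataset")
    return cmd


def run(cmd, cwd="."):
    """Run the command in cwd (assumed to be the phastDNA checkout)."""
    return subprocess.run(cmd, cwd=cwd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.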

GUI version

To run the webapp version, execute:

python phastDNA_gui.py

The webapp is served from your computer's localhost. The address is printed once the app has started. Open that address in your browser to access the phastDNA GUI. It will most likely be: http://127.0.0.1:5000/
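
If the browser should open automatically once the server is ready, a small helper can poll the port first. This is a sketch, assuming the default address quoted above. The helpers `wait_for_port` and `open_gui` are hypothetical, and the port should be adjusted if the app prints a different address.

```python
import socket
import time
import webbrowser


def wait_for_port(host="127.0.0.1", port=5000, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, else False."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Server not accepting connections yet; retry after a short pause.
            time.sleep(interval)
    return False


def open_gui(host="127.0.0.1", port=5000):
    """Open the GUI in the default browser once the server is reachable."""
    if wait_for_port(host, port):
        return webbrowser.open(f"http://{host}:{port}/")
    return False
```

Run `open_gui()` from a second terminal after starting `python phastDNA_gui.py`.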

Roadmap

License

Distributed under the GNU General Public License v3.0. See the LICENSE file for more information.

References
