Run multiple DA tools on datasets using Docker Image
Welcome to the Comparison_of_DA_microbiome_methods wiki!
This page describes how to use a Docker image that bundles multiple microbiome differential abundance (DA) tools. If you already have a working Docker installation, you can skip ahead to "Running multiple DA tools with Docker Image".
Docker can be installed on Linux, macOS, and Windows. Installation instructions for each operating system:
https://docs.docker.com/engine/install/ubuntu/
https://docs.docker.com/docker-for-windows/install/
https://docs.docker.com/docker-for-mac/install/
=== Tip to change the docker installation directory (where it stores the generated image) from the default / ===
On Ubuntu, Docker stores its images and containers under /var/lib/docker by default, which is usually on the "/" partition. Building, pulling, and using Docker images takes up a sizable amount of disk space, so the "/" partition can fill up. The following procedure moves Docker's default storage from "/" to a folder on a partition with sufficient space. It uses a systemd drop-in file to control how the Docker daemon is started:
sudo mkdir -p /etc/systemd/system/docker.service.d
sudo nano /etc/systemd/system/docker.service.d/docker-storage.conf
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H fd:// --data-root="/path/to/new/docker_storage folder"
sudo systemctl daemon-reload
sudo systemctl restart docker
Verify the change with the following command. The reported directory should now be "/path/to/new/docker_storage folder":
docker info|grep -P "Docker Root Dir"
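As a non-interactive alternative to editing the drop-in file in nano, the same three lines can be generated by a small shell function and written with tee. This is only a sketch: the function name render_docker_dropin and the example path /mnt/data/docker are hypothetical, and the dockerd command line is copied from the steps above.

```shell
# Print a systemd drop-in that points dockerd at a new data directory.
# $1: absolute path of the new storage directory.
# render_docker_dropin is a hypothetical helper, not part of Docker or systemd.
render_docker_dropin() {
    data_root=$1
    case $data_root in
        /*) ;;  # absolute path, accepted
        *) echo "data root must be an absolute path: $data_root" >&2; return 1 ;;
    esac
    printf '[Service]\n'
    # The empty ExecStart= clears the packaged start command before redefining it.
    printf 'ExecStart=\n'
    printf 'ExecStart=/usr/bin/dockerd -H fd:// --data-root="%s"\n' "$data_root"
}

# Usage (commented out because it modifies system files; the path is an example):
# render_docker_dropin /mnt/data/docker | sudo tee /etc/systemd/system/docker.service.d/docker-storage.conf
```

After writing the file this way, the daemon-reload and restart commands shown above are still required.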
To run Docker commands without sudo, add your user to the docker group:
sudo groupadd docker
sudo gpasswd -a $USER docker
newgrp docker
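To confirm that the group change is active in the current shell, the group list reported by id can be checked. This is a minimal sketch; the function name current_user_in_group is hypothetical.

```shell
# Return success if the current shell session belongs to group $1.
# current_user_in_group is a hypothetical helper name.
current_user_in_group() {
    # id -nG prints the group names of the current process, space-separated.
    id -nG | tr ' ' '\n' | grep -Fqx -- "$1"
}

# Usage:
# current_user_in_group docker && echo "docker group is active in this shell"
```

If the check fails right after running gpasswd, logging out and back in (or running newgrp docker as above) should make the new membership visible.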
Pull the image from Docker Hub:
docker pull dockerdkd/hackathon2021
This image can be used in three different ways:
- Running all the DA tools on any given individual dataset
- Running all the DA tools on all the datasets available at https://figshare.com/articles/dataset/16S_rRNA_Microbiome_Datasets/14531724
- Running all the analysis scripts mentioned in Nearing et al., 2021 https://www.biorxiv.org/content/10.1101/2021.05.10.443486v1
The command for running all DA tools on an individual dataset assumes that the input directory contains at least three files for that dataset, named as follows:
- $DATASETTAG_genus_table.tsv - the ASV table file at the genus level
- $DATASETTAG_meta.tsv - the metadata table
- $DATASETTAG_genus_table_rare.tsv - the rarefied ASV table at the genus level
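Before starting the container, it can help to confirm that the three files listed above exist with the expected names. The following is a rough pre-flight check under that naming convention; the function name check_da_inputs is hypothetical and not part of the image.

```shell
# Report which of the three expected input files are missing.
# $1: input directory, $2: dataset tag (the value passed as DATASETTAG).
# Prints one "missing: <path>" line per absent file and returns 1 if any is absent.
# check_da_inputs is a hypothetical helper name.
check_da_inputs() {
    input_dir=$1
    tag=$2
    missing=0
    for suffix in genus_table.tsv meta.tsv genus_table_rare.tsv; do
        file="$input_dir/${tag}_${suffix}"
        if [ ! -f "$file" ]; then
            echo "missing: $file"
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
# check_da_inputs "$PWD/Hackathon/Studies/ArcticFreshwaters" ArcticFreshwaters
```

The check only looks at file names; it does not validate the contents of the tables.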
mkdir DAtools_output
docker run --user root -it -e "DATASETTAG=ArcticFreshwaters" -e "DEPTH=2000" -e "FILTER=0.1" -v "$PWD/Hackathon/Studies/ArcticFreshwaters/":/home/hackathonuser/Input_data -v "$PWD/DAtools_output/":/home/hackathonuser/output dockerdkd/hackathon2021:latest
In the above example, we used the Docker container to run all the DA tools on the ArcticFreshwaters dataset, using a rarefaction depth of 2000 and an occurrence filter of 0.1 percent, which removes taxa contributing <= 0.1 percent of the total.
By default, the generated output files are owned by "root", so we need to change the ownership of the output folder:
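To make the mapping between parameters and the command explicit, the example can be wrapped in a function that prints the docker run command for a given dataset instead of executing it. This is a sketch only: the function name da_run_command and its argument order are hypothetical, while the environment variables, mount points, and image name are taken from the example above.

```shell
# Print (do not execute) the docker run command for one dataset.
# $1: dataset tag, $2: rarefaction depth, $3: occurrence filter,
# $4: absolute input directory, $5: absolute output directory.
# da_run_command is a hypothetical helper name.
da_run_command() {
    printf 'docker run --user root -it'
    printf ' -e "DATASETTAG=%s" -e "DEPTH=%s" -e "FILTER=%s"' "$1" "$2" "$3"
    printf ' -v "%s:/home/hackathonuser/Input_data"' "$4"
    printf ' -v "%s:/home/hackathonuser/output"' "$5"
    printf ' dockerdkd/hackathon2021:latest\n'
}

# Usage: print the command for review, then run it once it looks right.
# da_run_command ArcticFreshwaters 2000 0.1 \
#     "$PWD/Hackathon/Studies/ArcticFreshwaters" "$PWD/DAtools_output"
```

Printing the command first makes it easy to spot a mistyped path or tag before a long run starts.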
sudo chown -R "$USER" DAtools_output
sudo chgrp -R "$USER" DAtools_output
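To check that the ownership change reached every file, find can list anything in the output folder that still belongs to another user. This is a small sketch; the function name list_not_owned is hypothetical.

```shell
# List files and directories under $1 that are not owned by user $2.
# $2 may be a user name or a numeric user ID.
# list_not_owned is a hypothetical helper name.
list_not_owned() {
    find "$1" ! -user "$2" -print
}

# Usage: no output means everything under the folder belongs to you.
# list_not_owned DAtools_output "$USER"
```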