Hey there! 👋 Welcome to my Cookiecutter-Docker-Research repo! If you've ever found yourself frustrated by trying to run deep learning projects on servers with messy, inconsistent environments, you're not alone.
Maybe you're still looking for an elegant remote development solution that works out-of-the-box, lets you connect and disconnect freely, and ensures no process is lost.
Or perhaps you want to package your entire project into a single file once it's done so you can save it, reproduce it anytime, or share it on GitHub without worrying about endless environmental issues.
Well, you've come to the right place! With Docker, PyTorch, and CUDA, you'll be up and running in under ten minutes.
Let's get started! 💪
- Rapid Setup: ⏰ Go from zero to hero in under ten minutes with easy, out-of-the-box configurations!
- PyTorch Integration with CUDA Support: 🚀 Harness the power of PyTorch and CUDA for cutting-edge deep learning applications, maximizing your GPU's potential with the PyTorch NGC Container.
- Non-root User Start-up: 🦪 Automatically configure a standard user without the complexity of Rootless mode, ensuring seamless access to files generated by the Docker container on the host.
- Easy Packaging and Reproduction: 📦 Simplify packaging and ensure easy reproducibility of your environment and projects.
- Optimized for Remote Development UX: 🔦 Enhance your development experience in Visual Studio Code with the Remote Development and Docker extensions.
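To illustrate the non-root idea above (a minimal sketch, not the template's actual Makefile logic): matching the container user to the host user's UID/GID is what keeps files written to bind mounts owned by you. The image name and mount path in the comment are hypothetical placeholders.

```shell
# Capture the host user's identity; passing it to `docker run --user`
# makes the container write bind-mounted files as that user.
HOST_UID=$(id -u)
HOST_GID=$(id -g)

# The actual invocation (hypothetical image/mount names) would look like:
#   docker run --user "${HOST_UID}:${HOST_GID}" -v "$PWD":/workspace my-research-image
echo "Host identity: ${HOST_UID}:${HOST_GID}"
```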
- A currently supported Linux release (using outdated Linux releases is not recommended)
- Docker Engine
- NVIDIA GPU Drivers (no need to install the NVIDIA CUDA Toolkit separately)
- NVIDIA Container Toolkit
- Cookiecutter
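A quick way to confirm these prerequisites are on your PATH (a minimal sketch; install commands vary by distro, and `nvidia-ctk` is the CLI shipped with the NVIDIA Container Toolkit):

```shell
# Report whether each required command is installed.
check_cmd() {
    command -v "$1" >/dev/null 2>&1 \
        && echo "OK: $1 found" \
        || echo "MISSING: $1"
}

check_cmd docker         # Docker Engine
check_cmd nvidia-smi     # NVIDIA GPU drivers
check_cmd nvidia-ctk     # NVIDIA Container Toolkit
check_cmd cookiecutter   # Cookiecutter
```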
Get your project set up with just five commands in ten minutes!
```shell
# Download and initialize the cookiecutter template in interactive mode
cookiecutter gh:venturi123/cookiecutter-docker-research

# Change the working directory to the project folder
cd {path/to/your/project}

# Build the NVIDIA PyTorch Docker image
make init

# Start a Docker container instance
make create-container

# Attach to the container
# Use this command anytime you lose connection
make attach-container

# Test your image and container, and verify the NVIDIA/CUDA environment
make verify-cuda

# Finished!
echo "Everything is done! Enjoy it!"
```
Troubleshooting: Something went wrong? Want to remove everything? No problem!
```shell
# Remove all images and containers
make clean-docker

# Or just remove the container
make clean-container

# Be careful with the following command.
# It will remove the Docker image/container and also purge the project folder.
# Remove everything as if nothing happened
make destroy
```
Archiving: Finished the whole project and want to archive it? Not trusting network storage? Want to keep everything local? Just one command!
> [!CAUTION]
> This feature comes with no warranty regarding data safety. Follow proper backup protocols; data backups should follow the 3-2-1 rule (three copies, on two different media, with one kept off-site).
```shell
# The entire workdir and Docker environment will be packaged within a single file
make archive
```
The container will first be archived as `{project_name}-image-{date}.tar` in the `{project}` directory. Then, `{project_name}.tar.gz` and `{project_name}.tar.gz.sha256` files will be generated in the same directory.
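The checksum pair works like any SHA-256 manifest. Here is a toy round-trip with stand-in filenames (not the template's real outputs) showing the generate/verify pattern the archive step relies on:

```shell
# Create a stand-in archive and emit its checksum file,
# then verify it the same way you would after `make archive`.
printf 'demo payload' > demo.tar.gz
sha256sum demo.tar.gz > demo.tar.gz.sha256

RESULT=$(sha256sum -c demo.tar.gz.sha256)
echo "$RESULT"    # reports: demo.tar.gz: OK

rm demo.tar.gz demo.tar.gz.sha256   # clean up the toy files
```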
Reproduction: Want to relive your experience from years ago but can't remember what you did on that unremarkable afternoon?
```shell
# Verify via SHA-256 that the archive hasn't been altered or corrupted
sha256sum -c {project_name}.tar.gz.sha256

# Unpack the archive
tar xvf {project_name}.tar.gz
cd {project_name}

# Reload the image
make reproduce

# Start a Docker container instance
make create-container

# Attach to the container
# Use this command anytime you lose connection
make attach-container
```
Just run these commands in the unpacked folder. It's that simple!
Here's a shell script consolidating fixes for a longstanding NVIDIA Container Toolkit issue (containers losing GPU access, e.g. "Failed to initialize NVML: Unknown Error"), based on the following references:
- NVIDIA/nvidia-container-toolkit#48
- NVIDIA/nvidia-container-toolkit#381 (comment)
- NVIDIA/nvidia-docker#1671 (comment)
- NVIDIA/nvidia-container-toolkit#386 (comment)
```shell
# Update /etc/nvidia-container-runtime/config.toml to set no-cgroups to false
sudo sed -i 's/^no-cgroups = true/no-cgroups = false/' /etc/nvidia-container-runtime/config.toml && echo "Set no-cgroups to false in config.toml."

# Back up /etc/docker/daemon.json if it exists, then update it with the new configuration
[ -f /etc/docker/daemon.json ] && sudo cp /etc/docker/daemon.json /etc/docker/daemon.json.backup && echo "Backup created for daemon.json as daemon.json.backup."

# Write the necessary configuration to /etc/docker/daemon.json
sudo tee /etc/docker/daemon.json > /dev/null <<'EOF'
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    },
    "exec-opts": ["native.cgroupdriver=cgroupfs"]
}
EOF
echo "Updated /etc/docker/daemon.json with a new configuration."
```
This script will apply the consolidated fixes by updating configurations for both Nvidia-container-runtime and Docker. Each step includes feedback messages for clarity.
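A malformed `daemon.json` will prevent the Docker daemon from starting, so it's worth confirming the file still parses as JSON before restarting Docker (a restart, e.g. `sudo systemctl restart docker`, is needed for the changes to take effect). A small sketch using `python3` purely as a JSON validator, exercised on a throwaway file rather than the real `daemon.json`:

```shell
# Validate that a file parses as JSON; prints "valid" or "invalid".
validate_json() {
    python3 -m json.tool "$1" >/dev/null 2>&1 && echo "valid" || echo "invalid"
}

# Exercise it on a throwaway file (hypothetical path):
printf '{"default-runtime": "nvidia"}' > /tmp/daemon.json.check
validate_json /tmp/daemon.json.check   # prints: valid
```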
Contributions are welcome! If you have improvements or bug fixes, please feel free to fork this repository and submit a pull request.
Big shoutout to Cookiecutter and cookiecutter-docker-science for the major inspiration! Also, massive thanks to everyone contributing to the PyTorch community and the CUDA wizards at NVIDIA for the tech support.
Feel free to fork, star, and contribute! Happy coding! 🙌