Cookiecutter-Docker-Research

Launch your ready-to-run PyTorch + CUDA deep learning project in just 10 minutes! 🦩

Hey there! 👋 Welcome to my Cookiecutter-Docker-Research repo! If you've ever found yourself frustrated by trying to run deep learning projects on servers with messy, inconsistent environments, you're not alone.

Maybe you're still looking for an elegant remote development solution that works out-of-the-box, lets you connect and disconnect freely, and ensures no process is lost.

Or perhaps you want to package your entire project into a single file once it's done, so you can archive it, reproduce it anytime, or share it on GitHub without worrying about endless environment issues.

Well, you've come to the right place! With Docker, PyTorch, and CUDA, you'll be up and running in under ten minutes.

Let's get started! 💪

Features

  • Rapid Setup: ⏰ Go from zero to hero in under ten minutes with easy, out-of-the-box configurations!
  • PyTorch Integration with CUDA Support: 🚀 Harness the power of PyTorch and CUDA for cutting-edge deep learning applications, maximizing your GPU's potential with the PyTorch NGC Container.
  • Non-root User Start-up: 🦪 Automatically configure a standard user without the complexity of Rootless mode, ensuring seamless access to files generated by the Docker container on the host.
  • Easy Packaging and Reproduction: 📦 Simplify packaging and ensure easy reproducibility of your environment and projects.
  • Optimized for Remote Development UX: 🔦 Enhance your development experience in Visual Studio Code with the Remote Development and Docker extensions.
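The non-root start-up typically works by passing the host user's numeric UID/GID into the image, so files written from inside the container are owned by you on the host. Here is a minimal sketch of that idea (an assumption about the mechanism; check the generated Dockerfile/Makefile for the template's actual implementation):

```shell
# Capture the host user's numeric IDs (assumed mechanism; the template
# may wire these in differently).
HOST_UID="$(id -u)"
HOST_GID="$(id -g)"
echo "container user will run as ${HOST_UID}:${HOST_GID}"

# These values can then be forwarded at run time, for example:
#   docker run --user "${HOST_UID}:${HOST_GID}" -v "$PWD:/workspace" <image> ...
```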

Getting Started

Prerequisites
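The template assumes a Linux host with Docker Engine, an NVIDIA driver, the NVIDIA Container Toolkit, cookiecutter, and make installed. A minimal pre-flight check might look like this (the tool list is an assumption about a standard setup; adjust it to yours):

```shell
# require prints "ok: <tool>" if the command is on PATH,
# otherwise a warning on stderr.
require() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1 -- please install it before continuing" >&2
    fi
}

require docker        # Docker Engine
require nvidia-smi    # NVIDIA driver
require cookiecutter  # template tool (pip install cookiecutter)
require make          # drives the project Makefile
```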

Quick Start 🌪️

Get your project set up with just a few commands in about ten minutes!

# Download and initialize the cookiecutter template in interactive mode
cookiecutter gh:venturi123/cookiecutter-docker-research

# Change the working directory to the project folder
cd {path/to/your/project}

# Build the Nvidia PyTorch Docker image
make init

# Start a Docker container instance
make create-container

# Attach to the container
# Use this command anytime you lose connection
make attach-container

# Test your image and container, and verify the NVIDIA/CUDA environment
make verify-cuda

# Finished!
echo "Everything is done! Enjoy it!"

Troubleshooting: Something went wrong? Want to remove everything? No problem!

# Remove all images and containers
make clean-docker

# Or just remove the container
make clean-container

# Be careful with the following command.
# It will remove the Docker image/container and also purge the project folder.
# Remove everything as if nothing happened
make destroy

Archiving: Finished the whole project and want to archive it? Not trusting network storage? Want to keep everything local? Just one command!

Caution

This feature is provided without any warranty for data security. It's recommended to follow proper backup protocols. Data backups should follow the 3-2-1 rule.

# The entire workdir and Docker environment will be packaged within a single file
make archive

The Docker image will first be archived as {project_name}-image-{date}.tar in the {project} directory. Then, {project_name}.tar.gz and {project_name}.tar.gz.sha256 will be generated in the same directory.

Reproduction: Want to relive your experience from years ago but can't remember what you did on that unremarkable afternoon?

# Verify the integrity of the archive file to ensure it hasn't been altered or corrupted via SHA256
sha256sum -c {project_name}.tar.gz.sha256

# Unzip the archive
tar xvf {project_name}.tar.gz
cd {project_name}

# Reload the image
make reproduce

# Start a Docker container instance
make create-container

# Attach to the container
# Use this command anytime you lose connection
make attach-container

Just run these commands in the project folder. It's that simple!

Known Issue

Containers Losing Access to GPUs with Error: "Failed to initialize NVML: Unknown Error"

Here's a shell script consolidating the commonly suggested workarounds for this longstanding nvidia-container-toolkit issue:

# Update /etc/nvidia-container-runtime/config.toml to set no-cgroups to false
sudo sed -i 's/^no-cgroups = true/no-cgroups = false/' /etc/nvidia-container-runtime/config.toml && echo "Set no-cgroups to false in config.toml."

# Backup /etc/docker/daemon.json if it exists, then update with new configuration
[ -f /etc/docker/daemon.json ] && sudo cp /etc/docker/daemon.json /etc/docker/daemon.json.backup && echo "Backup created for daemon.json as daemon.json.backup."

# Add the necessary configuration to /etc/docker/daemon.json
sudo tee /etc/docker/daemon.json > /dev/null <<'EOF'
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    },
    "exec-opts": ["native.cgroupdriver=cgroupfs"]
}
EOF

echo "Updated /etc/docker/daemon.json with a new configuration."

This script will apply the consolidated fixes by updating configurations for both Nvidia-container-runtime and Docker. Each step includes feedback messages for clarity.
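Note that changes to daemon.json only take effect after the Docker daemon is restarted. For example (assumes a systemd-based host; the CUDA image tag is just an example used for verification):

```shell
# Restart Docker so the updated daemon.json is picked up
sudo systemctl restart docker

# Verify that containers can see the GPU again
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```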

Contributions

Contributions are welcome! If you have improvements or bug fixes, please feel free to fork this repository and submit a pull request.

Acknowledgements

Big shoutout to Cookiecutter and cookiecutter-docker-science for the major inspiration! Also, massive thanks to everyone contributing to the PyTorch community and the CUDA wizards at NVIDIA for the tech support.

Feel free to fork, star, and contribute! Happy coding! 🙌
