This repository has been archived by the owner on Feb 12, 2023. It is now read-only.

Container setup for Deep Learning model training


NoUnique/dev.docker


Docker-based Deep Learning Model Development Environment

This repository contains a Docker-based development environment for deep learning model development.

To use it, add this repository as a Git submodule inside the project folder of the code you want to develop.

The script creates exactly one container per project and generates the container name automatically.
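As a minimal sketch of the submodule setup, the commands could look like the following. It assumes the repository is fetched from its GitHub URL (inferred from the repository name, not stated in this README) and placed in a folder named `docker`, which matches the `./docker/compose.sh` path used in the usage section. The helper name `add_dev_docker` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: add dev.docker as a submodule of the current project.
# The default URL is an assumption derived from the repository name NoUnique/dev.docker.
add_dev_docker() {
    local url="${1:-https://github.com/NoUnique/dev.docker.git}"
    local dest="${2:-docker}"   # folder name assumed from ./docker/compose.sh
    git submodule add "$url" "$dest" &&
        git submodule update --init --recursive
}

# Usage, run from the root of your project's Git repository:
#   add_dev_docker
```

Keeping the environment in a submodule means the project pins a specific revision of it, so collaborators who clone with `--recurse-submodules` get the same setup.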


Prerequisites

Tested environment

  • Ubuntu LTS (16.04 or 18.04)
  • docker-ce (>= 18.09)
  • docker-compose (>= 1.25.4) - This script will automatically upgrade docker-compose if necessary
  • nvidia-docker2 (>= 2.0.3)
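The minimum versions listed above can be checked before running the script. The following is a rough sketch, not part of this repository: the helper names `version_ge` and `check_min` are hypothetical, and the comparison relies on `sort -V` from GNU coreutils.

```shell
#!/usr/bin/env bash
# version_ge A B: succeeds when version A >= version B.
# Relies on `sort -V` (GNU coreutils, present on Ubuntu).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# check_min NAME INSTALLED REQUIRED: print a one-line verdict.
check_min() {
    if version_ge "$2" "$3"; then
        echo "$1 $2 OK (>= $3)"
    else
        echo "$1 $2 is too old (need >= $3)"
    fi
}

# Query installed versions only when the tools exist on this machine.
if command -v docker >/dev/null 2>&1; then
    check_min docker "$(docker version --format '{{.Client.Version}}' 2>/dev/null)" 18.09
fi
if command -v docker-compose >/dev/null 2>&1; then
    check_min docker-compose "$(docker-compose version --short 2>/dev/null)" 1.25.4
fi
```

A version-aware sort is used instead of a plain string comparison because, for example, `1.9.0` is older than `1.25.4` even though it sorts later as text.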

How to use

  • Build docker image

    {PJT_DIR}$ ./docker/compose.sh -b

    -b : build an image
    -r : run a container
    -s : connect to a shell (bash) in the container
    -k : kill the container (attach and kill)
    -d : bring the container down (kill the container and remove the container, network and volumes)
    -t, --tensorboard : run TensorBoard (default path: /home/${USER}/dev/checkpoints, port: 6006)
    -j, --jupyter : run a Jupyter Notebook server (default port: 8888)
    --no-cache : build the image from scratch (without using the cache)

    This script creates ONLY ONE CONTAINER.

  • Attach to the container's shell using docker-compose

    {PJT_DIR}$ ./docker/compose.sh -s

  • Combine multiple options in a single argument

    {PJT_DIR}$ ./docker/compose.sh -brs

    build -> run -> attach to shell (executed sequentially)

  • Build and run a specific service

    {PJT_DIR}$ ./docker/compose.sh -bj

    build the Jupyter image -> run the Jupyter service (executed sequentially)
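To illustrate how a combined argument such as `-brs` can expand into sequential steps, here is a minimal sketch using the bash `getopts` builtin. It is an assumption about the general technique, not the actual code of `compose.sh`: the function name `plan_actions` is hypothetical, it only prints the planned steps instead of calling docker, and it handles only the short options (long options like `--tensorboard` would need separate handling).

```shell
#!/usr/bin/env bash
# Hypothetical sketch: turn combined short options into an ordered action list.
plan_actions() {
    local OPTIND=1 opt actions=""
    # getopts splits "-brs" into b, r, s and yields them in the order given.
    while getopts "brskd" opt "$@"; do
        case "$opt" in
            b) actions="$actions build" ;;
            r) actions="$actions run" ;;
            s) actions="$actions shell" ;;
            k) actions="$actions kill" ;;
            d) actions="$actions down" ;;
            *) return 1 ;;   # unknown option: getopts already printed an error
        esac
    done
    # Strip the leading space and print the plan on one line.
    printf '%s\n' "${actions# }"
}

plan_actions -brs
```

Because `getopts` processes the letters left to right, the order in which options are written determines the order of the steps in this sketch.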


Example:
Ubuntu user ID: nounique
Directory name: _GIT_REPO
Script location: _GIT_REPO/docker
Build & run result:

$ docker ps
CONTAINER ID  IMAGE             COMMAND      NAMES
78487a99355e  gitrepo:nounique  "/bin/bash"  nounique_gitrepo_dev_1
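The names in this example suggest a pattern: the directory name is lowercased and stripped of non-alphanumeric characters, then combined with the user ID. The sketch below reproduces the example output under that assumption. The rule is inferred from the single example above, not taken from the script, and the helper names are hypothetical; the `dev` service name and the `_1` index are copied from the example.

```shell
#!/usr/bin/env bash
# Hypothetical helpers that reproduce the names shown in the example.

# Lowercase the name and drop every character outside a-z and 0-9.
normalize_name() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -cd 'a-z0-9'
}

# image_tag USER DIR      -> <normalized dir>:<user>
image_tag() {
    printf '%s:%s\n' "$(normalize_name "$2")" "$1"
}

# container_name USER DIR -> <user>_<normalized dir>_dev_1
container_name() {
    printf '%s_%s_dev_1\n' "$1" "$(normalize_name "$2")"
}

image_tag nounique _GIT_REPO        # prints: gitrepo:nounique
container_name nounique _GIT_REPO   # prints: nounique_gitrepo_dev_1
```

Including the user ID in the name keeps containers from different users on a shared machine from colliding, which fits the one-container-per-project behavior described earlier.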

Installation (Docker on host)

  1. Remove the default docker packages (old versions)

    sudo apt-get purge docker \
                       docker-engine \
                       docker.io \
                       lxc-docker 
  2. Install the packages required for installing docker

    sudo apt-get install curl \
                         apt-transport-https \
                         ca-certificates \
                         software-properties-common
  3. Import the docker GPG key to verify package signatures

    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
    sudo apt-key fingerprint 0EBFCD88
  4. Add docker repository

    sudo add-apt-repository \
         "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
         $(lsb_release -cs) stable"
    sudo apt-get update
  5. Install docker(docker-ce) and docker-compose

    sudo apt-get install docker-ce \
                         docker-ce-cli \
                         containerd.io \
                         docker-compose

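    Add the current user to the docker group so that docker can be run without sudo (membership in this group is effectively root-level access)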

    sudo usermod -aG docker $USER
  6. Install nvidia-docker2

    Remove nvidia-docker 1.0

    docker volume ls -q -f driver=nvidia-docker | \
    xargs -r -I{} -n1 docker ps -q -a -f volume={} | \
    xargs -r docker rm -f
    sudo apt-get purge nvidia-docker

    Add nvidia-docker repository

    curl -sL https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
    dist=$(. /etc/os-release;echo $ID$VERSION_ID)
    curl -sL https://nvidia.github.io/nvidia-docker/$dist/nvidia-docker.list | \
    sudo tee /etc/apt/sources.list.d/nvidia-docker.list
    sudo apt-get update

    Install nvidia-docker2

    sudo apt-get install nvidia-docker2
    sudo pkill -SIGHUP dockerd

    Set nvidia-docker as default-runtime of docker

    sudo vi /etc/docker/daemon.json
    {
        "default-runtime": "nvidia",
        "runtimes": {
            "nvidia": {
                "path": "nvidia-container-runtime",
                "runtimeArgs": []
            }
        }
    }
    sudo systemctl restart docker
    sudo chown "$USER":"$USER" /home/"$USER"/.docker -R
    sudo chmod g+rwx "/home/$USER/.docker" -R

    Log out and log back in for the changes to take effect.

    Test nvidia-docker

    docker run --rm -it nvidia/cuda nvidia-smi
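Before running the GPU test, two of the settings from the steps above can be checked locally. This is a rough sketch with hypothetical helper names (`has_group`, `default_runtime_is_nvidia`); the runtime check is a plain-text match on `/etc/docker/daemon.json`, not a JSON parser, so it assumes the file is formatted roughly as shown in step 6.

```shell
#!/usr/bin/env bash
# has_group "GROUP LIST" GROUP: succeeds when GROUP is a whole word in the list.
has_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# default_runtime_is_nvidia FILE: succeeds when FILE sets "default-runtime" to "nvidia".
# A missing or unreadable file counts as "not set".
default_runtime_is_nvidia() {
    grep -Eq '"default-runtime"[[:space:]]*:[[:space:]]*"nvidia"' "$1" 2>/dev/null
}

# `id -nG` lists the groups of the current session, so this only succeeds
# after logging out and back in following the usermod step.
if has_group "$(id -nG)" docker; then
    echo "docker group: active in this session"
else
    echo "docker group: not active yet (log out and back in)"
fi

if default_runtime_is_nvidia /etc/docker/daemon.json; then
    echo "daemon.json: nvidia is the default runtime"
else
    echo "daemon.json: nvidia is not set as the default runtime"
fi
```

If both checks pass and the daemon has been restarted, `docker info --format '{{.DefaultRuntime}}'` should also report the runtime the daemon is actually using.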
