DockerFile for PyTorch? #745

Closed
jzf2101 opened this issue Nov 1, 2018 · 11 comments
Labels
type:Enhancement A proposed enhancement to the docker images

Comments

@jzf2101
Contributor

jzf2101 commented Nov 1, 2018

@parente: @koustuvsinha and I are looking at jupyterhub/zero-to-jupyterhub-k8s#994 and @consideRatio's Dockerfile, and we need to tweak both to get a PyTorch image. Perhaps we can also add this here when it's done?

@jzf2101 jzf2101 changed the title DockerFile fro PyTorch? DockerFile for PyTorch? Nov 1, 2018
@jzf2101 jzf2101 closed this as completed Nov 1, 2018
@jzf2101 jzf2101 reopened this Nov 1, 2018
@jzf2101 jzf2101 added the type:Enhancement A proposed enhancement to the docker images label Nov 1, 2018
@parente
Member

parente commented Nov 1, 2018

Hi @jzf2101. Over time we've been trying to build, test, and maintain fewer images in this one uber-repo, and encouraging more links to images that are based on the ones here but managed elsewhere. Ways to go about it (ordered by my preference, FWIW) would be:

  1. Start a new repo / image based on the info here: https://jupyter-docker-stacks.readthedocs.io/en/latest/contributing/stacks.html
  2. Repurpose the jupyter/tensorflow-notebook to be jupyter/deeplearning-notebook and add it there.
  3. Update the automation here and on Docker Cloud to build and maintain a new jupyter/pytorch-notebook image in this repo.

@jzf2101
Contributor Author

jzf2101 commented Nov 3, 2018

I'm fine with option 1 as long as we presume this doesn't imply we're providing preferential support for TF over PyTorch. Both seem quite popular.

@consideRatio
Collaborator

@jzf2101 I added two bullet points to the issue I posted previously:

@parente thanks for the input on this!


@jzf2101 I could help out on option 1!

@parente
Member

parente commented Nov 3, 2018

> I'm fine with option 1 as long as we presume this doesn't imply we're providing preferential support for TF over PyTorch. Both seem quite popular.

Agreed. The tensorflow-notebook image predates our push to include fewer images in this single repo. Perhaps we should consider moving it to its own repo too. It's of limited use because it is a CPU-only image.

On that topic: A handful of users have requested that we update it (tensorflow-notebook) to include the CUDA drivers. After consulting with the Jupyter Steering Council, we decided that we did not want to introduce commercial software into the images we maintain. See #516 for a brief history. I don't know how that affects plans for PyTorch.

@consideRatio
Collaborator

When you write conda install tensorflow-gpu, CUDA is installed automatically as a dependency through cudatoolkit, I think. I manually pinned it while installing tensorflow-gpu, though, to avoid getting the latest version, as that is coupled with a CUDA version that is too new and requires more modern graphics cards than the ones I was using on Google Cloud (K80, P100).

Does this mean the CUDA licence has changed in some way, as it is available simply enough with a conda install command? If not, I'm wondering if we are keeping track of all kinds of things installed as dependencies through conda install?

I suppressed confirmations etc. during the conda install with --quiet --yes, though, so I may have accepted something without knowing it. Hmmm...
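
For reference, the pinned install described above might look roughly like the following sketch. The cudatoolkit version shown is an illustrative placeholder, not necessarily the version that was actually pinned, and the RUN_INSTALL switch is a hypothetical convenience so that the command is only printed by default.

```shell
set -eu

# Assumption: placeholder version; choose one that matches your GPU driver.
CUDATOOLKIT_VERSION="${CUDATOOLKIT_VERSION:-9.0}"

# Pin cudatoolkit alongside tensorflow-gpu so conda does not resolve to the
# newest CUDA build. --quiet and --yes suppress output and confirmation prompts.
CONDA_ARGS="install --quiet --yes tensorflow-gpu cudatoolkit=${CUDATOOLKIT_VERSION}"

if [ "${RUN_INSTALL:-0}" = "1" ]; then
  # Word splitting of CONDA_ARGS is intentional here.
  conda $CONDA_ARGS
else
  # Dry run: only show the command that would be executed.
  echo "conda $CONDA_ARGS"
fi
```

Dropping --yes makes conda show the full list of packages it plans to install, cudatoolkit included, and ask for confirmation before proceeding.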

@parente
Member

parente commented Nov 6, 2018

> Does this mean the CUDA licence has changed in some way, as it is available simply enough with a conda install command?

It does look like the EULA has changed since the last time I looked: https://docs.nvidia.com/cuda/eula/index.html#distribution-requirements

IANAL so I won't attempt to interpret the changes. That said, the original decision was about not including non-open source software in the images here, not the details of the CUDA license in particular.

> If not, I'm wondering if we are keeping track of all kinds of things installed as dependencies through conda install?

We purposely installed tensorflow but not tensorflow-gpu to avoid pulling the CUDA lib when we originally set up the image definition.

https://github.com/jupyter/docker-stacks/blob/master/tensorflow-notebook/Dockerfile#L10

But that raises the valid question of what else we might be pulling in.

@ericdill
Contributor

ericdill commented Nov 6, 2018

@consideRatio @parente -- Here is my understanding, from our open source team at Anaconda, regarding CUDA, NVIDIA, and redistribution.

• EULA lists basically all the shared libraries as redistributable, but not compiler, debugger, or profiler
• cuDNN is not part of CUDA toolkit and is not technically redistributable
• NVIDIA has an exception clause for Docker images based on their images / Dockerfiles. It's not totally unambiguous what "based on" means, but the rough idea seems to be that you can start from their docker images, layer in new things and still be able to redistribute them.

There's also a nice discussion over on conda-forge here

@jzf2101
Contributor Author

jzf2101 commented Nov 24, 2018

@alexbw followed up on this issue, looking at the underlying conda build for PyTorch: conda-forge/conda-forge.github.io#63 (comment). @scopatz, are you saying that PyTorch likely has a license that allows people to use these conda packages to install CUDA with PyTorch wherever?

@scopatz

scopatz commented Nov 28, 2018

Nope, users are not allowed to install cudatoolkit anywhere that is redistributed.

@beniz beniz mentioned this issue Feb 16, 2019
@beniz

beniz commented Feb 16, 2019

See here for a PyTorch GPU recipe: #706 (comment)

The TensorFlow + PyTorch recipe is here: https://github.com/jolibrain/docker-stacks/tree/master/jupyter-dd-notebook-gpu

@parente
Member

parente commented Mar 31, 2019

This issue has been idle for some time now so I'm going to close it. If anyone would like to submit a PR to the recipes or community images doc pages linking to information about GPU-enabled images, I'll be happy to review it. For now, the decision stands that we do not wish to deal with the distribution of non-open source software in these images.

I'm not sure if it's been said yet, so I'll point out that it might be much easier to direct users to images like https://hub.docker.com/r/pytorch/pytorch and https://hub.docker.com/r/tensorflow/tensorflow/ and help users add Jupyter components to them instead of working the other way around (i.e., starting with Jupyter images and adding deep learning frameworks with GPU acceleration to them). I'd be fine with the recipes page including such instructions if someone wants to figure out that approach and send a PR.
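
A minimal sketch of that reverse approach, assuming the upstream pytorch/pytorch image ships a working pip and runs as root (both are assumptions, not verified here). The script writes a small Dockerfile that layers the Jupyter Notebook server on top of the PyTorch image. The BUILD_DIR location and the pytorch-jupyter tag are hypothetical names chosen for illustration.

```shell
set -eu

# Hypothetical build directory for this sketch.
BUILD_DIR="${BUILD_DIR:-./pytorch-jupyter}"
mkdir -p "$BUILD_DIR"

# Write a Dockerfile that starts from the deep learning image and adds Jupyter,
# instead of starting from a Jupyter image and adding the framework.
cat > "$BUILD_DIR/Dockerfile" <<'EOF'
# Upstream PyTorch image; pin a specific tag in real use.
FROM pytorch/pytorch

# Add the Jupyter Notebook server on top of the existing Python environment.
RUN pip install --no-cache-dir notebook

EXPOSE 8888

# Assumption: the base image runs as root, hence --allow-root.
CMD ["jupyter", "notebook", "--ip=0.0.0.0", "--port=8888", "--no-browser", "--allow-root"]
EOF

echo "Wrote $BUILD_DIR/Dockerfile"
echo "Build with: docker build -t pytorch-jupyter $BUILD_DIR"
```

Because the CUDA libraries come from the upstream image, the resulting image stays outside the set of images maintained in this repo, which is consistent with the decision above.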

@parente parente closed this as completed Mar 31, 2019

6 participants