-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add jupyter/pytorch-notebook #1936
Conversation
We should consider alternatives here, for example, to create |
It's possible to install pytorch in/on top of the existing tensorflow image, but this makes this image extra large. I don't think many users will use PyTorch and Tensorflow at the same time hence why I created an extra image instead. |
The CI builds show the following image sizes: scipy-notebook: 4.06GB So for an unified tensorflow+pytorch notebook we can expect the image to be around 6.43GB. I leave it up to the community to decide what the better option is. |
@consideRatio @minrk how do you feel about this? |
I think its critical that this project has a policy on whats to be considered in scope for the project. This opinion has evolved based on another project. In jupyterhub/oauthenticator we said all kinds of new identity provider specific jupyterhub authenticator classes was accepted, so the project grew and grew but ended up unsustainable to maintain. At some point we decided to stop accepting new authenticators and removed some that could migrate to another good option easily. I suggest a policy discussion is started in a dedicated issue. I'm personally conservative, looking to ensure that what we have is very clearly sustainable to maintain and only opting to maintain anything considered providing sufficient value for users. I'm on vacation since monday this week, I'm back in two weeks! |
I'm also on vacation for the rest of the month, but don't have a good sense of how to draw lines, and that's made it extremely hard to maintain this repo. This is a great PR, and the case for transitioning from tensorflow to torch is sensible, but I'll defer to @mathbunnyru who's done most of the heavy maintenance lifting here. I do think that any added image should need to at least come with a commitment to maintain it, not adding work to already overloaded maintenance. I do expect as soon as we have a tensorflow-cpu image, folks will want a tensorflow-gpu image, etc. |
@consideRatio @minrk have a nice vacation ❤️ Thanks for your input, I will try to think how we can proceed here.
I think you mean |
I'm sorry for creating too many commits - I've just move all images to Even if we don't merge this PR (we still have no solution for this problem), we will have an example of a good PR how one can introduce a new image. |
We now have a policy for merging new images and adding new packages 🎉 @mathbunnyru, @consideRatio, @yuvipanda, and @manics, please vote 👍 to accept this change and 👎 not to accept it (use a reaction to this message) We can have a discussion until the deadline, so please express your opinions. |
I haven't reviewed the implementation in the PR but given that there's a Tensorflow image I think it's reasonable to add a Pytorch one if someone's willing to maintain it. However my preferred option is as @mathbunnyru suggested to have a multi-framework image, ideally replacing the tensorflow image.. The datascience image has both R and Python which are unlikely to be used simultaneously so having multiple ML frameworks in an single image seems reasonable. It'll significantly increase the image size, but the datascience packages and ML frameworks are already quite large, and I think the benefits of not adding yet another image are worth the increase in size. I'm 👍 on a multi-framework image, and neutral on a pytorch specific image. |
For me, deciding on this is a matter of prioritization of available maintenance capacity. I'm a bit concerned by having any image dedicated to any library due to the maintenance load it could lead to long term. At the same time, the libraries related to GPU like tensorflow, keras, pytorch etc has from what i recall been a hassle to install - and to install all at the same time may be too hard if we would aim for an all-in-one image, making maintenance harder and not easier. Also, of course a larger image size. This is not an easy call, i dont know =\ I think maybe images should be in an alpha/beta stage of maintenance before they are comitted to long term? If they turn out too complicated to maintain, they could be dropped etc? |
Well, I think combining our ideas, I suggest we do the following:
Does it sound ok? |
@consideRatio does the previous suggestion sound ok to you? |
@twalcari may I ask you to update this PR, which will add |
@twalcari thank you for your contribution! I'm sorry it took so long, but we established a voting group and some processes during this PR, so it should be significantly better next time we're asked to do something similar. Also, I updated lots of workflows/actions during this PR (I am done for now). I hope my merges of |
🎉! I completely understand that this took some time to figure out. I'm glad that this helped formalize the process for accepting new notebooks. I think new notebooks should only be allowed here very sparingly but PyTorch was one of the exceptions 🙂 In my opinion, only 'ecosystems' as well established as R / Julia / PyTorch / TensorFlow should be considered. Also many thanks to you for keeping this PR up-to-date throughout the process! |
Describe your changes
As PyTorch has grown more popular than TensorFlow over the last few years, it seems odd that we have no official PyTorch image.
cfr. Google Trends of tensorflow (blue line) vs. pytorch (red line):
I've been supplying a PyTorch image to my researchers at Ghent University, which has proven to been very popular. I'm now proposing to upstream this image (be it the CPU-only version instead of the GPU-enabled version at Ghent University).
Note that the installation is done with
pip
instead ofconda
: when using the latter conflicts arise with thenomkl
package that is installed.I've based myself on #1926 for this PR so that I hopefully have not missed any location where I needed to add changes.
I have not been able to test this PR against the aarch64 architecture.
Checklist (especially for first-time contributors)