Transition to Pytorch 2.0 #1229

Closed
ourownstory opened this issue Mar 22, 2023 · 11 comments

@ourownstory
Owner

No description provided.

@ourownstory ourownstory added this to V1.0 Mar 22, 2023
@ourownstory ourownstory converted this from a draft issue Mar 22, 2023
@ourownstory ourownstory added this to the Release 0.5.4 milestone Mar 22, 2023
@Kevin-Chen0
Collaborator

Should NP be transitioning to Pytorch Lightning 2.0?

https://lightning.ai/pages/blog/training-compiled-pytorch-2.0-with-pytorch-lightning

@noxan
Collaborator

noxan commented Apr 4, 2023

It is a bit tricky to get a good overview as one issue leads to another.
We definitely have to adjust for the trainer and auto tuner separation.
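
For context, the separation mentioned here most likely refers to Lightning 2.0 moving learning-rate finding out of the `Trainer` and into a standalone `Tuner` class. A minimal sketch of the 2.0-style call, assuming the `pytorch_lightning` package at version 2.0 or later; the helper name `suggest_learning_rate` and its arguments are hypothetical and are not NeuralProphet code:

```python
def suggest_learning_rate(trainer, model, train_loader):
    """Run the learning-rate finder the Lightning 2.0 way (hypothetical helper)."""
    # Imported lazily so this sketch can be loaded without Lightning installed.
    from pytorch_lightning.tuner import Tuner

    # Lightning 1.x: Trainer(auto_lr_find=True) followed by trainer.tune(model).
    # Lightning 2.0: the tuner is a separate object that wraps the trainer.
    tuner = Tuner(trainer)
    lr_finder = tuner.lr_find(model, train_dataloaders=train_loader)

    # lr_find may return None, so guard before asking for a suggestion.
    return lr_finder.suggestion() if lr_finder is not None else None
```

Code that previously relied on `auto_lr_find` would need to call something like this explicitly before `trainer.fit(...)`.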

@noxan
Collaborator

noxan commented Apr 4, 2023

There is a migration guide for lightning: https://lightning.ai/docs/pytorch/latest/upgrade/from_1_9.html

@noxan noxan removed their assignment Apr 7, 2023
@cooperability
Collaborator

I'm helping with the torch 2.0 transition! Just collecting a few useful resources here so I can assign myself to this issue:
#1202
#1240

@hxyue1
Collaborator

hxyue1 commented Jun 16, 2023

Using poetry install with torch = "^2.0.0" will by default install the GPU version. On machines that do not have a GPU available, this leads to an immediate crash upon importing torch:

Python 3.8.16 (default, Jun 12 2023, 18:09:05)
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
Traceback (most recent call last):
  File "/home/hxyue1/anaconda3/envs/torch2/lib/python3.8/site-packages/torch/__init__.py", line 168, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/home/hxyue1/anaconda3/envs/torch2/lib/python3.8/ctypes/__init__.py", line 373, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcurand.so.10: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/hxyue1/anaconda3/envs/torch2/lib/python3.8/site-packages/torch/__init__.py", line 228, in <module>
    _load_global_deps()
  File "/home/hxyue1/anaconda3/envs/torch2/lib/python3.8/site-packages/torch/__init__.py", line 189, in _load_global_deps
    _preload_cuda_deps(lib_folder, lib_name)
  File "/home/hxyue1/anaconda3/envs/torch2/lib/python3.8/site-packages/torch/__init__.py", line 154, in _preload_cuda_deps
    raise ValueError(f"{lib_name} not found in the system path {sys.path}")
ValueError: libcublas.so.*[0-9] not found in the system path ['', '/home/hxyue1/anaconda3/envs/torch2/lib/python38.zip', '/home/hxyue1/anaconda3/envs/torch2/lib/python3.8', '/home/hxyue1/anaconda3/envs/torch2/lib/python3.8/lib-dynload', '/home/hxyue1/anaconda3/envs/torch2/lib/python3.8/site-packages', '/home/hxyue1/Documents/np/neural_prophet']

This can be circumvented by installing the CPU-only version:

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu

As far as I know, there is currently no Poetry-native way to choose which build gets installed. Other users have run into this as well:
python-poetry/poetry#6409
https://stackoverflow.com/questions/59158044/poetry-and-pytorch

This is not an issue for PyTorch versions before 2.0, however: although the GPU version appears to be installed there too, it still runs fine on machines without CUDA.
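
To check which situation a given machine is in, something like the following could be run in the environment. This is only a sketch: the function name `describe_torch_install` is made up for illustration, and the caught exception types are taken from the traceback above (an `OSError` from `ctypes`, then a `ValueError`).

```python
import importlib.util


def describe_torch_install():
    """Report whether torch is importable and which build is installed."""
    if importlib.util.find_spec("torch") is None:
        return {"status": "not installed"}
    try:
        import torch
    except (ImportError, OSError, ValueError) as err:
        # The traceback above ends in ValueError after an OSError from ctypes,
        # so both are caught here alongside the usual ImportError.
        return {"status": "import failed", "error": f"{type(err).__name__}: {err}"}
    return {
        "status": "ok",
        "version": torch.__version__,
        # torch.version.cuda is None for CPU-only builds.
        "cuda_build": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(describe_torch_install())
```

A result with status "import failed" on a machine without a GPU would match the crash described in this comment.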

@hxyue1
Collaborator

hxyue1 commented Jun 20, 2023

I've done some further testing, but unfortunately it does not seem possible to reliably install the CPU version of pytorch using poetry for versions 2.0.0 and above.

However, if we're willing to use pip on top of poetry, installing the CPU version of torch is pretty straightforward with the command pip3 install torch --index-url https://download.pytorch.org/whl/cpu. I've run this installation process multiple times and the tests will pass without fail.

The downside is that this makes the installation process more complicated than it was with torch v1. Feedback and ideas would be welcome!
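
If the extra pip step ends up in a setup or CI script, it could be wrapped roughly like this. This is a sketch only: the function names are hypothetical, and it assumes the script runs inside the environment that Poetry has already populated, so that pip replaces the torch wheel there.

```python
import subprocess
import sys

# CPU wheel index, as given in the pip command above.
CPU_INDEX_URL = "https://download.pytorch.org/whl/cpu"


def cpu_torch_install_command(python=sys.executable):
    """Build the pip command that installs torch from the CPU wheel index."""
    return [python, "-m", "pip", "install", "torch", "--index-url", CPU_INDEX_URL]


def install_cpu_torch():
    """Run the command in the current environment; raises if pip fails."""
    subprocess.run(cpu_torch_install_command(), check=True)
```

Using `sys.executable -m pip` rather than a bare `pip3` is a design choice to make sure the install targets the interpreter that is running the script.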

@noxan
Collaborator

noxan commented Aug 16, 2023

@hxyue1 thanks for flagging this. I was already a bit confused about why you opted for the CPU version in your pull request.

You are always free to choose which package manager you use, so if pip works, that's great.

I would not force the CPU version as a dependency for our package, as others might have GPU support and prefer to use it.

I am on macOS with an Intel chip and no accelerator, so it works for me. How bad is it on other systems? I assume the problem only arises if you use Poetry yourself, right? Otherwise I'd propose we migrate to PyTorch 2.0 and reference this issue in the release notes (maybe it will already be resolved by the time we release our 1.0). Happy to hear your thoughts.

@hxyue1
Collaborator

hxyue1 commented Aug 21, 2023

@noxan as far as I'm aware, the CI tests use Poetry, which by default installs the GPU version on Ubuntu. This unfortunately causes the tests to fail because the GitHub runners are CPU-only (as far as I know), which is why I attempted to force a CPU-only install for the CI tests as a workaround.

Would it be possible to change the GitHub Actions to use pip instead of Poetry?

@noxan
Collaborator

noxan commented Aug 21, 2023

@hxyue1 Ah, I see you were talking about CI. On my pull request the tests do not fail on Ubuntu 🤔

@hxyue1
Collaborator

hxyue1 commented Aug 21, 2023

Great, that's fantastic! I have no idea what I'm doing wrong then...

@leoniewgnr
Collaborator

done in #1404

@github-project-automation github-project-automation bot moved this from Simplify & Clean to Done in V1.0 Sep 11, 2023