Install torch wheel from pytorch to unblock TPU CI #6691
Conversation
Is there a way to test the change before it is merged? |
Actions > TPU Integration Test > Run workflow > then select your branch, and it'll start the run from your edited workflow FYI I opened a PR for this: #6690, which failed: https://github.com/pytorch/xla/actions/runs/8193035259 |
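For reference, the manual "Run workflow" button described above only appears when the workflow declares a `workflow_dispatch` trigger. A sketch of the relevant fragment (the file name and trigger list are assumed, not taken from the actual workflow):

```yaml
# Hypothetical fragment of the TPU Integration Test workflow file.
# The "Run workflow" button in the Actions tab only shows up when
# workflow_dispatch is declared as a trigger.
on:
  workflow_dispatch:   # enables manual runs against any branch
  push:
    branches: [master]

# The same run can be started from the CLI:
#   gh workflow run "TPU Integration Test" --ref <your-branch>
```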
Building torchvision from src is working https://github.com/pytorch/xla/actions/runs/8195031279: we don't have the |
Per offline discussion, compiling torchvision from src is less ideal; it's better to stick with the original plan |
Current failure:
|
The torch wheel has the ABI disabled. I think we enabled it when building torch_xla |
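For context, torch and torch_xla must agree on the CXX11 ABI setting to link correctly, and recent torch builds expose which ABI they were compiled with. A small sketch for checking it (guarded so it also runs where torch isn't installed):

```python
def describe_torch_abi():
    """Report which C++ ABI the installed torch wheel was built with."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    # compiled_with_cxx11_abi() reflects _GLIBCXX_USE_CXX11_ABI at build
    # time; a mismatch between torch and torch_xla typically surfaces as
    # undefined-symbol errors at import.
    if torch.compiled_with_cxx11_abi():
        return "CXX11 ABI enabled"
    return "CXX11 ABI disabled (pre-CXX11)"

print(describe_torch_abi())
```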
Ok, some progress, we can import torch_xla atm. Now, it fails with:
after I rebase onto origin/master. It's because the companion upstream pr pytorch/pytorch#115621 was just reverted. But all others should be good. |
This lgtm as a temporary fix if it's working. Please update to the CPU wheels before merging.
I'm interested to see if #6704 works as a more stable solution.
Why CPU wheels? The old TPU CI uses the CUDA wheel and our GitHub index page also suggests installing the CUDA torch wheel. So shouldn't our CI be consistent? |
The CPU wheel is much smaller and will download more quickly. The other TPU CI should also be using CPU wheels. Our release builds will get tested against the final upstream CUDA wheel. |
If our GitHub index page suggests that our users use the CUDA torch wheel, should we do the same? Or does it not matter much? |
These are just nightly builds, so in my mind it doesn't matter. |
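The outcome of this thread, sketched as a CI step (the step name is assumed; the index URL is PyTorch's official nightly CPU wheel index):

```yaml
# Hypothetical workflow step: install the prebuilt nightly CPU torch wheel
# instead of the CUDA one -- a much smaller download, same Python-level API.
- name: Install nightly torch (CPU)
  run: |
    pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cpu
```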
Pending a new TPU CI run. Stay tuned... |
https://github.com/pytorch/xla/actions/runs/8240857381 Looks good |
The new TPU CI passed. Thanks for the review. |
#6681 helps uncover the silent failures in the
TPU Integration Test / tpu-test (push)
. The failure is due to the official torchvision wheel not being compatible with the torch wheel built by us.
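A minimal smoke test (a sketch, not the actual CI check) that would surface this kind of incompatibility loudly instead of silently: a torchvision wheel built against a different torch binary usually fails at import with an undefined-symbol ImportError or an operator-registration RuntimeError, rather than at call time.

```python
def torchvision_import_status():
    """Try importing torchvision; report why it failed if it did."""
    try:
        import torchvision  # noqa: F401  (fails here on ABI/version mismatch)
        return "ok"
    except Exception as exc:
        # e.g. ImportError: undefined symbol ... or a RuntimeError from
        # torchvision's custom-operator registration against mismatched torch.
        return f"failed: {type(exc).__name__}: {exc}"

print(torchvision_import_status())
```

Running this as an early, dedicated CI step (and failing the job on anything other than "ok") keeps the breakage visible instead of letting downstream tests fail in confusing ways.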