add/fix patches for PyTorch 1.13.1 w/ foss/2022a #18371
Test report by @Flamefire |
Test report by @Flamefire |
@boegelbot please test @ generoso |
@boegel: Request for testing this PR well received on login1 PR test command '
Test results coming soon (I hope)... - notification for comment with ID 1659852530 processed Message to humans: this is just bookkeeping information for me, |
@boegelbot please test @ jsc-zen2 |
@boegel: Request for testing this PR well received on jsczen2l1.int.jsc-zen2.easybuild-test.cluster PR test command '
Test results coming soon (I hope)... - notification for comment with ID 1660003521 processed Message to humans: this is just bookkeeping information for me, |
Test report by @boegelbot |
Test report by @boegelbot |
Test report by @boegel |
@boegel Could you please attach the log, as your build seems to be the only failing one? Quick guess: Have you reinstalled GCC and OpenBLAS with #16411? I'm a bit confused because it seems to have worked for you in #17155, and the patches only relax stuff without the possibility to introduce failures as far as I can tell. |
@Flamefire The Here's the relevant part of the log for the failing
I'm happy to merge this and follow up on this problem in another PR (one step at a time, especially with PyTorch). |
Sounds like a good idea. Especially now that I've found yet another issue: Currently testing the updated patch, and it likely makes sense to add them to all ECs in a single PR, especially as that only affects POWER. |
Test report by @SebastianAchilles |
@SebastianAchilles Does your GCCcore module include the |
Merging this as-is since it's a step forward; more fixes can be done in a follow-up PR |
Going in, thanks @Flamefire! |
Yes, the installed GCCcore includes |
Strange. I also ran this multiple times without issues, both completely and only this test. Might want to keep that in mind for later. If you still have the log, can you post the exact name of the subtest in |
@casparvl You download to |
Unfortunately I didn't have the logs anymore. I rebuilt this MR 5 more times on my machines, and the test was successful each time. |
(created using eb --new-pr)
- test_quantization failures have the same cause as those in test_ao_sparsity
- test_ops failure persisted as the forward-port of the patch doesn't work; added the commit from my upstream PR instead
- *-fix-fsdp-fp16-test.patch was outdated due to changes upstream. Although it applied, it didn't have the expected effect