Support hipGraph usage in PyTorch #40
Merged
Conversation
The tensor version is now added with suffix _tensor.
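The reason graph capture needs tensor rather than scalar Philox arguments is that scalar kernel arguments are frozen into the captured graph, while tensor arguments are dereferenced at replay time. A minimal pure-Python sketch of that distinction (the names `PhiloxState`, `capture_scalar_launch`, and `capture_tensor_launch` are hypothetical, for illustration only):

```python
class PhiloxState:
    """Hypothetical stand-in for a 0-dim tensor holding the Philox offset."""
    def __init__(self, offset):
        self.offset = offset

def capture_scalar_launch(offset):
    # A scalar argument is baked into the captured graph:
    # every replay sees the value from capture time.
    return lambda: offset

def capture_tensor_launch(state):
    # A tensor argument is read from memory at replay time,
    # so updating it in place changes the next replay.
    return lambda: state.offset

state = PhiloxState(0)
scalar_replay = capture_scalar_launch(state.offset)
tensor_replay = capture_tensor_launch(state)

state.offset = 128  # advance the RNG offset between replays
print(scalar_replay())  # still 0: frozen at capture
print(tensor_replay())  # 128: reflects the update
```

This mirrors why hipGraph/CUDA-graph replay requires the RNG seed and offset to live in device memory rather than in the launch arguments.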
SQLite does not have a dedicated boolean type, so boolean-valued performance options are stored as 0/1 in the database. Therefore -DAOTRITON_BUILD_FOR_TUNING=OFF will use slightly different options than -DAOTRITON_BUILD_FOR_TUNING=ON, and this slight difference has been observed to affect the build time and may cause Triton kernel compilation to time out in certain cases.
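The 0/1 storage behavior can be verified directly: even when a column is declared BOOLEAN, SQLite stores the value with INTEGER affinity. A quick check with Python's sqlite3 (the table and column names are made up for this demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE opts (name TEXT, enabled BOOLEAN)")
# SQLite has no real BOOLEAN type; the declared type only sets affinity,
# and Python's True is stored as the integer 1.
conn.execute("INSERT INTO opts VALUES (?, ?)", ("BUILD_FOR_TUNING", True))
value, typ = conn.execute("SELECT enabled, typeof(enabled) FROM opts").fetchone()
print(value, typ)  # 1 integer
```

Any consumer of the tuning database therefore sees 0/1 integers, not true booleans, which is the source of the option mismatch described above.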
…et1. This is to match the API used by PyTorch. TODO: make philox_offset1 optional, i.e., passing nullptr means it's zero.
xinyazhang changed the title from "Change Philox arguments from scalar to tensor" to "Support hipGraph usage in PyTorch" on Aug 15, 2024
Thankfully the PyTorch validation is done before merging the PR. For hipGraph support, PyTorch asks for much more than it really needs.
xinyazhang force-pushed the xinyazhang/tensor_philox branch from 50146cc to 0033481 on August 15, 2024 at 21:22
xinyazhang force-pushed the xinyazhang/tensor_philox branch from 0033481 to 1d04e6a on August 15, 2024 at 21:23
…licts with global cache
1. Profile the kernel unconditionally.
2. Use PyTorch 2.5's UT scheme, i.e., only use atol and ignore rtol.
3. Record the Lmax error.
4. Record the estimated fudge factor that can pass the UT.
The output JSON can then be used to justify the new fudge factors for UTs.
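The steps above can be sketched in a few lines of Python. This is not the tool's actual implementation; the function name, the base tolerance, and the JSON field names are assumptions made for illustration:

```python
import json
import math

def estimate_fudge_factor(out, ref, base_atol=1e-5):
    """Compare a kernel output against the reference using atol only
    (rtol is ignored, per the PyTorch 2.5 UT scheme described above)."""
    # Lmax error: the largest absolute elementwise deviation.
    lmax = max(abs(a - b) for a, b in zip(out, ref))
    # Smallest integer multiplier of the base atol that lets the UT pass.
    fudge = math.ceil(lmax / base_atol) if lmax > 0 else 1
    return {"lmax_error": lmax, "estimated_fudge_factor": fudge}

ref = [1.0, 2.0, 3.0]
out = [1.0, 2.0 + 3e-5, 3.0]
print(json.dumps(estimate_fudge_factor(out, ref)))
```

Recording both the Lmax error and the implied multiplier makes it easy to justify a new fudge factor from data rather than picking one by trial and error.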
… for MI200/300X. Trying to strike a balance between precision and performance with --fudge_factor_tolerance. See table_tool --help for more detailed documentation.
Testing result of 82e19a8 on MI300X. Mostly FP32 problems (click to expand).
… found at least two devices, cuda:0 and cpu!"
69aa5c6 on MI300X
Users can turn off FP32 support.
groenenboomj approved these changes on Aug 22, 2024
Function meets PyTorch reference and tests are valid.
This simplifies hipGraph support for AOTriton.
This is a draft and shall be validated with PyTorch integration.