Support fp8e5m2 dtype #7740

Merged
lsy323 merged 4 commits into master from lsiyuan/add-fp8e5m2 on Aug 6, 2024
Conversation

lsy323
Collaborator

@lsy323 lsy323 commented Jul 25, 2024

Add support for the torch.float8_e5m2 dtype.

Right now torch_xla/csrc/tensor_util.cpp contains a lot of duplicated code; in the next PR I'll refactor it to make it cleaner.

The other fp8 variants will be added in follow-up PRs.

Test:
Added an fp8 unit test.
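
For illustration, here is a minimal sketch (not the unit test added in this PR) of what exercising the new dtype might look like on an XLA device. The shapes and names are made up, and it assumes a torch_xla build that includes this change and a backend that can lower matmul with fp8 inputs.

```python
# Hedged sketch, not the PR's actual test: move float8_e5m2 tensors to an
# XLA device and run a matmul, assuming a torch_xla build with this change.
import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()

# float8_e5m2: 1 sign bit, 5 exponent bits, 2 mantissa bits.
x = torch.randn(8, 8).to(torch.float8_e5m2).to(device)
w = torch.randn(8, 8).to(torch.float8_e5m2).to(device)

out = torch.matmul(x, w)

# Upcast before moving back to CPU for inspection, since CPU op coverage
# for fp8 is limited.
print(out.to(torch.float32).cpu())
```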

@lsy323 lsy323 requested a review from miladm July 25, 2024 00:06
@lsy323 lsy323 assigned JackCaoG and lsy323 and unassigned JackCaoG Jul 25, 2024
@lsy323 lsy323 requested a review from JackCaoG July 25, 2024 00:08
@lsy323 lsy323 added the fp8 label Jul 25, 2024
@miladm
Collaborator

miladm commented Jul 25, 2024

  • Please include documentation for FP8, with examples for users. Feel free to enrich the document as you add more FP8 capabilities/variants.

  • How can we add a small micro-benchmark to the documentation to illustrate the benefits of FP8 vs. the baseline BF16? (A rough sketch follows after this list.)

  • To my knowledge, PT AMP does not support FP8; correct?
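
As a rough illustration of the micro-benchmark suggested above (not part of this PR or its documentation): the sketch below times a large matmul in BF16 versus float8_e5m2 on an XLA device. It assumes the backend lowers fp8 matmul; absolute numbers will depend heavily on hardware and shapes.

```python
# Rough micro-benchmark sketch (not from this PR): BF16 vs. float8_e5m2
# matmul latency on an XLA device. Assumes fp8 matmul is supported by the
# backend; results vary by hardware and problem size.
import time
import torch
import torch_xla.core.xla_model as xm

def time_matmul(dtype, n=4096, iters=10):
    device = xm.xla_device()
    a = torch.randn(n, n).to(dtype).to(device)
    b = torch.randn(n, n).to(dtype).to(device)
    xm.mark_step()
    xm.wait_device_ops()  # materialize inputs before starting the timer
    start = time.perf_counter()
    for _ in range(iters):
        c = torch.matmul(a, b)
        xm.mark_step()    # cut the graph so each iteration actually executes
    xm.wait_device_ops()  # wait for all queued executions to finish
    return (time.perf_counter() - start) / iters

print("bf16   :", time_matmul(torch.bfloat16))
print("fp8e5m2:", time_matmul(torch.float8_e5m2))
```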

@miladm
Collaborator

miladm commented Aug 1, 2024

@lsy323 we should include an accuracy correctness analysis in our benchmarking and documentation.

@qihqi qihqi requested review from miladm and JackCaoG August 5, 2024 17:12
Collaborator

@JackCaoG JackCaoG left a comment

Approving to unblock; mostly LGTM. We can have a follow-up PR to address the review comments.

@lsy323
Collaborator Author

lsy323 commented Aug 6, 2024

Hi @JackCaoG, @miladm, thank you for reviewing! Sorry for the delayed response to the comments.

we should include an accuracy correctness analysis in our benchmarking and documentation.

This PR includes a unit test to verify the correctness of the fp8 dtype and the matmul op. Let's add documentation in a separate PR (after the e4m3 variants are ready).

@lsy323 lsy323 merged commit 1ed2626 into master Aug 6, 2024
21 of 22 checks passed
@lsy323 lsy323 deleted the lsiyuan/add-fp8e5m2 branch August 6, 2024 17:35
@lsy323 lsy323 mentioned this pull request Aug 13, 2024