-
Notifications
You must be signed in to change notification settings - Fork 269
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NVPTX: Add f16 SIMD intrinsics #1626
NVPTX: Add f16 SIMD intrinsics #1626
Conversation
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Amanieu (or someone else) some time within the next two weeks. |
5d937f5
to
dd7e165
Compare
dd7e165
to
21ff034
Compare
Thanks for reviewing! I have fixed the two comments from @Amanieu I have circumvented the naming issue around min/minimum brought up by @CryZe by instead implementing the intrinsics that makes sense to have named as I have also re-run the tests with my branch containing I see that there might be yet another slight naming issue. I have named the functions as |
CUDA seems to have an API for |
I see that the names are consistent with the ptx ISA for functions. I will change all the function names to be consistent with this naming (both CUDA and PTX ISA). I will refrain from adopting the leading double underscore The part I'm a bit more unsure about is naming the type |
21ff034
to
79cb9fa
Compare
I updated the PR a little while ago. I also re-run the rustc assembly tests I have in my branch to verify the correct instructions are still being produced. I also ported a much used function in our (my workplace) internal code and verified that tests were passing. (And there was also a notable performance boost on the relevant benchmarks). And with that, I think this one should be ready for merging. |
79cb9fa
to
4d6fed8
Compare
This implements parts of the
f16x2
intrinsics as discussed in rust-lang/rust#125440 (comment)It's unfortunately not possible to test that these intrinsics produce the correct instructions with the current test infrastructure. This was discussed on zulip
I have created assembly tests in my own branch of
rust-lang
repo that verifies the correct instructions are used. I have added#[assert_instr]
attributes nonetheless. I'm not sure if it's desirable to follow up onrust-lang
with a PR including the tests or not?