Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add 64 bit integer AVX512f comparisons and the intrinsics needed to test them #856

Merged
merged 12 commits into from
May 28, 2020

Conversation

Daniel-B-Smith
Copy link
Contributor

@Daniel-B-Smith Daniel-B-Smith commented May 27, 2020

This PR adds the 64 bit AVX512f integer comparison intrinsics along with set/set1 to make it usable.

Issue: #310

@rust-highfive
Copy link

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @gnzlbg (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@Daniel-B-Smith
Copy link
Contributor Author

The CI errors appear to be unrelated at this point.

I put test_mm512_cmplt_epu64_mask under x86_64 because writing that test without _mm512_set_epi64 is quite hard, and stdarch-verify insisted I move the latter to x86_64. I'm happy to move things as you see fit.

@Amanieu
Copy link
Member

Amanieu commented May 28, 2020

We should probably change __mmask16 to u16 to be consistent with Clang. It's not a breaking change since the AVX512 intrinsics are not stable yet.

Otherwise the changes look good. Are you planning on adding other comparison intrinsics in this PR or in a separate one?

@Amanieu
Copy link
Member

Amanieu commented May 28, 2020

I fixed the CI failures in #857, please rebase.

@Daniel-B-Smith
Copy link
Contributor Author

Merged an updated master and the newest CI run shows a real issue. As far as I can tell, vpcmpltuq is just a synonym for vpcmpuq with the appropriate bits set to indicate less than. Unfortunately, it looks like Clang uses vpcmpltuq and MSVC uses vpcmpuq. Is there a way for me to change the test to say either instruction is fine?

I think I'll add the various 64 bit cmp methods in this PR as a reasonable stopping point. I'll send a separate PR for __mmask16 soon.

@Amanieu
Copy link
Member

Amanieu commented May 28, 2020

assert_instr uses starts_with to check the instruction so you could try passing vpcmp to assert_instr. Add a comment explaining that objdump and dumpbin disassemble the instruction differently.

@Daniel-B-Smith Daniel-B-Smith changed the title Add one AVX512f comparison and the intrinsics needed to test it Add 64 bit integer AVX512f comparisons and the intrinsics needed to test them May 28, 2020
@Daniel-B-Smith
Copy link
Contributor Author

Using the instruction prefix did indeed work.

I've added all of the 64 bit mask/no mask x signed /unsigned x lt/gt/eq intrinsics. That seems like a good stopping point for this PR.

A couple of general questions: How long does it usually take for intrinsics to get into a nightly after a PR is merged? Perhaps more loaded, is there an estimate on when some AVX512 intrinsics would be available on stable Rust?

@Amanieu
Copy link
Member

Amanieu commented May 28, 2020

How long does it usually take for intrinsics to get into a nightly after a PR is merged?

You will need to submit a PR to rust-lang/rust which updates the stdarch submodule. It will then be in the next nightly.

Perhaps more loaded, is there an estimate on when some AVX512 intrinsics would be available on stable Rust?

I would expect all of the intrinsics for a particular subset of AVX512 (eg. AVX512F, AVX512BW) before that subset can be stabilized.

@Amanieu Amanieu merged commit dc0a2b5 into rust-lang:master May 28, 2020
@Amanieu
Copy link
Member

Amanieu commented May 28, 2020

Thanks!

@Daniel-B-Smith
Copy link
Contributor Author

I would expect all of the intrinsics for a particular subset of AVX512 (eg. AVX512F, AVX512BW) before that subset can be stabilized.

This may not be the place to have this discussion, but given the sheer number of instrinsics in AVX512f, it would be nice if it weren't necessary to implement all of them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants