[jit] Add more SSE2 opcodes. Enable SSE2. #33465
Force-pushed from cefc15b to 0981ef8.
Looks like the following are missing:
LGTM since IsSupported is disabled
@tannergooding do you happen to know how coreclr handles intrinsics which require constant args when a variable is passed instead? e.g.

    Vector128<int> Foo(Vector128<int> lhs, byte rhs)
    {
        return Sse2.Shuffle(lhs, rhs);
    }
If the value is constant, we just return a When it gets to codegen, we then check if the value is The fallback is slow, but it ensures things will work in various edge scenarios and devs using HWIntrinsics should be profiling so they'll catch the issue. We've previously discussed an analyzer as something that could help flag non-constant inputs to the user as well, but that hasn't been implemented yet. We also have an issue (#11062) tracking us delaying the

CC. @CarolEidt, @echesakovMSFT
Thanks, so it's basically the same as what Zoltan did (not sure about the 256 limit). I see that all

    /// <summary>
    /// int _mm_extract_epi16 (__m128i a, int immediate)
    /// PEXTRW reg, xmm, imm8
    /// </summary>
    public static ushort Extract(Vector128<ushort> value, byte index) => Extract(value, index);

And I guess we can always emit a jump table even for a const input; LLVM will optimize it anyway (but it probably makes sense to handle const to reduce compilation time).
Yes, because the underlying instruction only accepts an
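
For readers following the exchange above, here is a minimal sketch of what a jump-table fallback for a non-constant immediate could look like if written by hand in C#. It is an illustration only, not the code coreclr or Mono actually generate: `ImmediateFallbackSketch` and `ExtractDynamic` are hypothetical names, and masking the index to its low 3 bits is an assumption made here so that every input maps to one of the eight lanes.

```csharp
using System;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

// Hypothetical helper type, for illustration only.
static class ImmediateFallbackSketch
{
    // Dispatches a runtime index to call sites that each pass a
    // compile-time constant, so every PEXTRW gets a real imm8.
    public static ushort ExtractDynamic(Vector128<ushort> value, byte index)
    {
        int lane = index & 7; // assumption: only the low 3 bits select a lane

        if (!Sse2.IsSupported)
        {
            // Software path for hardware or runtimes without SSE2.
            return value.GetElement(lane);
        }

        // A dense switch like this is typically lowered to a jump table.
        switch (lane)
        {
            case 0: return Sse2.Extract(value, 0);
            case 1: return Sse2.Extract(value, 1);
            case 2: return Sse2.Extract(value, 2);
            case 3: return Sse2.Extract(value, 3);
            case 4: return Sse2.Extract(value, 4);
            case 5: return Sse2.Extract(value, 5);
            case 6: return Sse2.Extract(value, 6);
            default: return Sse2.Extract(value, 7);
        }
    }
}
```

With a constant argument, a compiler can fold the switch down to the single matching case, which is the optimization the comment about LLVM refers to.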
dotnet/runtime#33465 implemented 99% of them but a few were missing. This PR implements the missing pieces and unlocks `Sse2.IsSupported`. @vargaz I hope I didn't step on your toe (feel free to close if you already have it locally).
Co-authored-by: EgorBo <EgorBo@users.noreply.github.com>