changes based on Issue #28731 ((System.Numerics) Cross Product for Ve… #31884
…for Vector2 and Vector4). Added a Cross (product) function for Vector2 and Vector4.
please see discussion at #28731 |
runtime/src/coreclr/src/jit/simdcodegenxarch.cpp Lines 2122 to 2404 in ffcb4b4
It's not likely you can improve on For Once you can cast runtime/src/libraries/System.Private.CoreLib/src/System/Numerics/Matrix4x4.cs Lines 1755 to 1802 in 4f9ae42
That SIMD version might be able to wait for a followup PR, so you could move that one as-is to |
Thanks for the info. I didn't really understand the Intrinsic until now. The C++ code for dot product is beyond me, but it leads me to a new idea. In the vector and matrix classes there are various methods that invoke simple math operations: a * b + c * d (like a dot product) or a * b - c * d. Actually, the latter "difference of products" is really what we see a lot in |
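To make the "difference of products" observation concrete, here is a minimal C++ sketch, not the System.Numerics code. The names `DifferenceOfProducts`, `Vec2`, `Vec3`, and `Cross` are hypothetical and chosen for illustration only. It shows that both a scalar-valued 2D cross product and every component of the 3D cross product reduce to the same a * b - c * d pattern.

```cpp
// Hypothetical helper, named for illustration only: computes a*b - c*d.
inline float DifferenceOfProducts(float a, float b, float c, float d) {
    return a * b - c * d;
}

// Illustrative stand-ins; these are NOT the System.Numerics types.
struct Vec2 { float x, y; };
struct Vec3 { float x, y, z; };

// Scalar 2D cross product: the z component of the 3D cross product
// of (u.x, u.y, 0) and (v.x, v.y, 0).
inline float Cross(Vec2 u, Vec2 v) {
    return DifferenceOfProducts(u.x, v.y, u.y, v.x);
}

// 3D cross product: each component is one difference of products.
inline Vec3 Cross(Vec3 u, Vec3 v) {
    return {
        DifferenceOfProducts(u.y, v.z, u.z, v.y),
        DifferenceOfProducts(u.z, v.x, u.x, v.z),
        DifferenceOfProducts(u.x, v.y, u.y, v.x)
    };
}
```

Whether routing these through one shared helper would help the JIT is a separate question; the sketch only shows the shared arithmetic shape.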
Dot product is a no-brainer for SIMD acceleration because SSE4.1 has a single instruction ( The JIT code has fallback implementations to use other SIMD instructions when SSE4.1 is not available or when the element type is not float. It works out that the SIMD implementation is actually less efficient than the scalar implementation for runtime/src/coreclr/src/jit/simdcodegenxarch.cpp Lines 2252 to 2258 in ffcb4b4
However, because maintenance of the JIT code is very expensive relative to maintenance of the managed code, the work has not been done to optimize for every case. The ability to use the individual SIMD instructions from |
@tannergooding what's the next action here? Should we close this pending the issue ref'd above? |
The referenced issue is resolved and shouldn't block optimizing the |
I'm sorry for my delay in responding. I haven't had the bandwidth or resources during quarantine to respond. In re-reading @saucecontrol's posts, I'm still wondering if the SSE4.1 Dot implementation could be used to speed up functions like |
I wouldn't expect so for |
@micampbell, are you still working on this, or should it be closed for now? Thanks. |
@tannergooding @pgovind Is this something you'd like to pick up and finish, or should we go ahead and close it? |
I think this can be closed for now and the original issue marked up for grabs again, possibly with revisiting the names used. If you take a cross product to be essentially that
Vector2
This means that a "true" 2-dimensional cross product takes one input and is effectively
Vector4
DXMath exposes a four-dimensional cross product as taking 3 inputs so it can compute a truly perpendicular |
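A sketch of the generalized cross product described above, under stated assumptions: in n dimensions it takes n - 1 inputs and returns a vector perpendicular to all of them. The types and function names (`Vec2`, `Vec4`, `Perpendicular`, `Det3`, `Cross`, `Dot`) are hypothetical. The one-input 2D form is assumed here to be the 90-degree rotation (x, y) -> (-y, x), and the three-input 4D form is written as a cofactor expansion. The sign convention is a choice made for this example and is not guaranteed to match DXMath's.

```cpp
// Illustrative stand-ins; these are NOT the System.Numerics or DXMath types.
struct Vec2 { float x, y; };
struct Vec4 { float x, y, z, w; };

// Assumed one-input 2D form: rotate by 90 degrees so the result is
// perpendicular to the input.
inline Vec2 Perpendicular(Vec2 v) {
    return { -v.y, v.x };
}

// Determinant of the 3x3 matrix [a b c; d e f; g h i].
inline float Det3(float a, float b, float c,
                  float d, float e, float f,
                  float g, float h, float i) {
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g);
}

// Three-input 4D cross product via cofactor expansion along the first
// row of the formal 4x4 determinant [e1 e2 e3 e4; a; b; c].
inline Vec4 Cross(Vec4 a, Vec4 b, Vec4 c) {
    return {
         Det3(a.y, a.z, a.w, b.y, b.z, b.w, c.y, c.z, c.w),
        -Det3(a.x, a.z, a.w, b.x, b.z, b.w, c.x, c.z, c.w),
         Det3(a.x, a.y, a.w, b.x, b.y, b.w, c.x, c.y, c.w),
        -Det3(a.x, a.y, a.z, b.x, b.y, b.z, c.x, c.y, c.z)
    };
}

inline float Dot(Vec4 u, Vec4 v) {
    return u.x * v.x + u.y * v.y + u.z * v.z + u.w * v.w;
}
```

Dotting the result with any of the three inputs amounts to evaluating a 4x4 determinant with a repeated row, so it is zero. That is the sense in which the output is perpendicular to every input.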