-
Notifications
You must be signed in to change notification settings - Fork 797
[libspirv] reuse clc function in __spirv_ocl_normalize/fast_normalize #19722
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
libspirv-nvptx64--nvidiacl uses __nv_rsqrtf/isinff/copysignf in these two functions, and libspirv-amdgcn--amdhsa does something similar. To avoid changes to these two bitcode files, the old implementation are kept for these two targets as target-specific implementation. It might be worth upstreaming the use of __nv_* and __ocml_* functions to clc functions and then removing the old __spirv_ocl_* implementations. Delete libspirv/generic/math/floatn.inc. Delete unused clc/relational/floatn.inc and libspirv/generic/math/minmag.inc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I right in thinking the NVPTX/AMDGPU-specific builtins are not coming from __spirv_ocl_(fast)?normalize
but from builtins those are calling? It's a bit of a shame that we need to duplicate the "wrong" function to work around this, to avoid impacting the OpenCL implementations.
We obviously don't want to recompile the entire CLC library for both OpenCL and SPIR-V NVPTX builtins libraries, when only a couple of functions may change. I wonder if we could somehow swap in/out just those builtins with differing implementations but keep the 99% identical module somehow. It's just an idea at this stage so I'm not sure how it'd work.
Even so, that's future work and needn't hold up this patch.
Yes
I have
Another LLVM IR change is copysign. NVPTX/AMDGPU-specific only provides scalar copysign. So it is probably not worthy to use target-specific copysign to scalarize vector llvm.copysign intrinsic.
I've deleted customization files for NVPTX/AMDGPU. |
@intel/llvm-gatekeepers please merge, thanks |
Before this PR, libspirv-nvptx64--nvidiacl uses __nv_rsqrtf/isinf/copysign
in these two functions, and libspirv-amdgcn--amdhsa uses __ocml_rsqrt/copysign.
We can upstream use of target built-in to CLC library:
Use of __nv_rsqrtf/isinf is to be upstreamed by llvm/llvm-project#150174.
Then the changes to libspirv-nvptx64--nvidiacl should be minimized.
Generic implementation of __clc_rsqrt is better than __ocml_rsqrt according to
llvm/llvm-project#152436.
Use of these target built-in scalarize llvm vector intrinsic. So it is
likely better to use generic implementation.
In addition, use of generic implementation aligns with OpenCL library.
Delete libspirv/generic/math/floatn.inc.
Delete unused clc/relational/floatn.inc and libspirv/generic/math/minmag.inc.