Skip to content

[libspirv] reuse clc function in __spirv_ocl_normalize/fast_normalize #19722

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Aug 8, 2025

Conversation

wenju-he
Copy link
Contributor

@wenju-he wenju-he commented Aug 6, 2025

Before this PR, libspirv-nvptx64--nvidiacl uses __nv_rsqrtf/isinf/copysign
in these two functions, and libspirv-amdgcn--amdhsa uses __ocml_rsqrt/copysign.
We can upstream use of target built-in to CLC library:
Use of __nv_rsqrtf/isinf is to be upstreamed by llvm/llvm-project#150174.
Then the changes to libspirv-nvptx64--nvidiacl should be minimized.

Generic implementation of __clc_rsqrt is better than __ocml_rsqrt according to
llvm/llvm-project#152436.
Use of these target built-in scalarize llvm vector intrinsic. So it is
likely better to use generic implementation.
In addition, use of generic implementation aligns with OpenCL library.

Delete libspirv/generic/math/floatn.inc.
Delete unused clc/relational/floatn.inc and libspirv/generic/math/minmag.inc.

libspirv-nvptx64--nvidiacl uses __nv_rsqrtf/isinff/copysignf in these
two functions, and libspirv-amdgcn--amdhsa does something similar. To
avoid changes to these two bitcode files, the old implementation are
kept for these two targets as target-specific implementation. It might
be worth upstreaming the use of __nv_* and __ocml_* functions to clc
functions and then removing the old __spirv_ocl_* implementations.

Delete libspirv/generic/math/floatn.inc.
Delete unused clc/relational/floatn.inc and libspirv/generic/math/minmag.inc.
@wenju-he wenju-he requested a review from a team as a code owner August 6, 2025 11:14
@wenju-he wenju-he requested a review from npmiller August 6, 2025 11:14
@wenju-he wenju-he requested a review from frasercrmck August 6, 2025 11:14
Copy link
Contributor

@frasercrmck frasercrmck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I right in thinking the NVPTX/AMDGPU-specific builtins are not coming from __spirv_ocl_(fast)?normalize but from builtins those are calling? It's a bit of a shame that we need to duplicate the "wrong" function to work around this, to avoid impacting the OpenCL implementations.

We obviously don't want to recompile the entire CLC library for both OpenCL and SPIR-V NVPTX builtins libraries, when only a couple of functions may change. I wonder if we could somehow swap in/out just those builtins with differing implementations but keep the 99% identical module somehow. It's just an idea at this stage so I'm not sure how it'd work.

Even so, that's future work and needn't hold up this patch.

@wenju-he
Copy link
Contributor Author

wenju-he commented Aug 7, 2025

Am I right in thinking the NVPTX/AMDGPU-specific builtins are not coming from __spirv_ocl_(fast)?normalize but from builtins those are calling?

Yes

It's a bit of a shame that we need to duplicate the "wrong" function to work around this, to avoid impacting the OpenCL implementations.

I have

Another LLVM IR change is copysign. NVPTX/AMDGPU-specific only provides scalar copysign. So it is probably not worthy to use target-specific copysign to scalarize vector llvm.copysign intrinsic.

We obviously don't want to recompile the entire CLC library for both OpenCL and SPIR-V NVPTX builtins libraries, when only a couple of functions may change. I wonder if we could somehow swap in/out just those builtins with differing implementations but keep the 99% identical module somehow. It's just an idea at this stage so I'm not sure how it'd work.

I've deleted customization files for NVPTX/AMDGPU.

@wenju-he
Copy link
Contributor Author

wenju-he commented Aug 8, 2025

@intel/llvm-gatekeepers please merge, thanks

@uditagarwal97 uditagarwal97 merged commit 735b688 into intel:sycl Aug 8, 2025
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants