[libspirv] reuse clc function in __spirv_ocl_normalize/fast_normalize #19722

wenju-he · 2025-08-06T11:14:07Z

Before this PR, libspirv-nvptx64--nvidiacl used __nv_rsqrtf/isinf/copysign
in these two functions, and libspirv-amdgcn--amdhsa used __ocml_rsqrt/copysign.
The use of target built-ins can instead be upstreamed to the CLC library:
the use of __nv_rsqrtf/isinf is to be upstreamed by llvm/llvm-project#150174.
After that, the changes to libspirv-nvptx64--nvidiacl should be minimal.

The generic implementation of __clc_rsqrt is better than __ocml_rsqrt according to
llvm/llvm-project#152436.
Using these target built-ins scalarizes the LLVM vector intrinsics, so it is
likely better to use the generic implementation.
In addition, using the generic implementation aligns with the OpenCL library.

Delete libspirv/generic/math/floatn.inc.
Delete unused clc/relational/floatn.inc and libspirv/generic/math/minmag.inc.

libspirv-nvptx64--nvidiacl uses __nv_rsqrtf/isinff/copysignf in these two functions, and libspirv-amdgcn--amdhsa does something similar. To avoid changes to these two bitcode files, the old implementations are kept for these two targets as target-specific implementations. It might be worth upstreaming the use of the __nv_* and __ocml_* functions to the clc functions and then removing the old __spirv_ocl_* implementations. Delete libspirv/generic/math/floatn.inc. Delete unused clc/relational/floatn.inc and libspirv/generic/math/minmag.inc.

frasercrmck

Am I right in thinking the NVPTX/AMDGPU-specific builtins are not coming from __spirv_ocl_(fast)?normalize but from builtins those are calling? It's a bit of a shame that we need to duplicate the "wrong" function to work around this, to avoid impacting the OpenCL implementations.

We obviously don't want to recompile the entire CLC library for both OpenCL and SPIR-V NVPTX builtins libraries, when only a couple of functions may change. I wonder if we could somehow swap in/out just those builtins with differing implementations but keep the 99% identical module somehow. It's just an idea at this stage so I'm not sure how it'd work.

Even so, that's future work and needn't hold up this patch.

wenju-he · 2025-08-07T05:20:48Z

Am I right in thinking the NVPTX/AMDGPU-specific builtins are not coming from __spirv_ocl_(fast)?normalize but from builtins those are calling?

Yes

It's a bit of a shame that we need to duplicate the "wrong" function to work around this, to avoid impacting the OpenCL implementations.

I have:

created PR [libclc] Implement __clc_rsqrt with __ocml_rsqrt_* functions llvm/llvm-project#152436 to upstream the use of __ocml_rsqrt. That PR has been closed because the use of __ocml_rsqrt should be deleted.
updated PR [libclc] Implement clc_log/sinpi/sqrt with __nv_* functions llvm/llvm-project#150174 to implement __clc_rsqrt/isinf with __nv_* functions.

Another LLVM IR change concerns copysign. The NVPTX/AMDGPU-specific built-ins only provide a scalar copysign, so it is probably not worthwhile to use the target-specific copysign, since doing so scalarizes the vector llvm.copysign intrinsic.

We obviously don't want to recompile the entire CLC library for both OpenCL and SPIR-V NVPTX builtins libraries, when only a couple of functions may change. I wonder if we could somehow swap in/out just those builtins with differing implementations but keep the 99% identical module somehow. It's just an idea at this stage so I'm not sure how it'd work.

I've deleted customization files for NVPTX/AMDGPU.

wenju-he · 2025-08-08T04:20:40Z

@intel/llvm-gatekeepers please merge, thanks

wenju-he requested a review from a team as a code owner August 6, 2025 11:14

wenju-he requested a review from npmiller August 6, 2025 11:14

wenju-he temporarily deployed to WindowsCILock August 6, 2025 11:14 — with GitHub Actions Inactive

wenju-he requested a review from frasercrmck August 6, 2025 11:14

wenju-he temporarily deployed to WindowsCILock August 6, 2025 11:42 — with GitHub Actions Inactive

frasercrmck approved these changes Aug 6, 2025

wenju-he added 2 commits August 7, 2025 04:05

remove customization for amdgcn-amdhsa and ptx-nvidiacl

53936f1

remove __FLOAT_ONLY from normalize.cl

7800def

wenju-he temporarily deployed to WindowsCILock August 7, 2025 05:08 — with GitHub Actions Inactive

wenju-he requested a review from frasercrmck August 7, 2025 05:20

wenju-he temporarily deployed to WindowsCILock August 7, 2025 05:33 — with GitHub Actions Inactive

frasercrmck approved these changes Aug 7, 2025

uditagarwal97 merged commit 735b688 into intel:sycl Aug 8, 2025
25 checks passed
