spirv_hip_fp16.h header file updates #896

Merged
merged 5 commits into from
Jul 30, 2024

Collaborator

@jjennychen jjennychen commented Jul 29, 2024

Partially updated spirv_hip_fp16.h based on ROCm release 6.1.0 and added the required OCML functions to spirv_math_fwd.h (the OCML functions were tested and work as expected).
The parts of spirv_hip_fp16.h left unmodified include:

  • Updates to macros defined in other header files
    • Left unchanged because changing them could affect multiple other header files
    • For example, #if __HIP_CLANG_ONLY__ was changed to #if defined(__clang__) && defined(__HIP__) in the new release
  • #includes for AMD-specific header files
    • Did not include header files such as <hip/amd_detail/amd_hip_common.h> and "hip/amd_detail/host_defines.h" (chipStar does have its own host_defines.h, but it appears to still be in progress)
  • Newly added shuffle functions
    • Did not add the shuffle functions from the new header because they are (to my knowledge) hardware-dependent and may need our own implementation
    • The omitted shuffle functions are:
        __device__
        inline
        __half __shfl(__half var, int src_lane, int width = warpSize) {
            union { int i; __half h; } tmp; tmp.h = var;
            tmp.i = __shfl(tmp.i, src_lane, width);
            return tmp.h;
        }
        __device__
        inline
        __half2 __shfl(__half2 var, int src_lane, int width = warpSize) {
            union { int i; __half2 h; } tmp; tmp.h = var;
            tmp.i = __shfl(tmp.i, src_lane, width);
            return tmp.h;
        }
        __device__
        inline
        __half __shfl_up(__half var, unsigned int lane_delta, int width = warpSize) {
            union { int i; __half h; } tmp; tmp.h = var;
            tmp.i = __shfl_up(tmp.i, lane_delta, width);
            return tmp.h;
        }
        __device__
        inline
        __half2 __shfl_up(__half2 var, unsigned int lane_delta, int width = warpSize) {
            union { int i; __half2 h; } tmp; tmp.h = var;
            tmp.i = __shfl_up(tmp.i, lane_delta, width);
            return tmp.h;
        }
        __device__
        inline
        __half __shfl_down(__half var, unsigned int lane_delta, int width = warpSize) {
            union { int i; __half h; } tmp; tmp.h = var;
            tmp.i = __shfl_down(tmp.i, lane_delta, width);
            return tmp.h;
        }
        __device__
        inline
        __half2 __shfl_down(__half2 var, unsigned int lane_delta, int width = warpSize) {
            union { int i; __half2 h; } tmp; tmp.h = var;
            tmp.i = __shfl_down(tmp.i, lane_delta, width);
            return tmp.h;
        }
        __device__
        inline
        __half __shfl_xor(__half var, int lane_mask, int width = warpSize) {
            union { int i; __half h; } tmp; tmp.h = var;
            tmp.i = __shfl_xor(tmp.i, lane_mask, width);
            return tmp.h;
        }
        __device__
        inline
        __half2 __shfl_xor(__half2 var, int lane_mask, int width = warpSize) {
            union { int i; __half2 h; } tmp; tmp.h = var;
            tmp.i = __shfl_xor(tmp.i, lane_mask, width);
            return tmp.h;
        }
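
The wrappers above all follow one pattern: widen the 16-bit (or 2x16-bit) value into an int, run the existing int shuffle, then narrow the result back. Below is a minimal host-side C++ sketch of that pattern, not chipStar's or ROCm's implementation. `HalfBits`, `kWarpSize`, `pack_half`, `unpack_half`, `shfl_model`, and `shfl_half_model` are hypothetical names introduced only for illustration, the warp is modelled as a plain array, and the `width` parameter is not modelled. It uses `std::memcpy` rather than a union because reading an inactive union member is undefined behaviour in standard C++.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for __half: only the raw 16-bit pattern matters here.
struct HalfBits {
    std::uint16_t bits;
};

// Assumed warp size for this host-side model; real hardware values differ.
constexpr int kWarpSize = 8;

// Widen a 16-bit payload into an int, mirroring what the union does in the wrappers.
inline int pack_half(HalfBits h) {
    int i = 0;
    std::memcpy(&i, &h, sizeof(h));
    return i;
}

// Recover the 16-bit payload from the int that was shuffled.
inline HalfBits unpack_half(int i) {
    HalfBits h{};
    std::memcpy(&h, &i, sizeof(h));
    return h;
}

// Host-side model of an int shuffle: return the value held by src_lane.
// Assumes src_lane >= 0; the width parameter of the real __shfl is not modelled.
inline int shfl_model(const std::array<int, kWarpSize>& lanes, int src_lane) {
    return lanes[static_cast<std::size_t>(src_lane % kWarpSize)];
}

// Half-precision shuffle built on the int shuffle: pack, shuffle, unpack.
inline HalfBits shfl_half_model(const std::array<HalfBits, kWarpSize>& lanes,
                                int src_lane) {
    std::array<int, kWarpSize> as_int{};
    for (int l = 0; l < kWarpSize; ++l) {
        as_int[static_cast<std::size_t>(l)] = pack_half(lanes[static_cast<std::size_t>(l)]);
    }
    return unpack_half(shfl_model(as_int, src_lane));
}
```

Because the payload is copied into and out of the same leading bytes of the int, the round trip preserves the bit pattern regardless of endianness. The hardware-dependent part is the int shuffle itself, which is what a chipStar-specific implementation would need to supply.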

(Note: the __llvm_rcp_f16 and __llvm_rcp_2f16 defines were removed because the updated header file no longer uses them; the implementation now uses direct operations such as division.)
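
As a rough illustration of what "direct operations" means for a reciprocal, the sketch below computes it with a plain division instead of calling a dedicated reciprocal intrinsic. This is an assumption-laden host-side example, not the code from the ROCm header: `float` stands in for `__half`, and `Float2`, `rcp_direct`, and `rcp2_direct` are hypothetical names.

```cpp
// Hypothetical stand-in for __half2: two lanes, with float in place of __half.
struct Float2 {
    float x;
    float y;
};

// Scalar reciprocal written as a direct division rather than an rcp intrinsic.
inline float rcp_direct(float v) {
    return 1.0f / v;
}

// Two-lane reciprocal: apply the same division to each component.
inline Float2 rcp2_direct(Float2 v) {
    return Float2{1.0f / v.x, 1.0f / v.y};
}
```

Writing the operation as ordinary arithmetic leaves instruction selection to the compiler, which removes the need for the header to define intrinsic aliases of its own.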

…m-6.1.0 release to fix __half2 issue. Only necessary edits/updates for __half2 are done in this version.
… yet). Added the needed OCML functions in spirv_math_fwd.h based on rocm-6.1.0 release.
…d ocml functions used in spirv_hip_fp16.h to spirv_math_fwd.h
…fp16_updates

pull changes from main to hip_fp16_updates before submitting PR
@pvelesko pvelesko merged commit 8f59383 into main Jul 30, 2024
41 checks passed
2 participants