-
Notifications
You must be signed in to change notification settings - Fork 78
Description
Hi,
As pointed out at ROCm/hipamd#65 , match_any/match_all are not available in HIP.
These are available in CUDA (cf. https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#warp-match-functions ), and can be implemented on AMD GPUs on Vega+ architectures (such intrinsic corresponds to "WaveMatch" in HLSL shader model 6.5 https://microsoft.github.io/DirectX-Specs/d3d/HLSL_ShaderModel6_5.html#wavematch-function which is supported by Vega+).
Therefore it seems like they can and should be added.
match_any
can for example be implemented as seen at llvm/llvm-project#62477 :
static inline __device__ uint64_t __match_any(int value) {
bool active = true;
uint64_t result = 0;
while (active) {
// determine what threads have the same value as the currently first active thread
int first_active_value = __builtin_amdgcn_readfirstlane(value);
int predicate = (value == first_active_value);
uint64_t m = __ballot(predicate); // THIS LINE IS PROBLEMATIC
// if the current thread has the same value, set its result mask to the current one
if (predicate) {
result |= m;
active = false;
}
}
return result;
}
There used to be compiler bugs making it hard to implement them as with the code above, but they have been fixed.
Feel free to use that code if you want to.
Best regards,
Epliz