AMDGPU tweaks #63
Comments
This is an important point, thank you @luraess !!
Yes, and I suspect this may address part of the perf issue you reported wrt tuning kernel launch params.
The behaviour of AMDGPU wrt synchronisation should be fairly similar to CUDA. Unless specified otherwise, kernels are launched on the task-local default device and the normal-priority stream. On the default device and default queue, kernel execution is ordered (and would not need explicit synchronisation).
Looking into the scripts, I see you are using AMDGPU v0.8, which now has the same "convention" as CUDA wrt using threads and blocks as kernel launch params:
gridsize = blocks
and thus lines 7 to 9 of JACC.jl/ext/JACCAMDGPU/JACCAMDGPU.jl (at 041c271) should be changed to follow this convention.
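To illustrate the convention itself (a generic sketch in plain Julia, not the JACC source), the snippet below computes the launch parameters both ways. The helper names `blocks_for`, `old_gridsize` and `new_gridsize` are hypothetical. It assumes that earlier AMDGPU.jl releases interpreted gridsize as the total number of threads in the grid; only the "gridsize = blocks" form is what the comment above describes.

```julia
# Hypothetical helpers for illustration only; plain Julia, no GPU required.

# Number of blocks (workgroups) needed to cover `n` work items
# with `threads` threads per block (ceiling division).
blocks_for(n, threads) = cld(n, threads)

# Assumed older convention: gridsize = total number of threads in the grid.
old_gridsize(n, threads) = blocks_for(n, threads) * threads

# Convention described above (same as CUDA): gridsize = number of blocks.
new_gridsize(n, threads) = blocks_for(n, threads)

n, threads = 1000, 256
println("blocks       = ", blocks_for(n, threads))
println("old gridsize = ", old_gridsize(n, threads))
println("new gridsize = ", new_gridsize(n, threads))
```

With the CUDA-like convention, a launch would then pass the block count directly, along the lines of `@roc groupsize=threads gridsize=blocks kernel(args...)`, where `kernel` and `args` stand in for the actual kernel and its arguments.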
Regarding the issue JuliaGPU/AMDGPU.jl#614, it could be that using weakdeps in tests (adding a Project.toml to /test) solves the issue: since JACC is using extensions, one could make sure to rely truly on the extension mechanism rather than on conditional loading.
Also, it looks like you are running AMDGPU CI on Julia 1.9. There used to be issues because of LLVM on Julia 1.9, and thus Julia 1.10 could globally be preferred (although, depending on the GPU, 1.9 would work fine).