-
Notifications
You must be signed in to change notification settings - Fork 190
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove event system from KA and update to CUDA v4.1 #2924
Conversation
Nice, how do you go about launching overlapping asynchronous kernels this way? |
That's yet to be determined. I was talking to @maleadt about this. Maybe I will add a |
Are the current options ( |
My goal right now is to do all of the changes without introducing new constructs and then evaluate performance from there. |
@vchuravy I recall long ago we discussed introducing a "stream" abstraction as an alternative to events for asynchronous computations? This would parallel CUDA's streams but I guess would have to be something entirely new for CPU? |
In discussion with Tim I decided against it. I tried implementing it on the CPU, but that's basically why I required events everywhere. See https://github.com/JuliaGPU/KernelAbstractions.jl/blob/main/examples/mpi.jl, but Tim and I are bullish on using Julia's native concurrency mechanism, to express what streams normally do. |
Co-authored-by: Gregory L. Wagner <wagner.greg@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Epic work! It's not just an upgrade, the code is much simpler and I feel relieved. My comments are all cosmetic.
src/Models/HydrostaticFreeSurfaceModels/barotropic_pressure_correction.jl
Outdated
Show resolved
Hide resolved
src/Models/HydrostaticFreeSurfaceModels/implicit_free_surface.jl
Outdated
Show resolved
Hide resolved
src/Models/HydrostaticFreeSurfaceModels/split_explicit_free_surface_kernels.jl
Outdated
Show resolved
Hide resolved
src/Models/NonhydrostaticModels/calculate_nonhydrostatic_tendencies.jl
Outdated
Show resolved
Hide resolved
Co-authored-by: Gregory L. Wagner <wagner.greg@gmail.com>
Co-authored-by: Gregory L. Wagner <wagner.greg@gmail.com>
Co-authored-by: Gregory L. Wagner <wagner.greg@gmail.com>
Co-authored-by: Gregory L. Wagner <wagner.greg@gmail.com>
Co-authored-by: Gregory L. Wagner <wagner.greg@gmail.com>
…rrection.jl Co-authored-by: Gregory L. Wagner <wagner.greg@gmail.com>
…ncies.jl Co-authored-by: Gregory L. Wagner <wagner.greg@gmail.com>
Co-authored-by: Gregory L. Wagner <wagner.greg@gmail.com>
…nto vc/ka_transition
We'll have to generate some new benchmarks after this especially for smaller / 1D / 2D problems |
Or even before this is merged?
|
Crayons = "4" | ||
CubedSphere = "0.1, 0.2" | ||
DocStringExtensions = "0.8, 0.9" | ||
FFTW = "1" | ||
Glob = "1.3" | ||
IncompleteLU = "0.2" | ||
IterativeSolvers = "0.9" | ||
JLD2 = "0.4" | ||
KernelAbstractions = "0.8" | ||
JLD2 = "^0.4" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the difference of "0.4"
and "^0.4"
? I thought they are the same.
This likely removes some of the single device parallelism,
but decreases the risk for user mistake (everything is stream-ordered)
This will cause severe merge conflicts with open PRs and depends on currently unreleased
versions of CUDA & KernelAbstractions. So I might regret doing it now, but I was curious
to see what the damage would be.