Would like to pass in more than just numbers and arrays #56

Open
PhilipFackler opened this issue Mar 15, 2024 · 8 comments

Comments

@PhilipFackler
Collaborator

PhilipFackler commented Mar 15, 2024

This commit causes me problems:
6d12657

parallel_for now accepts only Numbers and Arrays (or CuArrays in the cuda case). I'd like to pass in instances of structs and tuples also. Perhaps the function could use Vararg{Any}?
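To illustrate the request, here is a minimal sketch of the difference between a typed and an untyped varargs signature. The names restricted_for, relaxed_for, Params, and kernel are hypothetical stand-ins and are not JACC's actual definitions; the loop is a plain serial loop rather than a real backend.

```julia
# Hypothetical stand-ins; these are NOT JACC's actual definitions.

# Restrictive form: every extra argument must be a Number or an Array.
function restricted_for(f::Function, n::Integer, args::Union{Number,Array}...)
    for i in 1:n
        f(i, args...)
    end
end

# Relaxed form: extra arguments are passed through without a type constraint.
function relaxed_for(f::Function, n::Integer, args...)
    for i in 1:n
        f(i, args...)
    end
end

# A small immutable struct that a user might want to pass to a kernel.
struct Params
    scale::Float64
    offset::Float64
end

function kernel(i, p::Params, x)
    x[i] = p.scale * x[i] + p.offset
end

x = ones(4)
relaxed_for(kernel, 4, Params(2.0, 1.0), x)   # accepted; x is updated in place
# restricted_for(kernel, 4, Params(2.0, 1.0), x) throws a MethodError,
# because Params is neither a Number nor an Array.
```

With the relaxed form the struct is simply forwarded to the kernel, which is the behavior requested here.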

@williamfgc
Collaborator

@michel2323 please let us know your thoughts on this issue. Thanks!

@michel2323
Contributor

Oh ok. This is restrictive. Okay. I'm gonna take a look.

@michel2323
Contributor

The problem with Vararg{Any} is that dispatch and precompilation won't work. We need different types for CPU, CUDA, etc.
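A rough sketch of the dispatch concern, assuming the backend is selected purely by argument types. FakeDeviceArray and which_backend are hypothetical names used only for illustration; FakeDeviceArray stands in for a device array type such as CuArray so the example runs without a GPU package.

```julia
# Hypothetical stand-in for a device array type such as CuArray.
struct FakeDeviceArray{T}
    data::Vector{T}
end

# With typed varargs, the argument types select the method.
which_backend(args::Union{Number,Array}...) = :cpu
which_backend(args::Union{Number,FakeDeviceArray}...) = :device

cpu_choice = which_backend(1.0, [1, 2, 3])
device_choice = which_backend(1.0, FakeDeviceArray([1, 2, 3]))

# If both methods were declared with args::Vararg{Any}, their signatures
# would be identical, so the argument types could no longer tell the
# CPU method apart from the device method.
```

Note that even in this typed form a call with only Numbers matches both methods, so some other mechanism is still needed to pick a backend in that case.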

@michel2323
Contributor

michel2323 commented Apr 4, 2024

How do you want to decide a struct should be on the CPU or GPU in the loop?

@michel2323
Contributor

michel2323 commented Apr 4, 2024

I looked at OpenMP, and they specify the target in the #pragma. So one solution would be to pass a backend argument to parallel_for etc. CUDA.jl, oneAPI.jl, and AMDGPU.jl provide CUDABackend(), oneAPIBackend(), and ROCBackend(), respectively. See for example here: https://github.com/JuliaGPU/CUDA.jl/blob/7f725c0a117c2ba947015f48833630605501fb3a/src/CUDAKernels.jl#L21

For the CPU we could define our own backend or use the one straight from KA (KernelAbstractions.jl). All this would also remove the need for JACC.Array.
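A minimal sketch of the backend-argument idea, assuming dispatch on a first positional argument. AbstractBackend, SerialBackend, ThreadsBackend, backend_for, and saxpy_kernel are hypothetical names for illustration only; a real version would presumably dispatch on backend types such as the ones mentioned above.

```julia
# Hypothetical backend types; not part of JACC or KernelAbstractions.jl.
abstract type AbstractBackend end
struct SerialBackend <: AbstractBackend end
struct ThreadsBackend <: AbstractBackend end

# The backend value selects the method; the remaining arguments are untyped,
# so structs and tuples can be forwarded to the kernel as well.
function backend_for(::SerialBackend, f, n::Integer, args...)
    for i in 1:n
        f(i, args...)
    end
end

function backend_for(::ThreadsBackend, f, n::Integer, args...)
    Threads.@threads for i in 1:n
        f(i, args...)
    end
end

# Each iteration touches only index i, so the threaded loop has no data race.
saxpy_kernel(i, alpha, x, y) = (y[i] += alpha * x[i])

x = collect(1.0:4.0)
y = zeros(4)
backend_for(SerialBackend(), saxpy_kernel, 4, 2.0, x, y)
backend_for(ThreadsBackend(), saxpy_kernel, 4, 2.0, x, y)
```

Because the backend is explicit, the argument types no longer need to carry the CPU/GPU decision.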

@PhilipFackler
Collaborator Author

PhilipFackler commented Apr 4, 2024

> How do you want to decide a struct should be on the CPU or GPU in the loop?

This is when launching a kernel. As long as the struct satisfies isbitstype, I should be able to copy it into the kernel just like a Number. And this works with the earlier version that uses ... (varargs) parameters.
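For reference, a small sketch of what the isbitstype check looks like in plain Julia. Coeffs and Named are hypothetical example types, not taken from any of the packages discussed.

```julia
# Immutable struct with only primitive fields: an isbits type.
struct Coeffs
    a::Float64
    b::Int32
end

# A String field is a reference to heap memory, so this is not an isbits type.
struct Named
    label::String
end

coeffs_ok = isbitstype(Coeffs)
named_ok = isbitstype(Named)

# Tuples whose elements are all isbits values are isbits as well.
tuple_ok = isbits((1.0, Coeffs(2.0, Int32(3))))
```

Here coeffs_ok and tuple_ok are true, while named_ok is false.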

@kmp5VT

kmp5VT commented Apr 22, 2024

@michel2323 @PhilipFackler I have a rough draft of this idea in this PR. It is similar to how we launch CPU vs GPU kernels in the ITensors.jl package.

@williamfgc
Collaborator

williamfgc commented Apr 22, 2024

@kmp5VT thanks, we'd like to explore these ideas while keeping the API really simple for end-users who would like to stay closer to their science and not necessarily to the computational aspects. Otherwise, there is very little added value in using JACC.

A good exercise is to examine the final API and integration effort. Currently, we are prioritizing issues that allow for easy integration with apps and less maintenance. Most of the design decisions come from looking at apps and what they need, and from making incremental progress.

@PhilipFackler posted a minimal example in #51
