
Cannot use lambertw on the GPU #30

Closed · tomchor opened this issue Jan 24, 2024 · 6 comments

tomchor commented Jan 24, 2024

I've been trying to use this function on the GPU, but I always get an error (the original issue I posted is CliMA/Oceananigans.jl#3438). I mostly work with KernelAbstractions.jl indirectly, and the following MWE illustrates the error I'm getting:

using KernelAbstractions
using CUDA: CuArray
using LambertW: lambertw

# Apply lambertw element-wise inside a KernelAbstractions kernel
@kernel function mul2_kernel(A)
    I = @index(Global)
    A[I] = lambertw(A[I])
end

A = CuArray(ones(10, 10))

backend = get_backend(A)                      # CUDABackend for a CuArray
mul2_kernel(backend, 64)(A, ndrange=size(A))  # launch with workgroup size 64
synchronize(backend)
display(A)

Instead of working, this produces a huge error that starts with:

ERROR: LoadError: InvalidIRError: compiling MethodInstance for gpu_mul2_kernel(::KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicCheck, Nothing, CartesianIndices{2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}, KernelAbstractions.NDIteration.NDRange{2, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.StaticSize{(64, 1)}, CartesianIndices{2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}, Nothing}}, ::CUDA.CuDeviceMatrix{Float64, 1}) resulted in invalid LLVM IR
Reason: unsupported call through a literal pointer (call to __memcmp_evex_movbe)
Stacktrace:
 [1] _memcmp
   @ ./strings/string.jl:124
 [2] ==
   @ ./strings/string.jl:136
 [3] multiple call sites
   @ unknown:0
Reason: unsupported call to an unknown function (call to julia.get_pgcstack)
Stacktrace:
 [1] multiple call sites
   @ unknown:0

You can see the whole error here.

I was also able to come up with a simpler example using CUDA.jl directly that produces a similar error, but for some reason it occasionally fails with a segfault instead, so I'm not sure whether the same thing is going on here (or maybe I'm doing something wrong, since I've never really worked with CUDA directly). Here's the CUDA MWE:

using LambertW: lambertw
using CUDA: CuArray

# Broadcasting lambertw over a CuArray also forces GPU compilation of lambertw
A = CuArray(ones(10, 10))
B = lambertw.(A)
display(B)

Is there any way to make lambertw work on the GPU, and more specifically with KernelAbstractions.jl?

Thanks!

jlapeyre (Collaborator) commented

See CliMA/Oceananigans.jl#3438 (comment).

I am pretty busy with my day job, etc. But the best solution I can think of off the top of my head is to start by returning the result of the root finding along with the number of iterations, put a nice interface on it, and avoid the warning altogether.

If you can run a forked or local version, try commenting out the @warn line and see if the error disappears, just to confirm that is really the problem.
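
For concreteness, here is a minimal sketch of that split, with hypothetical names (_lambertw_root and lambertw_warning are illustrative, and a plain Newton iteration stands in for the package's actual root finder): the inner function does only the computation and reports the iteration count, while a wrapper layers the warning on top for CPU users.

# Hypothetical inner function: pure computation, no IO, so it can compile for the GPU.
function _lambertw_root(z, x0, maxits)
    x = x0
    for i in 1:maxits
        ex = exp(x)
        xnew = x - (x * ex - z) / (ex * (x + 1))  # Newton step for x * e^x = z
        abs(xnew - x) <= 2 * eps(abs(x)) && return (xnew, i)  # converged
        x = xnew
    end
    return (x, maxits + 1)  # an iteration count above maxits signals non-convergence
end

# CPU-facing wrapper: keeps the existing warning behavior.
function lambertw_warning(z, x0, maxits)
    x, its = _lambertw_root(z, x0, maxits)
    its > maxits && @warn "lambertw: iteration did not converge"
    return x
end

GPU code would call _lambertw_root directly (or a wrapper without the @warn), so no IO ends up in the compiled kernel.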

tomchor (Author) commented Jan 26, 2024

Thanks for looking into this. I haven't tested it myself, but it seems that the @warn line is indeed the cause of the error here, even though that code path is never actually reached.

I can of course fork the repo, comment that line out, and use it for my purposes, but I'd like to ask whether you'd rather change the code here instead. In general I think it's advantageous for packages to be GPU-compatible, but I'm curious about your thoughts.

If you agree, I could submit a PR changing the function to return the value plus a convergence flag (0 when converged, 1 when not), or something along the lines of what you suggested.

tomchor (Author) commented Jan 26, 2024

Also, I see there's been some discussion about possibly moving this package to SpecialFunctions.jl in JuliaMath/SpecialFunctions.jl#371. In my opinion, moving lambertw there would put more eyes on it for issues like this. Since you seem to be the main developer here according to the contributions page, it would also put less pressure on you to respond to issues like this one when you're busy, which seems advantageous. Just a thought :)

jlapeyre (Collaborator) commented

  • To me, it's very clear that the current design is wrong. There should be at least an inner function, with a good interface, that does not do IO.
  • A PR would be good. I'd like to try to get a good design, because this package has a significant number of users. In particular, it would be best if current users see no visible change; that is, they can keep using the same API they are already using. (There is no reason to drop the warning for these users, but even if we did, I doubt it would make any difference, as I think it is difficult, if not impossible, for the root finding to fail.)
  • You saw an example of the discussion of whether to aggregate or dis-aggregate software. It goes on ad infinitum, but data is never mentioned. We decided to take the first step of moving this repo from my personal account to the JuliaMath org. I thought the wiser prediction was that it would see no change in the attention this package gets, and it seems clear that prediction was borne out: no one but me has done anything to maintain this package. I'm not really against moving it to SpecialFunctions, but I would be very surprised if it received any more attention there just due to its new location. On the other hand, I'd be more reluctant to merge my own PRs there. I'd prefer not to do it here either, but although several other people have merge privileges for this repo, I don't expect any of them to do it.

Perhaps a good starting point is a subtype NoWarn of LambertWMode. This is a minimal change to the API that satisfies the needs of GPU users. So: lambertw(z, k, maxits) as before for CPU users, and maybe lambertw(z, k, maxits; mode=NoWarn()) for GPU users. These would be wrappers around a function that does the computation and returns a pair: the result and a flag indicating success or failure.
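
A minimal sketch of that shape, assuming neither LambertWMode nor NoWarn exists yet in the package (all names are illustrative, initial_guess is hypothetical, and _lambertw_root is the IO-free inner function from the earlier sketch):

# Hypothetical sketch of the proposed API; none of these names exist in the package yet.
abstract type LambertWMode end
struct Warn <: LambertWMode end    # current behavior: warn on non-convergence
struct NoWarn <: LambertWMode end  # GPU-safe: no IO compiled into the kernel

function _maybe_warn(::Warn, converged::Bool)
    converged || @warn "lambertw: iteration did not converge"
    return nothing
end
_maybe_warn(::NoWarn, ::Bool) = nothing  # no-op

# initial_guess is also hypothetical; the real package picks a branch-dependent start.
function lambertw(z, k::Integer=0, maxits::Integer=1000; mode::LambertWMode=Warn())
    x, its = _lambertw_root(z, initial_guess(z, k), maxits)
    _maybe_warn(mode, its <= maxits)
    return x
end

Because the mode is encoded in a type, the NoWarn method is selected at compile time, so the GPU compiler never sees the @warn call or the string machinery behind it.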

jlapeyre (Collaborator) commented Oct 3, 2024

@tomchor

This issue is fixed in

It is released as version 1.0.0 here

jlapeyre closed this as completed Oct 3, 2024
tomchor (Author) commented Oct 3, 2024

Awesome! Thanks
