
Cannot use lambertw on the GPU #30

Closed · tomchor opened this issue Jan 24, 2024 · 6 comments

tomchor commented Jan 24, 2024

I've been trying to use this function on the GPU, but I always get an error (the original issue I posted is CliMA/Oceananigans.jl#3438). I mostly work with KernelAbstractions.jl indirectly, and the following MWE illustrates the error I'm getting:

using KernelAbstractions
using CUDA: CuArray
using LambertW: lambertw

# Apply lambertw element-wise inside a KernelAbstractions kernel
@kernel function mul2_kernel(A)
    I = @index(Global)
    A[I] = lambertw(A[I])
end

A = CuArray(ones(10, 10))

backend = get_backend(A)                      # CUDABackend for a CuArray
mul2_kernel(backend, 64)(A, ndrange=size(A))  # launch with workgroup size 64
synchronize(backend)
display(A)

Instead of working, this produces a huge error that starts with:

ERROR: LoadError: InvalidIRError: compiling MethodInstance for gpu_mul2_kernel(::KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicCheck, Nothing, CartesianIndices{2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}, KernelAbstractions.NDIteration.NDRange{2, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.StaticSize{(64, 1)}, CartesianIndices{2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}, Nothing}}, ::CUDA.CuDeviceMatrix{Float64, 1}) resulted in invalid LLVM IR
Reason: unsupported call through a literal pointer (call to __memcmp_evex_movbe)
Stacktrace:
 [1] _memcmp
   @ ./strings/string.jl:124
 [2] ==
   @ ./strings/string.jl:136
 [3] multiple call sites
   @ unknown:0
Reason: unsupported call to an unknown function (call to julia.get_pgcstack)
Stacktrace:
 [1] multiple call sites
   @ unknown:0

You can see the whole error here.

I was also able to come up with a simpler example using CUDA.jl directly that produces a similar error, but for some reason it occasionally fails with a segfault instead, so I'm not sure whether the same thing is going on here (or maybe I'm doing something wrong, since I've never really worked with CUDA directly). Here's the CUDA MWE:

using LambertW: lambertw
using CUDA: CuArray

# Broadcasting lambertw over a CuArray also forces GPU compilation of lambertw
A = CuArray(ones(10, 10))
B = lambertw.(A)
display(B)

Is there any way to make lambertw work on the GPU, and more specifically with KernelAbstractions.jl?

Thanks!

jlapeyre (Collaborator) commented

See CliMA/Oceananigans.jl#3438 (comment).

I am pretty busy with my day job, etc. But the best solution I can think of off the top of my head is to start by returning the result of the root finding along with the number of iterations, put a nice interface on it, and avoid the warning altogether.

If you can run a forked or local version, try commenting out the @warn line and see if the error disappears, just to confirm that is really the problem.
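
For concreteness, here is a minimal sketch of that split, with hypothetical names (_lambertw_root and lambertw_warning are illustrative, and a plain Newton iteration stands in for the package's actual root finder): the inner function does only the computation and reports the iteration count, while a wrapper layers the warning on top for CPU users.

# Hypothetical inner function: pure computation, no IO, so it can compile for the GPU.
function _lambertw_root(z, x0, maxits)
    x = x0
    for i in 1:maxits
        ex = exp(x)
        xnew = x - (x * ex - z) / (ex * (x + 1))  # Newton step for x * e^x = z
        abs(xnew - x) <= 2 * eps(abs(x)) && return (xnew, i)  # converged
        x = xnew
    end
    return (x, maxits + 1)  # an iteration count above maxits signals non-convergence
end

# CPU-facing wrapper: keeps the existing warning behavior.
function lambertw_warning(z, x0, maxits)
    x, its = _lambertw_root(z, x0, maxits)
    its > maxits && @warn "lambertw: iteration did not converge"
    return x
end

GPU code would call _lambertw_root directly (or a wrapper without the @warn), so no IO ends up in the compiled kernel.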

tomchor (Author) commented Jan 26, 2024

Thanks for looking into this. I haven't tested it myself, but it seems that the @warn line is indeed the cause of the error here, even though that code path is never actually reached.

I can of course fork the repo, comment that line out, and use it for my purposes, but I'd like to ask whether you'd rather change the code here instead. In general I think it's advantageous for packages to be GPU-compatible, but I'm curious about your thoughts.

If you agree, I could submit a PR changing the function to return the value plus a convergence flag (0 when converged, 1 when not), or something along the lines of what you suggested.

tomchor (Author) commented Jan 26, 2024

Also, I see there's been some discussion about possibly moving this package to SpecialFunctions.jl in JuliaMath/SpecialFunctions.jl#371. In my opinion, moving lambertw there would put more eyes on it for issues like this. Since you seem to be the main developer here according to the contributions page, it would also put less pressure on you to respond to issues like this one when you're busy, which seems advantageous. Just a thought :)

jlapeyre (Collaborator) commented

  • To me, it's very clear that the current design is wrong. There should be at least an inner function, with a good interface, that does not do IO.
  • A PR would be good. I'd like to try to get a good design, because this package has a significant number of users. In particular, it would be best if current users see no visible change; that is, they can keep using the same API they are already using. (There is no reason to drop the warning for these users, but even if we did, I doubt it would make any difference, as I think it is difficult, if not impossible, for the root finding to fail.)
  • You saw an example of the discussion of whether to aggregate or dis-aggregate software. It goes on ad infinitum, but data is never mentioned. We decided to take the first step of moving this repo from my personal account to the JuliaMath org. I thought the wiser prediction was that it would see no change in the attention this package gets, and it seems clear that prediction was borne out: no one but me has done anything to maintain this package. I'm not really against moving it to SpecialFunctions, but I would be very surprised if it received any more attention there just due to its new location. On the other hand, I'd be more reluctant to merge my own PRs there. I'd prefer not to do it here either, but although several other people have merge privileges for this repo, I don't expect any of them to do it.

Perhaps a good starting point is a subtype NoWarn of LambertWMode. This is a minimal change to the API that satisfies the needs of GPU users. So: lambertw(z, k, maxits) as before for CPU users, and maybe lambertw(z, k, maxits; mode=NoWarn()) for GPU users. These would be wrappers around a function that does the computation and returns a pair: the result and a flag indicating success or failure.
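
A minimal sketch of that shape, assuming neither LambertWMode nor NoWarn exists yet in the package (all names are illustrative, initial_guess is hypothetical, and _lambertw_root is the IO-free inner function from the earlier sketch):

# Hypothetical sketch of the proposed API; none of these names exist in the package yet.
abstract type LambertWMode end
struct Warn <: LambertWMode end    # current behavior: warn on non-convergence
struct NoWarn <: LambertWMode end  # GPU-safe: no IO compiled into the kernel

function _maybe_warn(::Warn, converged::Bool)
    converged || @warn "lambertw: iteration did not converge"
    return nothing
end
_maybe_warn(::NoWarn, ::Bool) = nothing  # no-op

# initial_guess is also hypothetical; the real package picks a branch-dependent start.
function lambertw(z, k::Integer=0, maxits::Integer=1000; mode::LambertWMode=Warn())
    x, its = _lambertw_root(z, initial_guess(z, k), maxits)
    _maybe_warn(mode, its <= maxits)
    return x
end

Because the mode is encoded in a type, the NoWarn method is selected at compile time, so the GPU compiler never sees the @warn call or the string machinery behind it.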

jlapeyre (Collaborator) commented Oct 3, 2024

@tomchor

This issue is fixed in

It is released as version 1.0.0 here

jlapeyre closed this as completed Oct 3, 2024
tomchor (Author) commented Oct 3, 2024

Awesome! Thanks
