This repository has been archived by the owner on Mar 12, 2021. It is now read-only.

added reverse for arbitrary dimension CuArrays #389

Closed

kraftpunk97-zz

julia> test_shapes = [
         [1, 2, 4, 3],
         [4, 2],
         [5],
         [2^5, 2^5, 2^5],
       ]
4-element Array{Array{Int64,1},1}:
 [1, 2, 4, 3]
 [4, 2]      
 [5]         
 [32, 32, 32]

julia> for testshape ∈ test_shapes
           a = Float32.(reshape(Vector(1:prod(testshape)), testshape...))
           a_ = cu(a)
           b = similar(a)
           b_ = similar(a_)

           println("testing for shape: $testshape")
           for i=1:length(testshape)
               b = reverse(a, dims=i)
               my_reverse(a_, b_, i)
               @test all(cu(b) .== b_)
               println("dimension $i is ok.")
           end
           
           println()
       end
testing for shape: [1, 2, 4, 3]
dimension 1 is ok.
dimension 2 is ok.
dimension 3 is ok.
dimension 4 is ok.

testing for shape: [4, 2]
dimension 1 is ok.
dimension 2 is ok.

testing for shape: [5]
dimension 1 is ok.

testing for shape: [32, 32, 32]
dimension 1 is ok.
dimension 2 is ok.
dimension 3 is ok.
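
For readers who want to see what the test above is checking, here is a minimal CPU sketch of reversing an array along one dimension by remapping linear indices. It is not the PR's `my_reverse` kernel (which is not shown in this conversation); the name `reverse_along_dim!` and the index arithmetic are assumptions chosen for illustration, relying only on Julia's column-major layout.

```julia
# Hypothetical CPU reference, NOT the kernel from this PR.
# Reverses `src` along dimension `dim` and writes the result into `dst`.
function reverse_along_dim!(dst::Array, src::Array, dim::Integer)
    @assert size(dst) == size(src)
    @assert 1 <= dim <= ndims(src)

    # Distance (in elements) between neighbours along `dim` in column-major order.
    stride = 1
    for d in 1:dim-1
        stride *= size(src, d)
    end
    len = size(src, dim)  # extent of the dimension being reversed

    for i in 0:length(src)-1                 # zero-based linear index
        inner = i % stride                   # offset within the faster dimensions
        pos   = (i ÷ stride) % len           # position along `dim`
        outer = i ÷ (stride * len)           # offset within the slower dimensions
        j = inner + (len - 1 - pos) * stride + outer * stride * len
        dst[j + 1] = src[i + 1]              # mirror the position along `dim`
    end
    return dst
end
```

Because each output element depends on exactly one input element, a GPU version could plausibly assign one thread per linear index; that mapping is an assumption here, not something stated in the thread.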

@kraftpunk97-zz
Author

Benchmarks:

shape_ = [2^10, 2^10]
a = Float32.(reshape(Vector(1:prod(shape_)), shape_...))  # CPU input array
a_ = cu(a)        # GPU copy of the input
b = similar(a)
b_ = similar(a_)  # GPU output buffer

julia> @btime b = reverse(a, dims=2)
  740.175 μs (6 allocations: 4.00 MiB)

julia> @btime my_reverse(a_, b_, 2)
  129.330 μs (51 allocations: 1.83 KiB)

[Review comment on src/array.jl: outdated, resolved]
@kraftpunk97-zz
Author

pinging @vchuravy

@kraftpunk97-zz
Author

Pinging @maleadt. This pull request has been going stale. Please review and suggest further changes, if any. It would be awesome if this could get merged 🙂

@maleadt
Member

maleadt commented Sep 16, 2019

Sorry for not responding. Pushed some small improvements.

@vchuravy we do need shared memory when the input and output point to the same array (that is what we do with the other reverse kernel). That should probably be made clearer by having two versions: one in-place with shared memory, and another out-of-place without.
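
To illustrate why aliasing matters, here is a small CPU sketch of the hazard. It is not the kernel code from this PR: the function names are hypothetical, and the temporary buffer only stands in for the role shared memory is assumed to play on the GPU (holding the original values before they are overwritten).

```julia
# Hypothetical helpers for illustration only.

# Out-of-place: reads come from `src`, writes go to `dst`. Safe as long as
# `dst` and `src` are different arrays.
function reverse_out_of_place!(dst::Vector, src::Vector)
    n = length(src)
    for i in 1:n
        dst[n - i + 1] = src[i]
    end
    return dst
end

# In-place with staging: copy the input into a temporary buffer first, so
# every read sees an original value even though `a` is being overwritten.
function reverse_in_place_staged!(a::Vector)
    staged = copy(a)                 # stand-in for a shared-memory tile
    return reverse_out_of_place!(a, staged)
end

a = [1, 2, 3, 4]
ok = reverse_out_of_place!(similar(a), a)    # distinct buffers: correct

aliased = [1, 2, 3, 4]
reverse_out_of_place!(aliased, aliased)      # same buffer: values get clobbered

staged = reverse_in_place_staged!([1, 2, 3, 4])
```

Here `ok` and `staged` are both fully reversed, while `aliased` is not, because later reads pick up values that earlier writes already replaced. On a GPU the same problem shows up as a race between threads rather than a fixed ordering, which is one way to read the suggestion of keeping separate in-place and out-of-place versions.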

@maleadt
Member

maleadt commented Sep 16, 2019

@kraftpunk97 could you check "allow pushes from maintainers"?

@kraftpunk97-zz
Author

It's already selected.

@maleadt maleadt closed this Sep 16, 2019
@maleadt
Member

maleadt commented Sep 16, 2019

Ugh, seems like I broke the PR. I'll open a new one...

@maleadt maleadt mentioned this pull request Sep 16, 2019