really slow reduce over single dimension #362

jumerckx · 2019-07-01T18:36:30Z

reduce (as well as maximum, minimum but not sum...) is really slow on CuArrays when the reduction happens over a single dimension, a tuple of dimensions however, is a lot faster (multiple orders of magnitude).
When I set allowscalar to false , reduce, maximum and minium fail.

To Reproduce
The Minimal Working Example (MWE) for this bug:

using CuArrays
cuin = cu(rand(1000, 100, 100))

@time maximum(cuin, dims=2)
@time maximum(cuin, dims=(1,2))
# run twice ^

CuArrays.allowscalar(false)
maximum(cuin, dims=2)

On my machine, with dims=2 it takes 8.80s while it only takes 0.01s with dims=(2, 1)
After disallowing scalar indexing, the code above throws scalar getindex is disallowed

Environment details
Details on Julia:

Julia Version 1.1.0
Commit 80516ca202 (2019-01-21 21:24 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)
Environment:
  JULIA_EDITOR = "C:\Users\jules\AppData\Local\atom\app-1.38.2\atom.exe"  -a
  JULIA_NUM_THREADS = 4```

The text was updated successfully, but these errors were encountered:

tanhevg · 2019-07-13T22:32:11Z

A quick workaround for this would be something like:

import Base.minimum
function minimum(x::CuArray{T}; dims=1:ndims(x)) where {T}
    mx = fill!(similar(x, T, Base.reduced_indices(x, dims)), typemax(T))
    Base._mapreducedim!(identity, min, mx, x)
end

This reduces the time from 1s to 0.002s for the MWE on my machine. This works, because _mapreducedim! is specialised in CuArrays.

The scalar getindex is hit by mapfoldl which is called by the complex initialisation workflow in Base.reducedim.jl. I guess fixing this properly would entail writing a kernel for mapflodl, however I am not 100% clear why this complex initialisation for min/max is required in the first place.

maleadt · 2019-08-23T11:31:01Z

Yeah sure that would be a good workaround for now.

#362: specialise reduce, minimum and maximum for CuArrays

baggepinnen · 2019-09-03T03:19:06Z

Any ideas on how to backpropagate through maximum, as findmax uses scalar getindex and thus is very slow? I tried stuff like

Base.findmax(x::CuArray{T}; dims=:) where {T} = reduce(x, init=(typemin(T),1), dims=dims) do ((a,ai),(b,bi))
    a > b ? (a,ai) : (b,bi)
end

without any luck

tanhevg · 2019-09-03T11:03:15Z

@baggepinnen: have a look at argmax, but be wary of #304

baggepinnen · 2019-09-03T11:10:35Z

Thanks! In my context I managed to get around the problem by creating a binary mask, but am looking forward to seeing progress on #304 in general.

jumerckx added the bug label Jul 1, 2019

brainsqueeze mentioned this issue Aug 25, 2019

Slow backprop on GPU brainsqueeze/SequenceModels.jl#1

Closed

tanhevg pushed a commit to tanhevg/CuArrays.jl that referenced this issue Aug 27, 2019

JuliaGPU#362: specialise reduce, minimum and maximum for CuArrays

70714d4

tanhevg mentioned this issue Aug 27, 2019

#362: specialise reduce, minimum and maximum for CuArrays #405

Merged

tanhevg pushed a commit to tanhevg/CuArrays.jl that referenced this issue Aug 28, 2019

JuliaGPU#362: More tests and documentation; simplify the dims

b0662eb

maleadt added a commit that referenced this issue Aug 29, 2019

Merge pull request #405 from tanhevg/tanhevg/slow_reduce_362

f7055cb

#362: specialise reduce, minimum and maximum for CuArrays

maleadt closed this as completed Aug 29, 2019

maleadt referenced this issue Nov 4, 2019

Test released Flux.

7642038

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

really slow reduce over single dimension #362

really slow reduce over single dimension #362

jumerckx commented Jul 1, 2019

tanhevg commented Jul 13, 2019 •

edited

Loading

maleadt commented Aug 23, 2019

baggepinnen commented Sep 3, 2019

tanhevg commented Sep 3, 2019

baggepinnen commented Sep 3, 2019

really slow reduce over single dimension #362

really slow reduce over single dimension #362

Comments

jumerckx commented Jul 1, 2019

tanhevg commented Jul 13, 2019 • edited Loading

maleadt commented Aug 23, 2019

baggepinnen commented Sep 3, 2019

tanhevg commented Sep 3, 2019

baggepinnen commented Sep 3, 2019

tanhevg commented Jul 13, 2019 •

edited

Loading