Merge upstream changes. #1

Merged: 44 commits, May 5, 2020

Commits (44)
1a4b08f
Actually test error throwing for lsvqr
kshyatt Feb 24, 2020
31560da
Implement nnz, nonzeros and nonzeroinds
janEbert Jan 25, 2020
22c9ec5
Test nnz, nonzeros and nonzeroinds
janEbert Jan 25, 2020
9fb78a8
Implement getindex
janEbert Apr 2, 2020
3399eb2
Update dependencies.
github-actions[bot] Apr 6, 2020
15bd2ae
Allow scalar indexing where necessary and add a few tests
kshyatt Apr 6, 2020
06a1406
Merge pull request #674 from JuliaGPU/update_manifest
maleadt Apr 6, 2020
3268f62
Fix scalar indexing
kshyatt Apr 7, 2020
3f2ce5b
use the allowscalar macro
kshyatt Apr 8, 2020
cc785c4
Merge #675
bors[bot] Apr 8, 2020
9c41765
Update dependencies.
github-actions[bot] Apr 13, 2020
4c6bb75
Merge pull request #681 from JuliaGPU/update_manifest
maleadt Apr 13, 2020
a1a4dc0
Merge #604
bors[bot] Apr 13, 2020
aeba8c9
Wrapper functions for NNlib (#615)
matsueushi Apr 13, 2020
24c6d47
Update dependencies.
github-actions[bot] Apr 20, 2020
5737865
Merge pull request #686 from JuliaGPU/update_manifest
maleadt Apr 20, 2020
c38da71
Bump version
maleadt Apr 20, 2020
67e322b
fix 1d convolution
AStupidBear Apr 22, 2020
8acbd05
Test getindex
janEbert Apr 2, 2020
4783f83
add comment to fix1d
AStupidBear Apr 23, 2020
df0dc90
Disable the GC after taking pool-related spinlocks.
maleadt Apr 24, 2020
e6b5376
Merge pull request #690 from AStupidBear/conv1d
maleadt Apr 24, 2020
c8d9a9b
Merge pull request #572 from janEbert/sparse
maleadt Apr 24, 2020
92082a5
Use released GPUArrays.
maleadt Apr 24, 2020
2d99e03
Only disable finalizers, not the entire GC.
maleadt Apr 25, 2020
6b1bc0d
Enable finalizers only after releasing the lock.
maleadt Apr 26, 2020
a145023
Update src/memory.jl
maleadt Apr 26, 2020
f380ccd
Fix sparse mul!
amontoison Apr 22, 2020
c0a73bc
Improve mul! testset names
amontoison Apr 23, 2020
2ec414f
Fix A' * x products for complex CuSparseMatrixCSC
amontoison Apr 26, 2020
8a4eee8
Update dependencies.
github-actions[bot] Apr 27, 2020
ee16e77
Merge pull request #694 from JuliaGPU/update_manifest
maleadt Apr 27, 2020
16c0080
Merge #692
bors[bot] Apr 27, 2020
5970046
adds wrappers for syevjBatched/heevjBatched family of CUSOLVER functions
electronsandstuff Apr 27, 2020
2258a24
Merge pull request #693 from JuliaGPU/tb/disable_gc
maleadt Apr 28, 2020
1cab188
Merge pull request #695 from electronsandstuff/electronsandstuff/syev…
maleadt Apr 30, 2020
4689060
Don't require capability 7.5 for testing.
maleadt May 1, 2020
5e081f5
Update dependencies.
github-actions[bot] May 4, 2020
d06fea8
Merge pull request #702 from JuliaGPU/update_manifest
maleadt May 4, 2020
b4ef93d
Repopulate the pool from freed blocks before allocating.
maleadt May 4, 2020
31916fb
Add missing lock.
maleadt May 4, 2020
f44cf19
Merge pull request #704 from JuliaGPU/tb/binned_repopulate
maleadt May 4, 2020
4742b8a
CEnum 0.3 compatibility.
maleadt May 5, 2020
eac33f8
Merge pull request #705 from JuliaGPU/tb/cenum
maleadt May 5, 2020

Files changed

2 changes: 1 addition & 1 deletion .gitlab-ci.yml
@@ -21,7 +21,7 @@ julia:1.3:
- .test
tags:
- nvidia
- sm_75
- sm_70
variables:
CI_THOROUGH: 'true'

71 changes: 25 additions & 46 deletions Manifest.toml
@@ -16,10 +16,10 @@ version = "1.0.1"
uuid = "2a0f44e3-6c83-55bd-87e4-b1978d98bd5f"

[[BinaryProvider]]
deps = ["Libdl", "SHA"]
git-tree-sha1 = "5b08ed6036d9d3f0ee6369410b830f8873d4024c"
deps = ["Libdl", "Logging", "SHA"]
git-tree-sha1 = "428e9106b1ff27593cbd979afac9b45b82372b8c"
uuid = "b99e7846-7c00-51b0-8f62-c81ae34c0232"
version = "0.5.8"
version = "0.5.9"

[[CEnum]]
git-tree-sha1 = "62847acab40e6855a9b5905ccb99c2b5cf6b3ebb"
@@ -34,59 +34,52 @@ version = "4.0.0"

[[CUDAdrv]]
deps = ["CEnum", "CUDAapi", "Printf"]
git-tree-sha1 = "e650cbaee92b60433313157926b1e80d0c3a0e2e"
git-tree-sha1 = "17248da4169c0cdd1699da542f8e110fe4168af6"
uuid = "c5f51814-7f29-56b8-a69c-e4d8f6be1fde"
version = "6.2.2"
version = "6.2.3"

[[CUDAnative]]
deps = ["Adapt", "BinaryProvider", "CEnum", "CUDAapi", "CUDAdrv", "Cthulhu", "DataStructures", "InteractiveUtils", "LLVM", "Libdl", "MacroTools", "Pkg", "Printf", "TimerOutputs"]
git-tree-sha1 = "d1fc99635d0002c8a819b78cb1f441eb44310725"
deps = ["Adapt", "BinaryProvider", "CEnum", "CUDAapi", "CUDAdrv", "Cthulhu", "DataStructures", "ExprTools", "InteractiveUtils", "LLVM", "Libdl", "Pkg", "Printf", "TimerOutputs"]
git-tree-sha1 = "0da071ed49a6f5f62d5164de071daa07cedaa1e6"
uuid = "be33ccc6-a3ff-5ff2-a52e-74243cff1e17"
version = "3.0.2"
version = "3.0.4"

[[CodeTracking]]
deps = ["InteractiveUtils", "UUIDs"]
git-tree-sha1 = "0becdab7e6fbbcb7b88d8de5b72e5bb2f28239f3"
git-tree-sha1 = "cab4da992adc0a64f63fa30d2db2fd8bec40cab4"
uuid = "da1fd8a2-8d9e-5ec2-8556-3022fb5608a2"
version = "0.5.8"

[[Compat]]
deps = ["Base64", "Dates", "DelimitedFiles", "Distributed", "InteractiveUtils", "LibGit2", "Libdl", "LinearAlgebra", "Markdown", "Mmap", "Pkg", "Printf", "REPL", "Random", "Serialization", "SharedArrays", "Sockets", "SparseArrays", "Statistics", "Test", "UUIDs", "Unicode"]
git-tree-sha1 = "ed2c4abadf84c53d9e58510b5fc48912c2336fbb"
uuid = "34da2185-b29b-5c13-b0c7-acf172513d20"
version = "2.2.0"
version = "0.5.11"

[[Cthulhu]]
deps = ["CodeTracking", "InteractiveUtils", "TerminalMenus", "Unicode"]
git-tree-sha1 = "5e0f928ccaab1fa2911fc4e204e8a6f5b0213eaf"
deps = ["CodeTracking", "InteractiveUtils", "REPL", "Unicode"]
git-tree-sha1 = "a4849ec61df9659423cc63b298ed895904ee9743"
uuid = "f68482b8-f384-11e8-15f7-abe071a5a75f"
version = "1.0.0"
version = "1.0.2"

[[DataStructures]]
deps = ["InteractiveUtils", "OrderedCollections"]
git-tree-sha1 = "5a431d46abf2ef2a4d5d00bd0ae61f651cf854c8"
git-tree-sha1 = "6166ecfaf2b8bbf2b68d791bc1d54501f345d314"
uuid = "864edb3b-99cc-5e75-8d2d-829cb0a9cfe8"
version = "0.17.10"
version = "0.17.15"

[[Dates]]
deps = ["Printf"]
uuid = "ade2ca70-3891-5945-98fb-dc099432e06a"

[[DelimitedFiles]]
deps = ["Mmap"]
uuid = "8bb1440f-4735-579b-a4ab-409b98df4dab"

[[Distributed]]
deps = ["Random", "Serialization", "Sockets"]
uuid = "8ba89e20-285c-5b6f-9357-94700520ee1b"

[[ExprTools]]
git-tree-sha1 = "6f0517056812fd6aa3af23d4b70d5325a2ae4e95"
uuid = "e2ba6199-217a-4e67-a87a-7c52f15ade04"
version = "0.1.1"

[[GPUArrays]]
deps = ["AbstractFFTs", "Adapt", "LinearAlgebra", "Printf", "Random", "Serialization"]
git-tree-sha1 = "50542dca6e8339a5e0a6718283f956187123234a"
repo-rev = "cb79e08c09ca0eb776c1ded7b7fe8876bd012981"
repo-url = "https://github.com/JuliaGPU/GPUArrays.jl.git"
git-tree-sha1 = "c63cb01e3b6f48ab39f1e35c31ba870650814a18"
uuid = "0c68f7d7-f131-5f86-a1c3-88cf8149b2d7"
version = "3.1.0"
version = "3.2.0"

[[InteractiveUtils]]
deps = ["Markdown"]
@@ -99,7 +92,6 @@ uuid = "929cbde3-209d-540e-8aea-75f648917ca0"
version = "1.3.4"

[[LibGit2]]
deps = ["Printf"]
uuid = "76f85450-5226-5b5a-8eaa-529ad045b433"

[[Libdl]]
@@ -122,9 +114,6 @@ version = "0.5.5"
deps = ["Base64"]
uuid = "d6f4376e-aef5-505a-96c1-9c027394607a"

[[Mmap]]
uuid = "a63ad114-7e13-5084-954f-fe012c677804"

[[NNlib]]
deps = ["BinaryProvider", "Libdl", "LinearAlgebra", "Requires", "Statistics"]
git-tree-sha1 = "d9f196d911f55aeaff11b11f681b135980783824"
@@ -138,7 +127,7 @@ uuid = "bac558e1-5e72-5ebc-8fee-abe8a469f55d"
version = "1.1.0"

[[Pkg]]
deps = ["Dates", "LibGit2", "Libdl", "Logging", "Markdown", "Printf", "REPL", "Random", "SHA", "UUIDs"]
deps = ["Dates", "LibGit2", "Libdl", "Logging", "Markdown", "Printf", "REPL", "Random", "SHA", "Test", "UUIDs"]
uuid = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"

[[Printf]]
@@ -171,10 +160,6 @@ uuid = "ea8e919c-243c-51af-8825-aaa63cd721ce"
[[Serialization]]
uuid = "9e88b42a-f829-5b0c-bbe9-9e923198166b"

[[SharedArrays]]
deps = ["Distributed", "Mmap", "Random", "Serialization"]
uuid = "1a1011a3-84de-559e-8e89-a11a2f7dc383"

[[Sockets]]
uuid = "6462fe0b-24de-5631-8697-dd941f90decc"

@@ -186,21 +171,15 @@ uuid = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"
deps = ["LinearAlgebra", "SparseArrays"]
uuid = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"

[[TerminalMenus]]
deps = ["Compat", "REPL", "Test"]
git-tree-sha1 = "9ae6ed0c94eee4d898e049820942af21daf15efc"
uuid = "dc548174-15c3-5faf-af27-7997cfbde655"
version = "0.1.0"

[[Test]]
deps = ["Distributed", "InteractiveUtils", "Logging", "Random"]
uuid = "8dfed614-e22c-5e08-85e1-65c5234f0b40"

[[TimerOutputs]]
deps = ["Printf"]
git-tree-sha1 = "311765af81bbb48d7bad01fb016d9c328c6ede03"
git-tree-sha1 = "0cc8db57cb537191b02948d4fabdc09eb7f31f98"
uuid = "a759f4b9-e2f1-59dc-863e-4aeb61b1ea8f"
version = "0.5.3"
version = "0.5.5"

[[UUIDs]]
deps = ["Random", "SHA"]
6 changes: 3 additions & 3 deletions Project.toml
@@ -1,6 +1,6 @@
name = "CuArrays"
uuid = "3a865a2d-5b23-5a0f-bc46-62713ec82fae"
version = "2.0.1"
version = "2.1.0"

[deps]
AbstractFFTs = "621f4979-c628-5d54-868e-fcf4e3e8185c"
@@ -27,12 +27,12 @@ TimerOutputs = "a759f4b9-e2f1-59dc-863e-4aeb61b1ea8f"
[compat]
AbstractFFTs = "0.4, 0.5"
Adapt = "1.0"
CEnum = "0.2"
CEnum = "0.2, 0.3"
CUDAapi = "3.0, 4.0"
CUDAdrv = "6.0.1"
CUDAnative = "3.0"
DataStructures = "0.17"
GPUArrays = "3.1"
GPUArrays = "3.2"
MacroTools = "0.5"
NNlib = "0.6.5"
Reexport = "0.2"
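
Note: the `[compat]` change above is what the "CEnum 0.3 compatibility" commit refersts to — listing two entries widens the allowed range to both release series. A small sketch of how such a bound resolves, using Pkg's internal `semver_spec` helper (internal API, so treat as an assumption; the version numbers are only for illustration):

```julia
using Pkg

# "0.2, 0.3" accepts any release in the 0.2.x or 0.3.x series, i.e. the union
# of the semver-compatible ranges [0.2.0, 0.3.0) and [0.3.0, 0.4.0).
spec = Pkg.Types.semver_spec("0.2, 0.3")

@assert v"0.2.7" in spec
@assert v"0.3.1" in spec
@assert !(v"0.4.0" in spec)
```
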
32 changes: 23 additions & 9 deletions src/dnn/nnlib.jl
@@ -41,13 +41,25 @@ end

# Convolution

# Since CUDNN does not support 1D convolution, Conv in Flux will give a CUDNNError if the size is 1-dimensional.
# We have to reshape the CuArray/PoolDims/DenseConvDims to 4D before feeding to CUDNN.
fix1d(x) = x

fix1d(x::CuArray{T, 3}) where T = reshape(x, size(x, 1), 1, size(x, 2), size(x, 3))

fix1d(cdims::DenseConvDims{1,K,C_in,C_out,S,P,D,F}) where {K,C_in,C_out,S,P,D,F} =
DenseConvDims{2,(K...,1),C_in,C_out,(S...,1),(P...,0,0),(D...,1),F}((cdims.I...,1))

fix1d(pdims::PoolDims{1,K,S,P,D}) where {K,S,P,D,F} =
PoolDims{2,(K...,1),(S...,1),(P...,0,0),(D...,1)}((pdims.I..., 1), pdims.C_in)

function conv!(y::CuArray{T}, x::CuArray{T}, w::CuArray{T}, cdims::DenseConvDims;
alpha=1, algo=0) where T<:CUDNNFloat
if version() < v"6"
all(x -> x == 1, dilation(cdims)) || error("Only dilation = 1 is supported in cuDNN version < 6")
end

cudnnConvolutionForward(y, x, w, cdims, alpha=alpha, algo=algo)
cudnnConvolutionForward(fix1d(y), fix1d(x), fix1d(w), fix1d(cdims), alpha=alpha, algo=algo)
return y
end

function ∇conv_filter!(dw::CuArray{T}, x::CuArray{T}, dy::CuArray{T},
@@ -56,7 +68,8 @@ function ∇conv_filter!(dw::CuArray{T}, x::CuArray{T}, dy::CuArray{T},
all(x -> x == 1, dilation(cdims)) || error("Only dilation = 1 is supported in cuDNN version < 6")
end

cudnnConvolutionBackwardFilter(dw, x, dy, cdims, alpha=alpha, algo=algo)
cudnnConvolutionBackwardFilter(fix1d(dw), fix1d(x), fix1d(dy), fix1d(cdims), alpha=alpha, algo=algo)
return dw
end

function ∇conv_data!(dx::CuArray{T}, dy::CuArray{T}, w::CuArray{T},
@@ -65,22 +78,23 @@ function ∇conv_data!(dx::CuArray{T}, dy::CuArray{T}, w::CuArray{T},
all(x -> x == 1, dilation(cdims)) || error("Only dilation = 1 is supported in cuDNN version < 6")
end

cudnnConvolutionBackwardData(dx, w, dy, cdims, alpha=alpha, algo=algo)
cudnnConvolutionBackwardData(fix1d(dx), fix1d(w), fix1d(dy), fix1d(cdims), alpha=alpha, algo=algo)
return dx
end

∇conv_bias!(db::CuArray{T}, dy::CuArray{T}; alpha=1, beta=0) where T<:CUDNNFloat =
cudnnConvolutionBackwardBias(db, dy, alpha=alpha, beta=beta)
(cudnnConvolutionBackwardBias(fix1d(db), fix1d(dy), alpha=alpha, beta=beta); return db)

maxpool!(y::CuArray{T}, x::CuArray{T}, pdims::PoolDims) where T<:CUDNNFloat =
cudnnPoolingForward(y, x, pdims; mode=0)
(cudnnPoolingForward(fix1d(y), fix1d(x), fix1d(pdims); mode=0); return y)

∇maxpool!(dx::CuArray{T}, dy::CuArray{T}, y::CuArray{T}, x::CuArray{T},
pdims::PoolDims) where T<:CUDNNFloat =
cudnnPoolingBackward(dx, dy, x, y, pdims, mode=0)
(cudnnPoolingBackward(fix1d(dx), fix1d(dy), fix1d(x), fix1d(y), fix1d(pdims), mode=0); return dx)

meanpool!(y::CuArray{T}, x::CuArray{T}, pdims::PoolDims) where T<:CUDNNFloat =
cudnnPoolingForward(y, x, pdims, mode=1)
(cudnnPoolingForward(fix1d(y), fix1d(x), fix1d(pdims), mode=1); return y)

∇meanpool!(dx::CuArray{T}, dy::CuArray{T}, y::CuArray{T}, x::CuArray{T},
pdims::PoolDims) where T<:CUDNNFloat =
cudnnPoolingBackward(dx, dy, x, y, pdims, mode=1)
(cudnnPoolingBackward(fix1d(dx), fix1d(dy), fix1d(x), fix1d(y), fix1d(pdims), mode=1); return dx)
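
Note: the new `fix1d` helpers in the diff above work around cuDNN's lack of native 1D convolution by inserting a singleton dimension so that every array and dims object becomes 4D before it reaches cuDNN. A minimal sketch of the array case (it mirrors the `CuArray{T,3}` method from the diff, assumes a CUDA-capable GPU with CuArrays installed, and uses made-up sizes and a hypothetical `to4d` name for illustration):

```julia
using CuArrays

# NNlib stores 1D convolution data as (width, channels, batch).
x = CuArray(rand(Float32, 28, 3, 8))   # input:  width 28, 3 channels, batch of 8
w = CuArray(rand(Float32, 5, 3, 16))   # kernel: width 5, 3 in channels, 16 out channels

# cuDNN only accepts 4D (width, height, channels, batch) tensors, so insert a
# singleton "height" dimension, just as fix1d(x::CuArray{T,3}) does in the diff.
to4d(x) = reshape(x, size(x, 1), 1, size(x, 2), size(x, 3))

@assert size(to4d(x)) == (28, 1, 3, 8)
@assert size(to4d(w)) == (5, 1, 3, 16)
```

The `DenseConvDims` and `PoolDims` methods in the diff apply the same idea to the convolution metadata, appending a kernel size of 1, stride of 1, dilation of 1, and zero padding for the extra dimension.
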
2 changes: 1 addition & 1 deletion src/dnn/rnn.jl
@@ -47,7 +47,7 @@ function RNNDesc{T}(mode::cudnnRNNMode_t, input::Int, hidden::Int; layers = 1) w
inputMode = CUDNN_LINEAR_INPUT
direction = CUDNN_UNIDIRECTIONAL
algo = CUDNN_RNN_ALGO_STANDARD
cudnnSetRNNDescriptor_v6(handle(),d[],hidden,layers,dropoutDesc,cudnnRNNInputMode_t(inputMode),cudnnDirectionMode_t(direction),mode,cudnnRNNAlgo_t(algo),cudnnDataType(T))
cudnnSetRNNDescriptor_v6(handle(),d[],hidden,layers,dropoutDesc,inputMode,direction,mode,algo,cudnnDataType(T))

w =CuArrays.zeros(T, rnnParamSize(T, d[], input))
# TODO: avoid reserve allocation here
22 changes: 20 additions & 2 deletions src/memory.jl
@@ -11,6 +11,24 @@ using Base.Threads: SpinLock
# each allocator needs to lock its own resources separately too.
const memory_lock = SpinLock()

# the above spinlocks are taken around code that might gc, which might cause a deadlock
# if we try to acquire from the finalizer too. avoid that by temporarily disabling running finalizers,
# concurrently on this thread.
enable_finalizers(on::Bool) = ccall(:jl_gc_enable_finalizers, Cvoid, (Ptr{Cvoid}, Int32,), Core.getptls(), on)
macro safe_lock(l, ex)
quote
temp = $(esc(l))
lock(temp)
enable_finalizers(false)
try
$(esc(ex))
finally
unlock(temp)
enable_finalizers(true)
end
end
end

const MEMDEBUG = ccall(:jl_is_memdebug, Bool, ())


@@ -80,7 +98,7 @@ function actual_alloc(bytes)
ptr = convert(CuPtr{Nothing}, buf)

# record the buffer
@lock memory_lock begin
@safe_lock memory_lock begin
@assert !haskey(allocated, ptr)
allocated[ptr] = buf
end
@@ -94,7 +112,7 @@

function actual_free(ptr::CuPtr{Nothing})
# look up the buffer
buf = @lock memory_lock begin
buf = @safe_lock memory_lock begin
buf = allocated[ptr]
delete!(allocated, ptr)
buf
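
Note: the `@safe_lock` macro added to src/memory.jl above is the heart of the finalizer-related commits in this PR. While the spinlock is held, finalizers are switched off for the current thread so that a GPU-freeing finalizer cannot try to take the same lock and deadlock, and they are re-enabled only after the lock is released. A self-contained sketch of the pattern (the macro and the `jl_gc_enable_finalizers` ccall mirror the diff; the registry, `registry_lock`, and `register!` names are hypothetical usage, not part of CuArrays):

```julia
using Base.Threads: SpinLock

# Toggle finalizer execution for the current thread (same ccall as in src/memory.jl).
enable_finalizers(on::Bool) =
    ccall(:jl_gc_enable_finalizers, Cvoid, (Ptr{Cvoid}, Int32), Core.getptls(), on)

# Lock `l` with finalizers disabled, and re-enable them only after unlocking,
# matching the ordering fixed by "Enable finalizers only after releasing the lock."
macro safe_lock(l, ex)
    quote
        temp = $(esc(l))
        lock(temp)
        enable_finalizers(false)
        try
            $(esc(ex))
        finally
            unlock(temp)
            enable_finalizers(true)
        end
    end
end

# Hypothetical usage: a registry that both allocation code and finalizers touch.
const registry_lock = SpinLock()
const registry = Dict{Ptr{Nothing},Int}()

register!(ptr::Ptr{Nothing}, bytes::Int) =
    @safe_lock registry_lock begin
        registry[ptr] = bytes
    end
```

This is why `actual_alloc` and `actual_free` in the diff switch from `@lock` to `@safe_lock` around the `allocated` bookkeeping.
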