error when saving fully sparse array #5

Closed
tbenst opened this issue Aug 17, 2021 · 3 comments


@tbenst
Contributor

tbenst commented Aug 17, 2021

julia> using H5Sparse, SparseArrays
julia> h5_path = tempname()*".h5"
julia> data = zeros(100,1)
julia> H5SparseMatrixCSC(h5_path, "test", sparse(data))
HDF5-DIAG: Error detected in HDF5 (1.12.0) thread 0:
  #000: H5Pdcpl.c line 2004 in H5Pset_chunk(): chunk dimensionality must be positive
    major: Invalid arguments to routine
    minor: Out of range
ERROR: LoadError: Error setting chunk size
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:33
  [2] h5p_set_chunk
    @ ~/.julia/packages/HDF5/0iEnL/src/api.jl:1211 [inlined]
  [3] set_chunk(::HDF5.Properties)
    @ HDF5 ~/.julia/packages/HDF5/0iEnL/src/HDF5.jl:1915
  [4] _prop_set!(p::HDF5.Properties, name::Symbol, val::Vector{Int64}, check::Bool)
    @ HDF5 ~/.julia/packages/HDF5/0iEnL/src/HDF5.jl:837
  [5] create_property(class::Int64; pv::Base.Iterators.Pairs{Symbol, Any, Tuple{Symbol, Symbol}, NamedTuple{(:chunk, :blosc), Tuple{Vector{Int64}, Int64}}})
    @ HDF5 ~/.julia/packages/HDF5/0iEnL/src/HDF5.jl:865
  [6] create_dataset(parent::HDF5.Group, path::String, dtype::HDF5.Datatype, dspace::HDF5.Dataspace; pv::Base.Iterators.Pairs{Symbol, Any, Tuple{Symbol, Symbol}, NamedTuple{(:chunk, :blosc), Tuple{Vector{Int64}, Int64}}})
    @ HDF5 ~/.julia/packages/HDF5/0iEnL/src/HDF5.jl:729
  [7] #create_dataset#32
    @ ~/.julia/packages/HDF5/0iEnL/src/HDF5.jl:737 [inlined]
  [8] h5writecsc(fid::HDF5.File, name::String, m::Int64, n::Int64, colptr::Vector{Int64}, rowval::Vector{Int64}, nzval::Vector{Float64}; overwrite::Bool, chunk::Nothing, blosc::Int64, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ H5Sparse ~/.julia/packages/H5Sparse/oVw2O/src/H5Sparse.jl:250
  [9] h5writecsc
    @ ~/.julia/packages/H5Sparse/oVw2O/src/H5Sparse.jl:231 [inlined]
 [10] #h5writecsc#3
    @ ~/.julia/packages/H5Sparse/oVw2O/src/H5Sparse.jl:228 [inlined]
 [11] h5writecsc
    @ ~/.julia/packages/H5Sparse/oVw2O/src/H5Sparse.jl:227 [inlined]
 [12] H5SparseMatrixCSC(fid::HDF5.File, name::String, B::SparseMatrixCSC{Float64, Int64}; kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ H5Sparse ~/.julia/packages/H5Sparse/oVw2O/src/H5Sparse.jl:86
 [13] H5SparseMatrixCSC(fid::HDF5.File, name::String, B::SparseMatrixCSC{Float64, Int64})
    @ H5Sparse ~/.julia/packages/H5Sparse/oVw2O/src/H5Sparse.jl:86
 [14] H5SparseMatrixCSC(::String, ::String, ::Vararg{Any, N} where N; kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ H5Sparse ~/.julia/packages/H5Sparse/oVw2O/src/H5Sparse.jl:84
 [15] H5SparseMatrixCSC(::String, ::String, ::Vararg{Any, N} where N)
    @ H5Sparse ~/.julia/packages/H5Sparse/oVw2O/src/H5Sparse.jl:84
 [16] top-level scope
    @ ~/code/lensman/notebooks/debug2.jl:9
in expression starting at /home/tyler/code/lensman/notebooks/debug2.jl:9
@severinson
Owner

Good catch. It seems that HDF5.heuristic_chunk returns Int64[] when called with a zero-length vector. We could substitute Int64[1] in that case, which would solve the issue. Performance would be terrible if more data were appended to the matrix later, but it's not clear what a better default than Int64[1] would be.
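
A minimal sketch of that fallback, using only the SparseArrays standard library. The helper name `fallback_chunk` is hypothetical and is not part of H5Sparse or HDF5.jl. The empty `Int[]` passed to it stands in for the empty chunk reported above, so the snippet does not call HDF5 at all.

```julia
using SparseArrays

# Hypothetical helper (an assumption for illustration, not library code):
# replace an empty chunk suggestion with a chunk of 1 along each dimension.
function fallback_chunk(suggested::Vector{Int}, data::AbstractArray)
    # An empty chunk specification is what leads to the
    # "chunk dimensionality must be positive" error in the report.
    isempty(suggested) ? fill(1, ndims(data)) : suggested
end

A = sparse(zeros(100, 1))   # fully sparse: no stored entries
nzval = nonzeros(A)         # zero-length vector of stored values

# Int[] stands in for the empty chunk suggested for a zero-length vector.
chunk = fallback_chunk(Int[], nzval)
```

A non-empty suggestion is passed through unchanged, so only the zero-length case is affected.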

@tbenst
Contributor Author

tbenst commented Aug 17, 2021

I think [] is, or at least is intended to be, a valid chunk: https://github.com/JuliaIO/HDF5.jl/blob/56c795cf876bcaeb37c9143fcccec01cbf0b1f30/src/HDF5.jl#L357

@severinson
Owner

It looks like it, but create_dataset fails when chunk=[]. I've made a series of commits that address this issue and allow matrices with no stored entries to be written to disk, by adding some code to handle empty vectors. Saving matrices with a zero-length dimension still doesn't work, though (e.g., sprand(0, 0)).
541781c acefb07 501335e
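
To make the distinction between the two cases concrete, here is a small sketch using only the SparseArrays standard library. It uses `spzeros` rather than `sprand` so that the result is deterministic, and it inspects the CSC fields directly; it does not touch H5Sparse or HDF5.

```julia
using SparseArrays

# Case 1: non-zero dimensions but no stored entries (the case reported above).
A = sparse(zeros(100, 1))
# colptr has one entry per column plus one; rowval and nzval are empty.
a_colptr_len = length(A.colptr)
a_stored = nnz(A)

# Case 2: a matrix with zero-length dimensions.
B = spzeros(0, 0)
# With zero columns, colptr holds a single entry and nothing is stored.
b_colptr = B.colptr
b_stored = nnz(B)
```

In both cases `rowval` and `nzval` are zero-length, but only the second also has a zero-sized shape.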
