-
Notifications
You must be signed in to change notification settings - Fork 142
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add copyto! and similar for Datasets #937
Conversation
src/views.jl
Outdated
parent::Dataset | ||
indices |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to store the dataspace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is worthwhile eventually, but I think we should defer this to a larger overhaul of Dataset
.
In particular, we should consider parameterizing Dataset
and making it an AbstractArray
or AbstractDiskArray
for version 0.17.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #930
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm leaning towards removing the view feature from this pull request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's a good idea, but I would suggest making it actually a functional part: i.e. calling read!(dataset, array, I...)
would internally call read!(view(dataset, I...), array)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. I just think the view
component is much more complicated than the other parts.
src/HDF5.jl
Outdated
Read [part of] a dataset or attribute into a preallocated output buffer. | ||
The output buffer must be convertible to a pointer and have a contiguous layout. | ||
""" | ||
function Base.read!(obj::DatasetOrAttribute, buf::AbstractArray{T}, I...) where T |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious why do we need this method? Seems like copyto!
alone would be sufficient as per the AbstractArray interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We currently implement Base.read
, so why not Base.read!
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should define this method, because we are moving more and more having the HDF5.Dataset
type satisfying the AbstractArray
interface. Users would expect copyto!
as that is the standard name for this function, thus additionally adding Base.read!
seems unnecessary.
I understand that we are defining methods on Base.read
throughout the HDF5
module, but I don't see that we should extend that and add Base.read!
The definition in IO/Networking, also suggests that this wouldn't be the best overload, since it's used by default for reading from a filename or stream.
https://docs.julialang.org/en/v1/base/io-network/#Base.read!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Base.read!
has been removed.
src/HDF5.jl
Outdated
|
||
Return a `Array{T}` or `Matrix{UInt8}` to that can contain [part of] the dataset. | ||
""" | ||
function Base.similar(obj::DatasetOrAttribute, I::Integer...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question, why do we need to pass the indices here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Base.similar
accepts an optional size value. These are not indices. I suppose we should rename the argument.
https://docs.julialang.org/en/v1/base/arrays/#Base.similar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was more specifically looking at the AbstractArray interface https://docs.julialang.org/en/v1/manual/interfaces/#man-interface-array, where there are 4 optional similar
methods that we could define:
Optional methods | Default definition | Brief description |
---|---|---|
similar(A) | similar(A, eltype(A), size(A)) | Return a mutable array with the same shape and element type |
similar(A, ::Type{S}) | similar(A, S, size(A)) | Return a mutable array with the same shape and the specified element type |
similar(A, dims::Dims) | similar(A, eltype(A), dims) | Return a mutable array with the same element type and size dims |
similar(A, ::Type{S}, dims::Dims) | Array{S}(undef, dims) | Return a mutable array with the specified element type and size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we do implement AbstractArray in 0.17, then we do not need this signature. We could just implement the version with Dims. However, since we are not implementing AbstractArray in 0.16, we probably do need to retain a method with the above signature since we do not have a similar(a::AbstractArray{T}, dims::Union{Integer, AbstractUnitRange}...)
to cover that case for us.
I will rename this to Base.similar(obj::DatasetOrAttribute, dims::Dims)
. We probably sill need Base.similar, dims::Integer...)
as well, until we actually do implement AbstractArray
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I think read!
makes more sense as long as we're using read
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@simonbyrne did you intend to comment on the other thread? Is there a specific part of #937 (comment) you disagree with?
The goal is not to remove copyto!
and replace it with read!
. Originally, @mkitti had read!
as an alias for copyto!
, which we require in order to satisfy part of the AbstractArray
interface. My comment on read!
was that this was both redundant with copyto!
and unexpected since read!
is used in base Julia for reading from an IO stream/filename, which is very different than what copyto!
accomplishes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh yes, sorry...
src/HDF5.jl
Outdated
function Base.similar( | ||
obj::DatasetOrAttribute, | ||
::Type{T}, | ||
dims::Integer...; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not have dims::Dims
here and below?
src/HDF5.jl
Outdated
buf::Union{AbstractMatrix{UInt8}, AbstractArray{T}, Nothing}, I...) where T | ||
|
||
sz, scalar, dspace = _size_of_buffer(obj, I) | ||
memtype = _memtype(filetype, T) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe move this below the buffer size checks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I rearranged this and added more try blocks so we can make sure to close out the handles.
Overall LGTM sans minor comments. Latest changes are much improved. |
I'm not entirely sure about this |
My plan after this is to revisit views of |
Awesome, thanks! The improvements and clean up to the generic read function are defintely welcome. |
Closes #580, #409
Here we implement
,read!
copyto!
, andsimilar
to allow data to be read into existing array buffers rather than allocating new buffers.