-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MATLAB] Subclass arrow::Buffer
to keep MATLAB data backing arrow::Arrays
alive
#36614
Comments
take |
sgilmore10
changed the title
[MATLAB] Store backing MATLAB array within C++ proxy objects instead of MATLAB classes
[MATLAB] Subclass Jul 12, 2023
arrow::Buffer
to keep MATLAB data backing arrow::Arrays
alive
kevingurney
pushed a commit
that referenced
this issue
Jul 12, 2023
… arrow::Arrays alive (#36615) ### Rationale for this change When building `arrow.array.<Numeric>Arrays` from native MATLAB arrays, we avoid copying by 1. Wrapping the MATLAB data inside a non-owning `arrow::Buffer` 2. Constructing an `arrow::ArrayData` object from the `arrow::Buffer` 3. Constructing the `arrow::Array` from the `arrow::Data` object. Because the `Array`'s underlying `Buffer` does not have ownership of its backing data, we have been storing the original MATLAB array as a property called `MatlabArray` on the MATLAB `arrow.array.NumericArray` class. This solution is not ideal because the backing MATLAB array is kept separate from the actual `std::shared_ptr<arrow::Array>`, which is stored within the C++ proxy objects (e.g. `arrow::matlab::array::proxy::NumericArray<float64>`). A better solution would be to create a new subclass of `arrow::Buffer` called `MatlabBuffer`, which will keep the original MATLAB array alive by taking it in as an input parameter and storing it as a member variable. ### What changes are included in this PR? 1. Removed the `MatlabArray` property from the MATLAB class `matlab.io.NumericArray` 2. Added a private member variable called `mda_array` to `arrow::matlab::array::proxy::NumericArray<CType>` and `arrow::matlab::array::proxy::TimestampArray`. `mda_array` is a `matlab::data::Array` object. This member variable stores the MATLAB array that owns the memory the `arrow::Array` objects are backed by. ### Are these changes tested? Existing tests used. ### Are there any user-facing changes? No. * Closes: #36614 Authored-by: Sarah Gilmore <sgilmore@mathworks.com> Signed-off-by: Kevin Gurney <kgurney@mathworks.com>
chelseajonesr
pushed a commit
to chelseajonesr/arrow
that referenced
this issue
Jul 20, 2023
…acking arrow::Arrays alive (apache#36615) ### Rationale for this change When building `arrow.array.<Numeric>Arrays` from native MATLAB arrays, we avoid copying by 1. Wrapping the MATLAB data inside a non-owning `arrow::Buffer` 2. Constructing an `arrow::ArrayData` object from the `arrow::Buffer` 3. Constructing the `arrow::Array` from the `arrow::Data` object. Because the `Array`'s underlying `Buffer` does not have ownership of its backing data, we have been storing the original MATLAB array as a property called `MatlabArray` on the MATLAB `arrow.array.NumericArray` class. This solution is not ideal because the backing MATLAB array is kept separate from the actual `std::shared_ptr<arrow::Array>`, which is stored within the C++ proxy objects (e.g. `arrow::matlab::array::proxy::NumericArray<float64>`). A better solution would be to create a new subclass of `arrow::Buffer` called `MatlabBuffer`, which will keep the original MATLAB array alive by taking it in as an input parameter and storing it as a member variable. ### What changes are included in this PR? 1. Removed the `MatlabArray` property from the MATLAB class `matlab.io.NumericArray` 2. Added a private member variable called `mda_array` to `arrow::matlab::array::proxy::NumericArray<CType>` and `arrow::matlab::array::proxy::TimestampArray`. `mda_array` is a `matlab::data::Array` object. This member variable stores the MATLAB array that owns the memory the `arrow::Array` objects are backed by. ### Are these changes tested? Existing tests used. ### Are there any user-facing changes? No. * Closes: apache#36614 Authored-by: Sarah Gilmore <sgilmore@mathworks.com> Signed-off-by: Kevin Gurney <kgurney@mathworks.com>
R-JunmingChen
pushed a commit
to R-JunmingChen/arrow
that referenced
this issue
Aug 20, 2023
…acking arrow::Arrays alive (apache#36615) ### Rationale for this change When building `arrow.array.<Numeric>Arrays` from native MATLAB arrays, we avoid copying by 1. Wrapping the MATLAB data inside a non-owning `arrow::Buffer` 2. Constructing an `arrow::ArrayData` object from the `arrow::Buffer` 3. Constructing the `arrow::Array` from the `arrow::Data` object. Because the `Array`'s underlying `Buffer` does not have ownership of its backing data, we have been storing the original MATLAB array as a property called `MatlabArray` on the MATLAB `arrow.array.NumericArray` class. This solution is not ideal because the backing MATLAB array is kept separate from the actual `std::shared_ptr<arrow::Array>`, which is stored within the C++ proxy objects (e.g. `arrow::matlab::array::proxy::NumericArray<float64>`). A better solution would be to create a new subclass of `arrow::Buffer` called `MatlabBuffer`, which will keep the original MATLAB array alive by taking it in as an input parameter and storing it as a member variable. ### What changes are included in this PR? 1. Removed the `MatlabArray` property from the MATLAB class `matlab.io.NumericArray` 2. Added a private member variable called `mda_array` to `arrow::matlab::array::proxy::NumericArray<CType>` and `arrow::matlab::array::proxy::TimestampArray`. `mda_array` is a `matlab::data::Array` object. This member variable stores the MATLAB array that owns the memory the `arrow::Array` objects are backed by. ### Are these changes tested? Existing tests used. ### Are there any user-facing changes? No. * Closes: apache#36614 Authored-by: Sarah Gilmore <sgilmore@mathworks.com> Signed-off-by: Kevin Gurney <kgurney@mathworks.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Describe the enhancement requested
When building
arrow.array.<Numeric>Arrays
from native MATLAB arrays, we avoid copying byarrow::Buffer
arrow::ArrayData
object from thearrow::Buffer
arrow::Array
from thearrow::Data
object.Because the array's buffer does not own its backing data, we have been storing the original MATLAB array as a property called
MatlabArray
on the MATLABarrow.array.<Numeric>Arrays
classes.This solution is not ideal because the backing MATLAB array is kept separate from the actual
std::shared_ptr<arrow::Array>
, which is stored within the C++ proxy objects (e.g.arrow::matlab::array::proxy::NumericArray<float64>
). A better solution would be to create a new subclass ofarrow::Buffer
calledMatlabBuffer
.MatlabBuffer
will keep the original MATLAB array in memory.Component(s)
MATLAB
The text was updated successfully, but these errors were encountered: