GH-41656: [MATLAB] Add C Data Interface format import/export functionality for arrow.array.Array
#41737
Just a few comments. But looks good to me!
+1
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
+1
After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit 5809daf. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 3 possible false positives for unstable benchmarks that are known to sometimes produce them.
…nctionality for `arrow.array.Array` (apache#41737)

### Rationale for this change

Now that apache#41653 and apache#41654 have been addressed, we should add MATLAB APIs for importing/exporting `arrow.array.Array` objects using the C Data Interface format.

This pull request adds two new APIs for importing and exporting `arrow.array.Array` objects using the C Data Interface format.

#### Example

```matlab
>> expected = arrow.array([1, 2, 3])

expected =

  Float64Array with 3 elements and 0 null values:

    1 | 2 | 3

>> cArray = arrow.c.Array()

cArray =

  Array with properties:

    Address: 140341875084944

>> cSchema = arrow.c.Schema()

cSchema =

  Schema with properties:

    Address: 140341880022320

% Export the Array to C Data Interface Format
>> expected.export(cArray.Address, cSchema.Address)

% Import the Array from C Data Interface Format
>> actual = arrow.array.Array.import(cArray, cSchema)

actual =

  Float64Array with 3 elements and 0 null values:

    1 | 2 | 3

% The Array is the same after round-tripping to C Data Interface format
>> isequal(actual, expected)

ans =

  logical

   1
```

### What changes are included in this PR?

1. Added new `arrow.array.Array.export(cArrowArrayAddress, cArrowSchemaAddress)` method for exporting `Array` objects to C Data Interface format.
2. Added new static `arrow.array.Array.import(cArray, cSchema)` method for importing `Array`s from C Data Interface format.
3. Added new internal `arrow.c.internal.ArrayImporter` class for importing `Array` objects from C Data Interface format.

### Are these changes tested?

Yes.

1. Added new test file `matlab/test/arrow/c/tRoundTrip.m` with basic round-trip tests for importing/exporting `Array` objects using the C Data Interface format.

### Are there any user-facing changes?

Yes.

1. There are now two new user-facing APIs added to the `arrow.array.Array` class. These are `arrow.array.Array.export(cArrowArrayAddress, cArrowSchemaAddress)` and `arrow.array.Array.import(cArray, cSchema)`. These APIs can be used to import/export `Array` objects using the C Data Interface format.

### Future Directions

1. Add integration tests for sharing data between MATLAB/mlarrow and Python/pyarrow running in the same process using the [MATLAB interface to Python](https://www.mathworks.com/help/matlab/call-python-libraries.html).
2. Add support for exporting/importing `arrow.tabular.RecordBatch` objects using the C Data Interface format.
3. Add support for the Arrow [C stream interface format](https://arrow.apache.org/docs/format/CStreamInterface.html).

### Notes

1. Thanks @sgilmore10 for your help with this pull request!

* GitHub Issue: apache#41656

Lead-authored-by: Kevin Gurney <kgurney@mathworks.com>
Co-authored-by: Kevin Gurney <kevin.p.gurney@gmail.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Signed-off-by: Kevin Gurney <kgurney@mathworks.com>
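
For readers wondering what sits behind the `Address` values in the example above, here is a minimal C sketch of a round trip through the C Data Interface. The two struct layouts follow the Arrow C Data Interface specification. Everything else is an assumption made for illustration: `export_float64` and `consume_sum` are hypothetical helpers, not part of the MATLAB interface or of this pull request, and the sketch only handles a `float64` array with no nulls. It also assumes that `Address` refers to the memory location of such a struct, which the PR text implies but does not spell out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Struct layouts as published in the Arrow C Data Interface specification. */
struct ArrowSchema {
  const char* format;
  const char* name;
  const char* metadata;
  int64_t flags;
  int64_t n_children;
  struct ArrowSchema** children;
  struct ArrowSchema* dictionary;
  void (*release)(struct ArrowSchema*);
  void* private_data;
};

struct ArrowArray {
  int64_t length;
  int64_t null_count;
  int64_t offset;
  int64_t n_buffers;
  int64_t n_children;
  const void** buffers;
  struct ArrowArray** children;
  struct ArrowArray* dictionary;
  void (*release)(struct ArrowArray*);
  void* private_data;
};

/* Producer-side release callbacks: free what the producer allocated,
 * then mark the struct as released by setting `release` to NULL. */
static void release_schema(struct ArrowSchema* schema) {
  schema->release = NULL; /* format and name are string literals */
}

static void release_array(struct ArrowArray* array) {
  free((void*)array->buffers[1]); /* data buffer */
  free((void*)array->buffers);    /* buffer pointer table */
  array->release = NULL;
}

/* Hypothetical producer: copy `n` doubles into caller-provided structs.
 * Returns 0 on success, -1 on allocation failure. */
int export_float64(const double* values, int64_t n,
                   struct ArrowArray* out_array,
                   struct ArrowSchema* out_schema) {
  double* data = malloc((size_t)(n > 0 ? n : 1) * sizeof(double));
  const void** buffers = malloc(2 * sizeof(void*));
  if (data == NULL || buffers == NULL) {
    free(data);
    free((void*)buffers);
    return -1;
  }
  if (n > 0) memcpy(data, values, (size_t)n * sizeof(double));
  buffers[0] = NULL; /* validity bitmap may be NULL when null_count is 0 */
  buffers[1] = data;

  *out_schema = (struct ArrowSchema){
      .format = "g", /* "g" is the format string for float64 */
      .name = "",
      .release = release_schema};
  *out_array = (struct ArrowArray){
      .length = n,
      .null_count = 0,
      .offset = 0,
      .n_buffers = 2,
      .buffers = buffers,
      .release = release_array};
  return 0;
}

/* Hypothetical consumer: read the values, then release both structs,
 * which is the consumer's responsibility once it is done with them. */
double consume_sum(struct ArrowArray* array, struct ArrowSchema* schema) {
  assert(strcmp(schema->format, "g") == 0);
  const double* data = (const double*)array->buffers[1];
  double total = 0.0;
  for (int64_t i = 0; i < array->length; ++i) {
    total += data[array->offset + i];
  }
  array->release(array);
  schema->release(schema);
  return total;
}
```

A caller would declare a `struct ArrowArray` and a `struct ArrowSchema`, pass their addresses to `export_float64`, and hand the same addresses to `consume_sum`. In this sketch those two addresses play the role that `cArray.Address` and `cSchema.Address` play in the MATLAB example.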