apacheGH-36652: [MATLAB] Initialize the Type property of `arrow.array.Array` subclasses from existing proxy ids (apache#36731)

### Rationale for this change

Now that issue apache#36363 has been closed via PR apache#36419, we can initialize the `Type` property of `arrow.array.Array` subclasses from existing proxy IDs. Currently, we create a new proxy `Type` object whose underlying `arrow::DataType` is semantically equal to, but not the same object as, the `arrow::DataType` owned by the `Array` proxy. It would be preferable for the `Type` and `Array` proxy classes to refer to the same `arrow::DataType` object (i.e. the same object on the heap).
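
To illustrate the distinction between "semantically equal" and "the same object on the heap", here is a minimal C++ sketch. The `DataType` struct and the helper functions are hypothetical stand-ins written for this note; they are not the real `arrow::DataType` API.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for arrow::DataType (not the real Arrow class).
struct DataType {
    std::string name;
};

// Value equality: two objects describe the same type.
inline bool semantically_equal(const DataType& a, const DataType& b) {
    return a.name == b.name;
}

// Identity: both pointers refer to the same heap object.
inline bool same_object(const std::shared_ptr<DataType>& a,
                        const std::shared_ptr<DataType>& b) {
    return a.get() == b.get();
}

// Roughly the previous behavior: build a fresh object that is merely equal.
inline std::shared_ptr<DataType> make_equal_copy(const std::shared_ptr<DataType>& t) {
    return std::make_shared<DataType>(*t);
}

// Roughly the preferred behavior: share ownership of the existing object.
inline std::shared_ptr<DataType> share(const std::shared_ptr<DataType>& t) {
    return t;
}
```

With these helpers, `make_equal_copy(t)` yields a pointer for which `semantically_equal` holds but `same_object` does not, while `share(t)` satisfies both and only bumps the reference count.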

### What changes are included in this PR?

1. Upgraded `libmexclass` to commit [d04f88d](mathworks/libmexclass@d04f88d). In this commit, we added a static "make-like" function to `Proxy` called `create`.
2. Modified the constructors of all `Type` objects to expect a single `Proxy` object as input. This is a breaking change: clients are no longer expected to build `Type` objects via their constructors. Instead, we introduced standalone functions that clients can use to construct `Type` objects, such as `arrow.type.int8`, `arrow.type.string`, and `arrow.type.timestamp`. These functions create the `Proxy` objects to pass to the `Type` constructors. Below is an example of the new workflow for creating `Type` objects.

```matlab
>> timestampType = arrow.type.timestamp(TimeUnit="second", TimeZone="America/New_York")

timestampType = 

  TimestampType with properties:

    ID: Timestamp
```
NOTE: We plan on enhancing the display to show the `TimeUnit` and `TimeZone` properties. 

3. Made `Type` a [dependent](https://www.mathworks.com/help/matlab/matlab_oop/access-methods-for-dependent-properties.html) property on `arrow.array.Array`. The `get.Type` method constructs a `Type` object on demand by making a proxy that wraps the same `arrow::DataType` object stored within the `arrow::Array`.
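
The on-demand construction described in item 3 can be sketched in C++ roughly as follows. This is a simplified illustration, not the code in this PR: `DataType`, `Array`, `TypeProxy`, `ProxyManager`, and `type_proxy_id` are all hypothetical stand-ins, and the real libmexclass `ProxyManager` has a different interface.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <utility>

// Hypothetical stand-ins; not the real Arrow or libmexclass classes.
struct DataType {
    int id;
};

struct Array {
    std::shared_ptr<DataType> type;
};

// Wraps an existing DataType without copying it.
class TypeProxy {
public:
    explicit TypeProxy(std::shared_ptr<DataType> type) : type_(std::move(type)) {}
    const std::shared_ptr<DataType>& unwrap() const { return type_; }

private:
    std::shared_ptr<DataType> type_;
};

// Toy registry that hands out an integer ID for each managed proxy.
class ProxyManager {
public:
    std::uint64_t manage(std::shared_ptr<TypeProxy> proxy) {
        const std::uint64_t id = next_id_++;
        proxies_.emplace(id, std::move(proxy));
        return id;
    }

    // Returns nullptr when the ID is unknown.
    std::shared_ptr<TypeProxy> get(std::uint64_t id) const {
        auto it = proxies_.find(id);
        if (it == proxies_.end()) {
            return nullptr;
        }
        return it->second;
    }

private:
    std::uint64_t next_id_ = 0;
    std::unordered_map<std::uint64_t, std::shared_ptr<TypeProxy>> proxies_;
};

// Builds a new proxy on every call; each proxy shares the array's DataType.
inline std::uint64_t type_proxy_id(const Array& array, ProxyManager& manager) {
    return manager.manage(std::make_shared<TypeProxy>(array.type));
}
```

In this sketch, every call to `type_proxy_id` registers a new proxy under a new ID, but all of those proxies unwrap to the same `DataType` object that the array owns.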

### Are these changes tested?

Yes, updated existing tests.

### Are there any user-facing changes?

Yes, we added new standalone functions for creating `Type` objects. The table below maps each standalone function to the `Type` object it returns:

| Standalone Function | Output Type Object |
|----------------------|---------------------|
|`arrow.type.boolean`| `arrow.type.BooleanType`|
|`arrow.type.int8`| `arrow.type.Int8Type`|
|`arrow.type.int16`| `arrow.type.Int16Type`|
|`arrow.type.int32`| `arrow.type.Int32Type`|
|`arrow.type.int64`| `arrow.type.Int64Type`|
|`arrow.type.uint8`| `arrow.type.UInt8Type`|
|`arrow.type.uint16`| `arrow.type.UInt16Type`|
|`arrow.type.uint32`| `arrow.type.UInt32Type`|
|`arrow.type.uint64`| `arrow.type.UInt64Type`|
|`arrow.type.string`| `arrow.type.StringType`|
|`arrow.type.timestamp`| `arrow.type.TimestampType`|

### Notes

Thanks @ kevingurney for the advice!
* Closes: apache#36652

Authored-by: Sarah Gilmore <sgilmore@mathworks.com>
Signed-off-by: Kevin Gurney <kgurney@mathworks.com>
sgilmore10 authored and R-JunmingChen committed Aug 20, 2023
1 parent ab1ddac commit 4c8f9be
Showing 80 changed files with 560 additions and 141 deletions.
21 changes: 20 additions & 1 deletion matlab/src/cpp/arrow/matlab/array/proxy/array.cc
@@ -20,6 +20,10 @@
#include "arrow/matlab/array/proxy/array.h"
#include "arrow/matlab/bit/unpack.h"
#include "arrow/matlab/error/error.h"
#include "arrow/type_traits.h"
#include "arrow/visit_array_inline.h"

#include "libmexclass/proxy/ProxyManager.h"

namespace arrow::matlab::array::proxy {

@@ -30,6 +34,7 @@ namespace arrow::matlab::array::proxy {
REGISTER_METHOD(Array, toMATLAB);
REGISTER_METHOD(Array, length);
REGISTER_METHOD(Array, valid);
REGISTER_METHOD(Array, type);
}

std::shared_ptr<arrow::Array> Array::getArray() {
@@ -69,4 +74,18 @@ namespace arrow::matlab::array::proxy {
auto valid_elements_mda = bit::unpack(validity_bitmap, array_length);
context.outputs[0] = valid_elements_mda;
}
}

void Array::type(libmexclass::proxy::method::Context& context) {
namespace mda = ::matlab::data;

mda::ArrayFactory factory;

auto type_proxy = typeProxy();
auto type_id = type_proxy->unwrap()->id();
auto proxy_id = libmexclass::proxy::ProxyManager::manageProxy(type_proxy);

context.outputs[0] = factory.createScalar(proxy_id);
context.outputs[1] = factory.createScalar(static_cast<int64_t>(type_id));

}
}
5 changes: 5 additions & 0 deletions matlab/src/cpp/arrow/matlab/array/proxy/array.h
@@ -18,6 +18,7 @@
#pragma once

#include "arrow/array.h"
#include "arrow/matlab/type/proxy/type.h"

#include "libmexclass/proxy/Proxy.h"

@@ -39,8 +40,12 @@ class Array : public libmexclass::proxy::Proxy {

void valid(libmexclass::proxy::method::Context& context);

void type(libmexclass::proxy::method::Context& context);

virtual void toMATLAB(libmexclass::proxy::method::Context& context) = 0;

virtual std::shared_ptr<type::proxy::Type> typeProxy() = 0;

std::shared_ptr<arrow::Array> array;
};

7 changes: 7 additions & 0 deletions matlab/src/cpp/arrow/matlab/array/proxy/boolean_array.cc
@@ -16,6 +16,7 @@
// under the License.

#include "arrow/matlab/array/proxy/boolean_array.h"
#include "arrow/matlab/type/proxy/primitive_ctype.h"

#include "arrow/matlab/error/error.h"
#include "arrow/matlab/bit/pack.h"
@@ -54,4 +55,10 @@ namespace arrow::matlab::array::proxy {
context.outputs[0] = logical_array_mda;
}

std::shared_ptr<type::proxy::Type> BooleanArray::typeProxy() {
using BooleanTypeProxy = type::proxy::PrimitiveCType<bool>;

auto type = std::static_pointer_cast<arrow::BooleanType>(array->type());
return std::make_shared<BooleanTypeProxy>(std::move(type));
}
}
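
A note on the `typeProxy()` overrides in this commit: they rely on `std::static_pointer_cast`, which returns a `shared_ptr` that shares ownership with its argument instead of copying the pointee. The sketch below shows that behavior with hypothetical stand-in classes (`DataType`, `BooleanType`, and `as_boolean` are not the Arrow definitions).

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for arrow::DataType and arrow::BooleanType.
struct DataType {
    virtual ~DataType() = default;
};

struct BooleanType : DataType {};

// Downcast that shares ownership with the input pointer. As with static_cast,
// this is only valid when the pointee really is a BooleanType.
inline std::shared_ptr<BooleanType> as_boolean(const std::shared_ptr<DataType>& type) {
    return std::static_pointer_cast<BooleanType>(type);
}
```

Because the cast result points at the original object, a type proxy built from it refers to the same heap object that the array holds.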
2 changes: 2 additions & 0 deletions matlab/src/cpp/arrow/matlab/array/proxy/boolean_array.h
@@ -33,6 +33,8 @@ namespace arrow::matlab::array::proxy {
protected:
void toMATLAB(libmexclass::proxy::method::Context& context) override;

std::shared_ptr<type::proxy::Type> typeProxy() override;

};

}
9 changes: 9 additions & 0 deletions matlab/src/cpp/arrow/matlab/array/proxy/numeric_array.h
@@ -24,6 +24,8 @@
#include "arrow/type_traits.h"

#include "arrow/matlab/array/proxy/array.h"
#include "arrow/matlab/type/proxy/primitive_ctype.h"

#include "arrow/matlab/error/error.h"
#include "arrow/matlab/bit/pack.h"
#include "arrow/matlab/bit/unpack.h"
@@ -79,6 +81,13 @@ class NumericArray : public arrow::matlab::array::proxy::Array {
::matlab::data::TypedArray<CType> result = factory.createArray({num_elements, 1}, data_begin, data_end);
context.outputs[0] = result;
}

std::shared_ptr<type::proxy::Type> typeProxy() override {
using ArrowTypeProxy = type::proxy::PrimitiveCType<CType>;
auto type = std::static_pointer_cast<ArrowType>(array->type());
return std::make_shared<ArrowTypeProxy>(std::move(type));
}

};

}
8 changes: 8 additions & 0 deletions matlab/src/cpp/arrow/matlab/array/proxy/string_array.cc
@@ -16,6 +16,7 @@
// under the License.

#include "arrow/matlab/array/proxy/string_array.h"
#include "arrow/matlab/type/proxy/string_type.h"

#include "arrow/array/builder_binary.h"

@@ -81,4 +82,11 @@ namespace arrow::matlab::array::proxy {
context.outputs[0] = array_mda;
}

std::shared_ptr<type::proxy::Type> StringArray::typeProxy() {
using StringTypeProxy = type::proxy::StringType;

auto type = std::static_pointer_cast<arrow::StringType>(array->type());
return std::make_shared<StringTypeProxy>(std::move(type));
}

}
3 changes: 3 additions & 0 deletions matlab/src/cpp/arrow/matlab/array/proxy/string_array.h
@@ -33,6 +33,9 @@ namespace arrow::matlab::array::proxy {

protected:
void toMATLAB(libmexclass::proxy::method::Context& context) override;

std::shared_ptr<type::proxy::Type> typeProxy() override;

};

}
8 changes: 8 additions & 0 deletions matlab/src/cpp/arrow/matlab/array/proxy/timestamp_array.cc
@@ -16,6 +16,7 @@
// under the License.

#include "arrow/matlab/array/proxy/timestamp_array.h"
#include "arrow/matlab/type/proxy/timestamp_type.h"

#include "arrow/matlab/error/error.h"
#include "arrow/matlab/bit/pack.h"
@@ -88,4 +89,11 @@ namespace arrow::matlab::array::proxy {
mda::TypedArray<int64_t> result = factory.createArray({num_elements, 1}, data_begin, data_end);
context.outputs[0] = result;
}

std::shared_ptr<type::proxy::Type> TimestampArray::typeProxy() {
using TimestampProxyType = type::proxy::TimestampType;
auto type = std::static_pointer_cast<arrow::TimestampType>(array->type());
return std::make_shared<TimestampProxyType>(std::move(type));

}
}
3 changes: 3 additions & 0 deletions matlab/src/cpp/arrow/matlab/array/proxy/timestamp_array.h
@@ -35,6 +35,9 @@ class TimestampArray : public arrow::matlab::array::proxy::Array {

protected:
void toMATLAB(libmexclass::proxy::method::Context& context) override;

std::shared_ptr<type::proxy::Type> typeProxy() override;

};

}
9 changes: 8 additions & 1 deletion matlab/src/matlab/+arrow/+array/Array.m
@@ -26,7 +26,7 @@
Valid % Validity bitmap
end

properties(Abstract, SetAccess=private, GetAccess=public)
properties(Dependent, SetAccess=private, GetAccess=public)
Type(1, 1) arrow.type.Type
end

@@ -46,6 +46,13 @@
function matlabArray = toMATLAB(obj)
matlabArray = obj.Proxy.toMATLAB();
end

function type = get.Type(obj)
[proxyID, typeID] = obj.Proxy.type();
traits = arrow.type.traits.traits(arrow.type.ID(typeID));
proxy = libmexclass.proxy.Proxy(Name=traits.TypeProxyClassName, ID=proxyID);
type = traits.TypeConstructor(proxy);
end
end

methods (Access = private)
4 changes: 0 additions & 4 deletions matlab/src/matlab/+arrow/+array/BooleanArray.m
@@ -20,10 +20,6 @@
NullSubstitionValue = false;
end

properties(SetAccess=private, GetAccess=public)
Type = arrow.type.BooleanType
end

methods
function obj = BooleanArray(data, opts)
arguments
4 changes: 0 additions & 4 deletions matlab/src/matlab/+arrow/+array/Float32Array.m
@@ -20,10 +20,6 @@
NullSubstitutionValue = single(NaN);
end

properties(SetAccess=private, GetAccess=public)
Type = arrow.type.Float32Type
end

methods
function obj = Float32Array(data, varargin)
obj@arrow.array.NumericArray(data, "single", ...
4 changes: 0 additions & 4 deletions matlab/src/matlab/+arrow/+array/Float64Array.m
@@ -20,10 +20,6 @@
NullSubstitutionValue = NaN;
end

properties(SetAccess=private, GetAccess=public)
Type = arrow.type.Float64Type
end

methods
function obj = Float64Array(data, varargin)
obj@arrow.array.NumericArray(data, "double", ...
4 changes: 0 additions & 4 deletions matlab/src/matlab/+arrow/+array/Int16Array.m
@@ -20,10 +20,6 @@
NullSubstitutionValue = int16(0)
end

properties(SetAccess=private, GetAccess=public)
Type = arrow.type.Int16Type
end

methods
function obj = Int16Array(data, varargin)
obj@arrow.array.NumericArray(data, "int16", ...
4 changes: 0 additions & 4 deletions matlab/src/matlab/+arrow/+array/Int32Array.m
@@ -20,10 +20,6 @@
NullSubstitutionValue = int32(0)
end

properties(SetAccess=private, GetAccess=public)
Type = arrow.type.Int32Type
end

methods
function obj = Int32Array(data, varargin)
obj@arrow.array.NumericArray(data, "int32", ...
4 changes: 0 additions & 4 deletions matlab/src/matlab/+arrow/+array/Int64Array.m
@@ -20,10 +20,6 @@
NullSubstitutionValue = int64(0);
end

properties(SetAccess=private, GetAccess=public)
Type = arrow.type.Int64Type
end

methods
function obj = Int64Array(data, varargin)
obj@arrow.array.NumericArray(data, "int64", ...
4 changes: 0 additions & 4 deletions matlab/src/matlab/+arrow/+array/Int8Array.m
@@ -20,10 +20,6 @@
NullSubstitutionValue = int8(0);
end

properties(SetAccess=private, GetAccess=public)
Type = arrow.type.Int8Type
end

methods
function obj = Int8Array(data, varargin)
obj@arrow.array.NumericArray(data, "int8", ...
4 changes: 0 additions & 4 deletions matlab/src/matlab/+arrow/+array/StringArray.m
@@ -20,10 +20,6 @@
NullSubstitionValue = string(missing);
end

properties(SetAccess=private, GetAccess=public)
Type = arrow.type.StringType
end

methods
function obj = StringArray(data, opts)
arguments
5 changes: 0 additions & 5 deletions matlab/src/matlab/+arrow/+array/TimestampArray.m
@@ -20,10 +20,6 @@
NullSubstitutionValue = NaT;
end

properties(SetAccess=private, GetAccess=public)
Type = arrow.type.TimestampType % temporarily default value
end

methods
function obj = TimestampArray(data, opts)
arguments
@@ -39,7 +35,6 @@

args = struct(MatlabArray=ptime, Valid=validElements, TimeZone=timezone, TimeUnit=string(opts.TimeUnit));
obj@arrow.array.Array("Name", "arrow.array.proxy.TimestampArray", "ConstructorArguments", {args});
obj.Type = arrow.type.TimestampType(TimeUnit=opts.TimeUnit, TimeZone=timezone);
end

function dates = toMATLAB(obj)
4 changes: 0 additions & 4 deletions matlab/src/matlab/+arrow/+array/UInt16Array.m
@@ -20,10 +20,6 @@
NullSubstitutionValue = uint16(0)
end

properties(SetAccess=private, GetAccess=public)
Type = arrow.type.UInt16Type
end

methods
function obj = UInt16Array(data, varargin)
obj@arrow.array.NumericArray(data, "uint16", ...
4 changes: 0 additions & 4 deletions matlab/src/matlab/+arrow/+array/UInt32Array.m
@@ -20,10 +20,6 @@
NullSubstitutionValue = uint32(0)
end

properties(SetAccess=private, GetAccess=public)
Type = arrow.type.UInt32Type
end

methods
function obj = UInt32Array(data, varargin)
obj@arrow.array.NumericArray(data, "uint32", ...
4 changes: 0 additions & 4 deletions matlab/src/matlab/+arrow/+array/UInt64Array.m
@@ -20,10 +20,6 @@
NullSubstitutionValue = uint64(0)
end

properties(SetAccess=private, GetAccess=public)
Type = arrow.type.UInt64Type
end

methods
function obj = UInt64Array(data, varargin)
obj@arrow.array.NumericArray(data, "uint64", ...
4 changes: 0 additions & 4 deletions matlab/src/matlab/+arrow/+array/UInt8Array.m
@@ -20,10 +20,6 @@
NullSubstitutionValue = uint8(0)
end

properties(SetAccess=private, GetAccess=public)
Type = arrow.type.UInt8Type
end

methods
function obj = UInt8Array(data, varargin)
obj@arrow.array.NumericArray(data, "uint8", ...
25 changes: 25 additions & 0 deletions matlab/src/matlab/+arrow/+internal/+proxy/create.m
@@ -0,0 +1,25 @@
% Licensed to the Apache Software Foundation (ASF) under one or more
% contributor license agreements. See the NOTICE file distributed with
% this work for additional information regarding copyright ownership.
% The ASF licenses this file to you under the Apache License, Version
% 2.0 (the "License"); you may not use this file except in compliance
% with the License. You may obtain a copy of the License at
%
% http://www.apache.org/licenses/LICENSE-2.0
%
% Unless required by applicable law or agreed to in writing, software
% distributed under the License is distributed on an "AS IS" BASIS,
% WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
% implied. See the License for the specific language governing
% permissions and limitations under the License.

function proxy = create(name, args)
%CREATE Creates a proxy object.
arguments
name(1, 1) string {mustBeNonmissing}
end
arguments(Repeating)
args
end
proxy = libmexclass.proxy.Proxy.create(name, args{:});
end