Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add compute kernels that operate on ArrayRef -- e.g eq_dyn #843

Closed
alamb opened this issue Oct 21, 2021 · 2 comments · Fixed by #858
Closed

add compute kernels that operate on ArrayRef -- e.g eq_dyn #843

alamb opened this issue Oct 21, 2021 · 2 comments · Fixed by #858
Labels
arrow Changes to the arrow crate enhancement Any new improvement worthy of a entry in the changelog

Comments

@alamb
Copy link
Contributor

alamb commented Oct 21, 2021

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
I want to compare two ArrayRefs (Arc<Array>) with each other

Arrow offers compute kernels for primitive (numeric) types(e.g. eq), and separate ones for string types (e.g. eq_utf8).
https://docs.rs/arrow/6.0.0/arrow/compute/kernels/comparison/index.html

This means I have to match on the datatype and then dispatch to the appropriate implementation

You can see a bunch of this kind of dispatch (encoded in macros) in Datafusion, for example https://github.com/apache/arrow-datafusion/blob/81fae230b81b97e93bbb95b284cfda6f4d59552e/datafusion/src/physical_plan/expressions/binary.rs#L289-L328

Describe the solution you'd like
Add functions like OP_dyn that did the type dispatch within arrow

So for example, a function like this:

/// Compare the left and right arrays for equality. Returns an error if they are not the exact same type
fn eq_dyn(left: &dyn Array, right: &dyn Array) -> result<BooleanArray> {
  // switch on types and call appropriate eq kernel
...
}

and neq_dyn, lt_dyn, etc.

Describe alternatives you've considered
Leave it as is

Additional context
See also apache/datafusion#1163 and #842

@alamb alamb added arrow Changes to the arrow crate enhancement Any new improvement worthy of a entry in the changelog labels Oct 21, 2021
@sum12
Copy link
Contributor

sum12 commented Oct 23, 2021

I guess the more the better, but this is the complete list ?

@alamb
Copy link
Contributor Author

alamb commented Oct 24, 2021

I guess the more the better, but this is the complete list ?

I was thinking more like the kernels in arrow itself:
https://docs.rs/arrow/6.0.0/arrow/compute/kernels/index.html

So definitely for the comparison kernels listed
https://docs.rs/arrow/6.0.0/arrow/compute/kernels/comparison/index.html

But also it may make sense for the aggregate kernels https://docs.rs/arrow/6.0.0/arrow/compute/kernels/comparison/index.html

I took a quick look through the rest of the kernels and most of them either already operate on dyn Array (e.g. cast) or they only logically make sense on certain types of arrays (e.g. the Boolean kernels). Though even for boolean kernels it might be convenient to offer a function that does the downcasting appropriately

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate enhancement Any new improvement worthy of a entry in the changelog
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants