Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove DictionaryArray::keys_array method #419

Merged

Conversation

jhorstmann
Copy link
Contributor

Which issue does this PR close?

Closes #391.

Rationale for this change

The keys_array can be a performance problem if used in inner loops and most usages can be directly replaced by the keys method which returns a reference without any cloning or allocations.

What changes are included in this PR?

Removed the method and replaced it's usages with the keys method.

Are there any user-facing changes?

Yes, this is a breaking change.

This is currently a draft because there were actually two usages in the cast kernels that could not be directly replaced. We do not seem to have an obvious way to get from &PrimitiveArray<T> to ArrayRef. This transformation would be much easier if the PrimitiveArray itself was cloneable.

@@ -1450,7 +1451,8 @@ where
cast_with_options(&dict_array.values(), to_type, cast_options)?;

// Note take requires first casting the indices to u32
let keys_array: ArrayRef = Arc::new(dict_array.keys_array());
let keys_array: ArrayRef =
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If primitive arrays were cloneable this would just be Arc::new(dict_array.keys().clone()).

@codecov-commenter
Copy link

Codecov Report

Merging #419 (a74034f) into master (ddc8a36) will decrease coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #419      +/-   ##
==========================================
- Coverage   82.62%   82.62%   -0.01%     
==========================================
  Files         162      162              
  Lines       44479    44469      -10     
==========================================
- Hits        36750    36741       -9     
+ Misses       7729     7728       -1     
Impacted Files Coverage Δ
arrow/src/array/array_dictionary.rs 84.56% <100.00%> (-0.32%) ⬇️
arrow/src/array/builder.rs 85.29% <100.00%> (ø)
arrow/src/array/ord.rs 62.00% <100.00%> (ø)
arrow/src/compute/kernels/cast.rs 94.53% <100.00%> (ø)
arrow/src/compute/kernels/sort.rs 95.12% <100.00%> (ø)
arrow/src/compute/kernels/take.rs 94.08% <100.00%> (ø)
arrow/src/util/display.rs 29.72% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ddc8a36...a74034f. Read the comment docs.

@alamb alamb added api-change Changes to the arrow API arrow Changes to the arrow crate labels Jun 7, 2021
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally think this looks like an improvement: makes it harder to mis-use the Arrow API 👍

@alamb
Copy link
Contributor

alamb commented Jun 8, 2021

@andygrove @Dandandan @jorgecarleitao @paddyhoran @tustvold @nevi-me any thoughts on this proposal?

@jhorstmann jhorstmann marked this pull request as ready for review June 9, 2021 07:13
@alamb
Copy link
Contributor

alamb commented Jun 9, 2021

I'll wait a few more days before merging this in to see if there is any more feedback.

As this is not compatible, I won't backport to the 4.x line

@alamb alamb merged commit f624153 into apache:master Jun 12, 2021
@alamb alamb added api-change Changes to the arrow API and removed api-change Changes to the arrow API labels Jul 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-change Changes to the arrow API arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove DictionaryArray::keys_array
5 participants