Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MutableBuffer::typed_data - shared ref access to the typed slice #2652

Merged
merged 1 commit into from
Sep 5, 2022

Conversation

medwards
Copy link
Contributor

@medwards medwards commented Sep 5, 2022

Which issue does this PR close?

Didn't open an issue, let me know if you still need it for the changelog

Rationale for this change

It's possible to run into ownership difficulties if you want to iterate over a mutable buffers typed data because it only offers a mutable slice which forces exclusive ownership. Provide an immutable slice that can be used as a shared reference.

What changes are included in this PR?

Simple copy-pasta of Buffer::typed_data into MutableBuffer and updates to docstrings to match.

Are there any user-facing changes?

New method in MutableBuffer. Not a breaking change.

@github-actions github-actions bot added the arrow Changes to the arrow crate label Sep 5, 2022
Copy link
Contributor

@tustvold tustvold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Btw I would be interested in how you're using MutableBuffer, it is quite a low-level interface and makes me wonder if we're missing a higher level API somewhere?

@medwards
Copy link
Contributor Author

medwards commented Sep 5, 2022

In our case we're constructing a dictionary (UTF8) column by hand because we can parallelize it when we know it is unique, but they have to be mutable buffers because we're not sure how big the chunks might be, so they need to be growable. Later we can just concatenate the values, but the offsets need to be read out and corrected for merging (this is when we want MutableBuffer::typed_data).

@tustvold
Copy link
Contributor

tustvold commented Sep 5, 2022

but the offsets need to be read out and corrected for merging

In the short term until this is released you could potentially consider converting to Buffer at that point FWIW which has a typed_data method.

@tustvold tustvold merged commit 43d8474 into apache:master Sep 5, 2022
@ursabot
Copy link

ursabot commented Sep 5, 2022

Benchmark runs are scheduled for baseline = dc4ccf8 and contender = 43d8474. 43d8474 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@medwards
Copy link
Contributor Author

medwards commented Sep 5, 2022

I tried that and there were no trivial conversions that didn't take ownership of the buffer (which then runs me into the same problem I was having /w typed_data_mut). The ownership in my case is complicated because all the buffers are in a collection. There may be a smarter way of doing this, for example draining the collection or something else but I was reaching for typed_data /wo thinking about it and was surprised it wasn't paired here since other collection apis tend to bundle get + get_mut type functionality.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants