Support buffered writing with the Arrow API #428

Merged: 4 commits into G-Research:master on Mar 7, 2024

adamreeve
Contributor

This adds support for writing Arrow format data in buffered mode, so that multiple record batches can be written to the same Parquet row group.

This diverges slightly from the C++ API, where WriteTable starts a new row group and WriteRecordBatch is buffered. As we already have a WriteRecordBatch method that uses WriteTable internally, I've added a new WriteBufferedRecordBatch method for the buffered behaviour.
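
To illustrate the difference between the two write modes, here is a minimal toy model in Python. It is not the ParquetSharp or Arrow C++ API: the class `ToyParquetWriter`, its method names, and the `max_row_group_length` parameter are hypothetical stand-ins, and the model tracks only row counts per row group. It assumes that a buffered write appends to the currently open row group and rolls over to a new one once a maximum row group length is reached, and it ignores any chunking that an unbuffered write might apply.

```python
from typing import List


class ToyParquetWriter:
    """Toy model of row-group layout; stores only the row count of each row group."""

    def __init__(self, max_row_group_length: int = 1_000_000) -> None:
        self.max_row_group_length = max_row_group_length
        self.row_groups: List[int] = []
        self._buffered_open = False  # True while a buffered row group can accept rows

    def write_record_batch(self, num_rows: int) -> None:
        # Unbuffered (assumed WriteTable-like): each call starts a new row group.
        self.row_groups.append(num_rows)
        self._buffered_open = False

    def write_buffered_record_batch(self, num_rows: int) -> None:
        # Buffered: append to the open row group, starting a new one only
        # when none is open or the current one is full.
        remaining = num_rows
        while remaining > 0:
            full = self._buffered_open and self.row_groups[-1] >= self.max_row_group_length
            if not self._buffered_open or full:
                self.row_groups.append(0)
                self._buffered_open = True
            space = self.max_row_group_length - self.row_groups[-1]
            take = min(space, remaining)
            self.row_groups[-1] += take
            remaining -= take


# Three batches of 100 rows, written in each mode.
unbuffered = ToyParquetWriter()
buffered = ToyParquetWriter()
capped = ToyParquetWriter(max_row_group_length=250)
for _ in range(3):
    unbuffered.write_record_batch(100)
    buffered.write_buffered_record_batch(100)
    capped.write_buffered_record_batch(100)

print(unbuffered.row_groups)  # one row group per batch
print(buffered.row_groups)    # all batches share one row group
print(capped.row_groups)      # rolls over once the cap is reached
```

In this model, unbuffered writes of many small batches produce many small row groups, which is the situation the buffered method is meant to avoid.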

adamreeve merged commit 010bdde into G-Research:master on Mar 7, 2024
33 checks passed
adamreeve deleted the arrow_buffered_write branch on March 7, 2024 18:07