Limit ArrowWriter Row Group Size by bytes in addition to rows #1213
Labels: enhancement, parquet
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Currently ArrowWriter uses max_row_group_size as a row count limit. Whilst this is significantly simpler to implement, it is at odds with other arrow implementations that use a bytes threshold.
Describe the solution you'd like
Any or all of:
- Document what max_row_group_size is used for and how it is different from the other size quantities in WriterProperties
- Reconsider whether the DEFAULT_MAX_ROW_GROUP_SIZE of 128 * 1024 * 1024 makes sense given this is not bytes
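To make the row-count versus byte-threshold distinction concrete, here is a rough, self-contained sketch of what a combined limit could look like. Everything in it (`RowGroupLimits`, `max_rows`, `max_bytes`, `should_flush`, `plain_column_bytes`) is hypothetical and chosen only for illustration; none of it is the parquet crate's actual API, and a real writer would need its own way to estimate buffered bytes.

```rust
/// Hypothetical limits for a single row group. Illustration only; this is
/// not a type from the parquet crate.
#[derive(Debug, Clone, Copy)]
pub struct RowGroupLimits {
    /// Maximum number of rows per row group (the meaning that
    /// max_row_group_size has today according to this issue).
    pub max_rows: usize,
    /// Optional cap on buffered bytes. `None` gives row-count-only behaviour.
    pub max_bytes: Option<usize>,
}

impl RowGroupLimits {
    /// Returns true once either the row limit or the byte limit is reached.
    pub fn should_flush(&self, buffered_rows: usize, buffered_bytes: usize) -> bool {
        let rows_hit = buffered_rows >= self.max_rows;
        let bytes_hit = self
            .max_bytes
            .map_or(false, |cap| buffered_bytes >= cap);
        rows_hit || bytes_hit
    }
}

/// Size in bytes of `rows` fixed-width values for one column, before any
/// encoding or compression is applied.
pub fn plain_column_bytes(rows: u64, bytes_per_value: u64) -> u64 {
    rows * bytes_per_value
}
```

Under these assumptions, a limit of 128 * 1024 * 1024 rows with no byte cap would not flush a buffer of 1,000,000 rows, whereas adding a byte cap of 4 MiB would flush it once 8,000,000 bytes are buffered. As plain arithmetic, `plain_column_bytes(128 * 1024 * 1024, 8)` is 1 GiB for a single column of 8-byte values, which is why the default reads more like a byte count than a row count.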