-
Notifications
You must be signed in to change notification settings - Fork 738
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove automatic buffering in ipc::reader::FileReader
for for consistent buffering
#6132
Remove automatic buffering in ipc::reader::FileReader
for for consistent buffering
#6132
Conversation
This is a breaking change and should be labelled with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @V0ldek -- I think this is a nice improvement. I had some small comment suggestions, but we could potentially do them as follow on PRs
Current writer API automatically wraps the supplied std::io::Writer impl into a BufWriter. It is cleaner and more idiomatic to have the default be using the supplied impl directly, as the user might already have a BufWriter or an impl that doesn't actually benefit from buffering at all. StreamReader does a similar thing, but it also exposes a `try_new_unbuffered` that bypasses the internal wrap. Here we propose a consistent and non-buffered by default API: - `try_new` does not wrap the passed reader/writer, - `try_new_buffered` is a convenience function that does wrap the reader/writer into a BufReader/BufWriter, - all four publicly exposed IPC reader/writers follow the above consistently, i.e. `StreamReader`, `FileReader`, `StreamWriter`, `FileWriter`. Those are breaking changes. An additional tweak: removed the generic type bounds from struct definitions on the four types, as that is the idiomatic Rust approach (see e.g. stdlib's HashMap that has no bounds on the struct definition, only the impl requires Hash + Eq). See apache#6099 for the discussion.
Applied a few suggestions, made `Error` sections more consistent.
f907e78
to
c4f08a9
Compare
ipc::reader::FileReader
for for consistent buffering
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @V0ldek -- I think this is a very nice improvement
Which issue does this PR close?
Closes #6099.
Rationale for this change
Current writer API automatically wraps the supplied
std::io::Writer
impl into aBufWriter
.It is cleaner and more idiomatic to have the default be using the supplied impl directly, as the user might already have a
BufWriter
or an impl that doesn't actually benefit from buffering at all.StreamReader does a similar thing, but it also exposes a
try_new_unbuffered
that bypasses the internal wrap. These APIs are incosistent between reader/writer and also betweenStream
andFile
versions, but there's no strong reason for that.What changes are included in this PR?
Here we propose a consistent and non-buffered by default API:
try_new
does not wrap the passed reader/writer,try_new_buffered
is a convenience function that does wrap the reader/writer into a BufReader/BufWriter,StreamReader
,FileReader
,StreamWriter
,FileWriter
.An additional tweak: removed the generic type bounds from struct definitions on the four types, as that is the idiomatic Rust approach (see e.g. stdlib's
HashMap
that has no bounds on the struct definition, only the impl requiresHash + Eq
).Are there any user-facing changes?
arrow_ipc::reader::StreamReader<R>::try_new_unbuffered
is deprecated to be removed in a future version;arrow_ipc::reader::StreamReader<std::io::BufReader<R>>::try_new
is renamed totry_new_buffered
;arrow_ipc::reader::FileReader<std::io:::BufReader<R>>::try_new_buffered
is added;arrow_ipc::writer::StreamWriter<std::io::BufWriter<W>>::try_new_buffered
is added;arrow_ipc::writer::FileWriter<std::io::BufWriter<W>>::try_new_buffered
is added.The fact that
try_new
is now unbuffered is a silent change from the API perspective. A number of rustdocs explicitly referenced the internal buffering and have been reworded accordingly.