Minor: Improve docs for arrow-ipc, remove clippy ignore #3421

Merged: 4 commits, Jan 1, 2023
7 changes: 3 additions & 4 deletions arrow-ipc/src/lib.rs
@@ -15,10 +15,9 @@
// specific language governing permissions and limitations
// under the License.

//! Support for the Arrow IPC format

// TODO: (vcq): Protobuf codegen is not generating Debug impls.
#![allow(missing_debug_implementations)]
//! Support for the [Arrow IPC Format]
//!
//! [Arrow IPC Format]: https://arrow.apache.org/docs/format/Columnar.html#serialization-and-interprocess-communication-ipc

pub mod convert;
pub mod reader;
32 changes: 32 additions & 0 deletions arrow-ipc/src/writer.rs
@@ -137,6 +137,38 @@ impl Default for IpcWriteOptions {
}

#[derive(Debug, Default)]
/// Handles low level details of encoding [`Array`] and [`Schema`] into the
/// [Arrow IPC Format].
///
/// # Example:
/// ```
/// # fn run() {
/// # use std::sync::Arc;
/// # use arrow_array::UInt64Array;
/// # use arrow_array::RecordBatch;
/// # use arrow_ipc::writer::{DictionaryTracker, IpcDataGenerator, IpcWriteOptions};
///
/// // Create a record batch
/// let batch = RecordBatch::try_from_iter(vec![
///     ("col2", Arc::new(UInt64Array::from_iter([10, 23, 33])) as _)
/// ]).unwrap();
///
/// // Error if dictionary ids are replaced.
/// let error_on_replacement = true;
/// let options = IpcWriteOptions::default();
Contributor Author

This interface is somewhat unfortunate: `IpcDataGenerator` holds no state, and the state is passed in via `DictionaryTracker`.

While implementing #3389, I hope to provide something that handles all the encoding state, so this interface will remain low level.

Contributor

I wonder if the state was at one point on IpcDataGenerator and it got factored out for some reason 🤔

Contributor Author

🤷

/// let mut dictionary_tracker = DictionaryTracker::new(error_on_replacement);
///
/// // encode the batch into zero or more encoded dictionaries
/// // and the data for the actual array.
/// let data_gen = IpcDataGenerator {};
/// let (encoded_dictionaries, encoded_message) = data_gen
///     .encoded_batch(&batch, &mut dictionary_tracker, &options)
///     .unwrap();
/// # }
/// ```
///
/// [Arrow IPC Format]: https://arrow.apache.org/docs/format/Columnar.html#serialization-and-interprocess-communication-ipc

pub struct IpcDataGenerator {}

impl IpcDataGenerator {
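The review thread on `writer.rs` observes that `IpcDataGenerator` carries no state and that encoding state is passed in through `DictionaryTracker`. A minimal, std-only sketch of that split may make the shape of the interface easier to see. `ToyTracker`, `ToyGenerator`, `encode`, and `demo` are hypothetical names invented for illustration; this is not the arrow-ipc implementation, and the replacement rule shown is an assumption modeled on the `error_on_replacement` flag in the doc example.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a dictionary tracker: it owns ALL encoding state.
#[derive(Debug)]
pub struct ToyTracker {
    seen: HashMap<i64, Vec<u8>>,
    error_on_replacement: bool,
}

impl ToyTracker {
    pub fn new(error_on_replacement: bool) -> Self {
        Self { seen: HashMap::new(), error_on_replacement }
    }

    /// Returns Ok(true) if the dictionary must be emitted, Ok(false) if an
    /// identical dictionary was already recorded under `id`.
    /// Assumption: a changed dictionary is an error only when
    /// `error_on_replacement` is set; otherwise it is re-emitted.
    pub fn insert(&mut self, id: i64, values: &[u8]) -> Result<bool, String> {
        if let Some(previous) = self.seen.get(&id) {
            if previous.as_slice() == values {
                return Ok(false);
            }
            if self.error_on_replacement {
                return Err(format!("dictionary {} was replaced", id));
            }
        }
        self.seen.insert(id, values.to_vec());
        Ok(true)
    }
}

/// Hypothetical stand-in for the data generator: deliberately stateless.
#[derive(Debug, Default)]
pub struct ToyGenerator {}

impl ToyGenerator {
    /// Returns (dictionaries to emit, encoded payload). Every piece of
    /// state lives in the caller-supplied tracker, not in `self`.
    pub fn encode(
        &self,
        dict_id: i64,
        dict_values: &[u8],
        payload: &[u8],
        tracker: &mut ToyTracker,
    ) -> Result<(Vec<Vec<u8>>, Vec<u8>), String> {
        let mut dictionaries = Vec::new();
        if tracker.insert(dict_id, dict_values)? {
            dictionaries.push(dict_values.to_vec());
        }
        Ok((dictionaries, payload.to_vec()))
    }
}

/// Encodes two batches that share one dictionary and reports how many
/// dictionaries each call emitted.
pub fn demo() -> Result<Vec<usize>, String> {
    let generator = ToyGenerator::default();
    let mut tracker = ToyTracker::new(true);
    let (first, _) = generator.encode(0, b"abc", b"batch-1", &mut tracker)?;
    let (second, _) = generator.encode(0, b"abc", b"batch-2", &mut tracker)?;
    Ok(vec![first.len(), second.len()])
}
```

In this sketch the caller must thread the same tracker through every call, which is the awkwardness the thread describes: forgetting to reuse it would silently re-emit dictionaries. A higher-level wrapper that owns both pieces would hide that detail while leaving the low-level interface unchanged.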