Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: add datatype new_list #4561

Merged
merged 2 commits into from
Aug 2, 2023
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 6 additions & 1 deletion arrow-schema/src/datatype.rs
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@
use std::fmt;
use std::sync::Arc;

use crate::{FieldRef, Fields, UnionFields};
use crate::{FieldRef, Fields, UnionFields, Field};

/// The set of datatypes that are supported by this implementation of Apache Arrow.
///
Expand Down Expand Up @@ -576,6 +576,11 @@ impl DataType {
_ => self == other,
}
}

/// Create a List DataType default name is "item"
pub fn new_list(data_type: DataType, nullable: bool) -> Self {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for consistency with Field::new_list we should take the item name as an argument. Thoughts @alamb ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally think we should use the standard / convention "item" always (not take a new parameter) to encourage people to follow the convention. If people want a different name, they can construct the DataType directly.

In terms of consistency with, https://docs.rs/arrow/latest/arrow/datatypes/struct.Field.html#method.new_list I would recommend we remove the name parameter from that method, due to the same rationale above

Copy link
Contributor

@tustvold tustvold Jul 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recommend we remove the name parameter from that method

It can't be removed, it is part of the field parameter that is passed in, which needs to be kept for nullability, metadata, etc... I just think it will be more confusing for users if some APIs specify "item" and some don't

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just think it will be more confusing for users if some APIs specify "item" and some don't

I don't think it will increase confusion as it is already there: "I am making a list, what name should I use for its child" when the name never is used

To reduce the confusion I think we should reduce the need for specifying that field name ever when making lists

return DataType::List(Arc::new(Field::new("item", data_type, nullable)));
}
}

/// The maximum precision for [DataType::Decimal128] values
Expand Down
Loading