-
Notifications
You must be signed in to change notification settings - Fork 403
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: initial implementation of table features. #1755
Conversation
ACTION NEEDED delta-rs follows the Conventional Commits specification for release automation. The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification. |
FYI, there is already an open PR on this: #1754 |
Thanks for opening this! It seems like this PR puts us on the right path. From what I can tell, it captures all the complexities of table features (e.g. It looks like we'll be able to use these to give a better user experience. Here's the current error when you try to read a Delta table with deletion vectors enabled: "DeltaProtocolError: The table's minimum reader version is 3but deltalake only supports up to version 1." We'd much rather give an error message like this: "You are trying to read a Delta table with deletion vectors enabled, but read support for deletion vectors isn't supported yet". It looks like there was a similar PR opened around the same time, which is also good, but this PR captures the nitty-gritty details that are important for us. Whenever the transaction log protocol is updated, we'll need to manually update this file as well. That's pretty annoying, but I don't think there is any way around that. |
Generally the
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just glanced through the PR. One comment about liquid
table feature.
rust/src/protocol/mod.rs
Outdated
ICEBERG_COMPAT_V1, | ||
/// liquid clustering support | ||
#[serde(alias = "liquid")] | ||
LIQUID, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a liquid
table feature? I don't see it in the protocol.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only added it because I saw a test that read a liquid table and that table had liquid in the table features.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of this looks reasonable. My one request is we make it so it's not an immediate parsing errors (and definitely not a panic) if there are features we don't recognize. We should expect there will be new features added, and we should give an appropriate error (ProtocolNotSupported
or something like that) when this happens.
rust/src/protocol/mod.rs
Outdated
/// Features table readers can support as well as let users know | ||
/// what is supported | ||
#[derive(Serialize, Deserialize, Debug, Copy, Clone, Eq, PartialEq, Hash)] | ||
pub enum ReaderFeatures { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of having these features, but it does mean we error when we see a feature we don't recognize when parsing, right? There is a serde::other
tag that I think we could use here. (docs).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I like that lemme update for that
"deletionVectors" => ReaderFeatures::DELETION_VECTORS, | ||
"timestampNtz" => ReaderFeatures::TIMESTAMP_WITHOUT_TIMEZONE, | ||
"v2Checkpoint" => ReaderFeatures::V2_CHECKPOINT, | ||
f => panic!("Unknown reader feature encountered: {}", f), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this should warrant a panic. Just unsupported table error. I'd convert this into some Other
variant and maybe let the error be handled elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on the not having these functions panic.
rust/tests/read_delta_log_test.rs
Outdated
assert!(_table.get_state().reader_features().is_some()); | ||
assert!(_table.get_state().writer_features().is_some()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be able to assert the specific features here, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I have more/better tests I'm going to check in prolly tomorrow.
I was doing a merge from main and did this, apologies re-opening another PR shortly. |
# Description Hi again, this is table features again, but albeit with less work done on the main branch like the smart guy I am. # Related Issue(s) #1755 prior review and information # Documentation https://delta.io/blog/2023-07-27-delta-lake-table-features/ https://github.com/delta-io/delta/blob/master/PROTOCOL.md#table-features --------- Co-authored-by: Stephen Carman <stephen.carman@databricks.com>
# Description Hi again, this is table features again, but albeit with less work done on the main branch like the smart guy I am. # Related Issue(s) delta-io#1755 prior review and information # Documentation https://delta.io/blog/2023-07-27-delta-lake-table-features/ https://github.com/delta-io/delta/blob/master/PROTOCOL.md#table-features --------- Co-authored-by: Stephen Carman <stephen.carman@databricks.com>
Description
Added initial support for parsing table features out and updating table state with the information for them. Also be aware the table features are only informational right now, they are not used or enforced anywhere currently.
Related Issue(s)
None I am aware of.
Documentation
https://delta.io/blog/2023-07-27-delta-lake-table-features/
https://github.com/delta-io/delta/blob/master/PROTOCOL.md#table-features