Make table protocol version / table features easily accessible #1657

Closed
MrPowers opened this issue Sep 21, 2023 · 6 comments
Labels
documentation (Improvements or additions to documentation); enhancement (New feature or request); good first issue (Good for newcomers)

Comments

@MrPowers
Collaborator

Description

There is a deltalake.table.ProtocolVersion interface here, but I don't fully understand it. I guess this is for changing the protocol versions of a table?

Is there a way to fetch the reader_version & writer_version for a table?

Is there a way to fetch all the table features that are enabled for a table?

@MrPowers added the enhancement label on Sep 21, 2023
@wjones127
Collaborator

That's the return value of the DeltaTable.protocol() method.

def protocol(self) -> ProtocolVersions:

For some reason this method isn't documented.
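A minimal sketch of reading the two versions through `protocol()`. It relies only on the `min_reader_version` and `min_writer_version` fields named later in this thread; the helper name `min_versions` and the table path are hypothetical.

```python
from typing import Any, Tuple


def min_versions(dt: Any) -> Tuple[int, int]:
    """Return (min_reader_version, min_writer_version) from dt.protocol()."""
    protocol = dt.protocol()
    return protocol.min_reader_version, protocol.min_writer_version


# With the deltalake package installed, usage might look like this
# (the path is a placeholder, not a real table):
#
#     from deltalake import DeltaTable
#     reader, writer = min_versions(DeltaTable("path/to/table"))
```

The helper accepts any object with a `protocol()` method, so it can be exercised without a real table.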

@wjones127 added the documentation label on Sep 21, 2023
@wjones127
Collaborator

Oh it's probably just because it doesn't have a docstring. Anyone want to make a PR to add one?
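For anyone picking this up, a docstring along these lines might be enough. This is only a sketch: `DeltaTableStub` is a hypothetical stand-in for the real class, the wording is a suggestion, and the returned values are placeholders.

```python
from typing import NamedTuple


class ProtocolVersions(NamedTuple):
    min_reader_version: int
    min_writer_version: int


class DeltaTableStub:
    """Hypothetical stand-in for DeltaTable, used only to show a docstring."""

    def protocol(self) -> ProtocolVersions:
        """Get the reader and writer protocol versions of the table.

        Returns:
            The minimum reader version and minimum writer version that a
            client must support to read from or write to this table.
        """
        # Placeholder values; the real method reads them from the table.
        return ProtocolVersions(min_reader_version=1, min_writer_version=2)
```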

@wjones127 added the good first issue label on Sep 21, 2023
@MrPowers
Collaborator Author

@wjones127 - getting this documented would be awesome!

What do you think about another table_features method that returns the table features that are enabled for a table? If the protocol() method returns 3,7, then the user still won't know what table features are enabled.
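To illustrate the point: in the Delta protocol, reader version 3 and writer version 7 are the versions at which features are listed by name in the protocol action, so two tables can both report 3,7 while enabling different features. A small sketch with two made-up protocol actions (the field names follow the Delta protocol spec; the feature choices and helper names are just examples):

```python
import json
from typing import Any, Dict, List, Tuple

# Two hypothetical protocol actions with the same versions but different features.
ACTION_A = json.loads(
    '{"minReaderVersion": 3, "minWriterVersion": 7,'
    ' "readerFeatures": ["deletionVectors"],'
    ' "writerFeatures": ["deletionVectors"]}'
)
ACTION_B = json.loads(
    '{"minReaderVersion": 3, "minWriterVersion": 7,'
    ' "readerFeatures": ["columnMapping"],'
    ' "writerFeatures": ["columnMapping", "appendOnly"]}'
)


def versions(action: Dict[str, Any]) -> Tuple[int, int]:
    """Return the (reader, writer) version pair of a protocol action."""
    return action["minReaderVersion"], action["minWriterVersion"]


def features(action: Dict[str, Any]) -> List[str]:
    """Return the sorted union of reader and writer feature names."""
    names = set(action.get("readerFeatures") or [])
    names |= set(action.get("writerFeatures") or [])
    return sorted(names)


# Same versions, different feature sets.
print(versions(ACTION_A) == versions(ACTION_B))
print(features(ACTION_A), features(ACTION_B))
```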

@wjones127
Collaborator

> What do you think about another table_features method that returns the table features that are enabled for a table? If the protocol() method returns 3,7, then the user still won't know what table features are enabled.

Perhaps this could be another field on ProtocolVersions, since it is part of the versioning info? Something like:

class ProtocolVersions(NamedTuple):
    min_reader_version: int
    min_writer_version: int
    table_features: List[str]
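A runnable version of that proposal, as a sketch only (the example values are made up). One design consequence worth noting: a `NamedTuple` with a third field can no longer be unpacked into two names, so any existing caller doing `reader, writer = dt.protocol()` would break, while access by field name keeps working.

```python
from typing import List, NamedTuple


class ProtocolVersions(NamedTuple):
    min_reader_version: int
    min_writer_version: int
    table_features: List[str]


# Made-up example values.
pv = ProtocolVersions(3, 7, ["deletionVectors"])

# Access by field name is unaffected by the extra field.
print(pv.min_reader_version, pv.min_writer_version, pv.table_features)

# Positional unpacking now needs three targets instead of two.
reader, writer, table_features = pv
```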

@ion-elgreco
Collaborator

ion-elgreco commented Sep 25, 2023

> > What do you think about another table_features method that returns the table features that are enabled for a table? If the protocol() method returns 3,7, then the user still won't know what table features are enabled.
>
> Perhaps this could be another field on ProtocolVersions, since it is part of the versioning info? Something like:
>
> class ProtocolVersions(NamedTuple):
>     min_reader_version: int
>     min_writer_version: int
>     table_features: List[str]

As far as I can see, we can currently access them with dt.metadata().configuration
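A rough sketch of what that could look like, assuming `configuration` is a plain mapping of table property names to string values. The helper name and the example properties are hypothetical, and this only picks up boolean-style `delta.*` properties; it is not the same thing as the feature lists in the protocol action.

```python
from typing import Dict, List


def enabled_flags(configuration: Dict[str, str]) -> List[str]:
    """Return the sorted delta.* property names whose value is "true"."""
    return sorted(
        key
        for key, value in configuration.items()
        if key.startswith("delta.") and value.lower() == "true"
    )


# Made-up configuration for illustration.
example = {
    "delta.appendOnly": "true",
    "delta.enableChangeDataFeed": "false",
    "delta.columnMapping.mode": "name",  # not a boolean flag, so it is skipped
}
print(enabled_flags(example))

# With the deltalake package installed, this might be called as:
#
#     enabled_flags(dt.metadata().configuration)
```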

@Jan-Schweizer

Jan-Schweizer commented Oct 30, 2023

I'd like to try my hand at this issue.
The task is to extend the pub struct Protocol here like so:

pub struct Protocol {
    pub min_reader_version: i32,
    pub min_writer_version: i32,
    pub reader_features: Option<Vec<String>>,
    pub writer_features: Option<Vec<String>>,
}

Also, the DeltaTableState should be extended with reader_features: Option<Vec<String>> and writer_features: Option<Vec<String>>.
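For the Python side, the same shape could be mirrored roughly as follows. This is an illustrative sketch, not the library's code: the `Protocol` dataclass and `from_action` are hypothetical, and the JSON field names are taken from the Delta protocol spec. `Option<Vec<String>>` maps to `Optional[List[str]]`, which keeps "no feature list present" (`None`) distinct from "an empty feature list".

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class Protocol:
    min_reader_version: int
    min_writer_version: int
    reader_features: Optional[List[str]] = None
    writer_features: Optional[List[str]] = None

    @classmethod
    def from_action(cls, action: Dict[str, Any]) -> "Protocol":
        """Build a Protocol from a parsed protocol action dictionary."""
        return cls(
            min_reader_version=action["minReaderVersion"],
            min_writer_version=action["minWriterVersion"],
            # .get() returns None when the action carries no feature list.
            reader_features=action.get("readerFeatures"),
            writer_features=action.get("writerFeatures"),
        )


# Made-up actions: one without feature lists, one with them.
legacy = Protocol.from_action({"minReaderVersion": 1, "minWriterVersion": 2})
modern = Protocol.from_action(
    {
        "minReaderVersion": 3,
        "minWriterVersion": 7,
        "readerFeatures": ["deletionVectors"],
        "writerFeatures": ["deletionVectors", "appendOnly"],
    }
)
```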

Is this understanding correct?

Edit: I've seen that this issue will be taken care of with this PR.


4 participants