[native] Implement bucket conversion for Hive splits #23028

Merged
merged 2 commits into from
Jun 19, 2024

@Yuhta Yuhta commented Jun 18, 2024

When the bucket count of a table changes over time, a single file can legitimately contain rows from multiple buckets. In such cases the query planner should set bucket conversion on these splits, and in Velox we apply an extra filter to return only the rows that belong to the requested bucket number.
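
To illustrate the idea, here is a minimal Python sketch, not the Velox or Presto implementation. It assumes a partition written with 4 buckets while the table now declares 8, and that the file for table bucket `b` is partition bucket `b % 4` (which only makes sense when one count divides the other). `toy_bucket_hash` and the other names are hypothetical stand-ins; the real Hive bucket hash is different.

```python
# Illustrative sketch only. toy_bucket_hash is a hypothetical stand-in,
# NOT the hash function Hive, Presto, or Velox actually use.

def toy_bucket_hash(key: int) -> int:
    """Return a non-negative hash for an integer bucketing key."""
    return key & 0x7FFFFFFF


def bucket_of(key: int, bucket_count: int) -> int:
    """Bucket number of a key under a given bucket count."""
    return toy_bucket_hash(key) % bucket_count


def write_partition_files(keys, partition_bucket_count):
    """Group keys into one 'file' per bucket, as the old writer would have."""
    files = {b: [] for b in range(partition_bucket_count)}
    for key in keys:
        files[bucket_of(key, partition_bucket_count)].append(key)
    return files


def read_split_with_conversion(file_rows, table_bucket_count, bucket_to_keep):
    """Extra filter: keep only rows that hash to the requested table bucket."""
    return [
        key for key in file_rows
        if bucket_of(key, table_bucket_count) == bucket_to_keep
    ]


# Assumed scenario: files written with 4 buckets, table now declares 8.
files = write_partition_files(range(16), partition_bucket_count=4)

# The file for old bucket 1 mixes rows of new buckets 1 and 5.
file_for_bucket_5 = files[5 % 4]
rows = read_split_with_conversion(
    file_for_bucket_5, table_bucket_count=8, bucket_to_keep=5
)
print(file_for_bucket_5)  # [1, 5, 9, 13]
print(rows)               # [5, 13]
```

Without the extra filter, a reader asking for bucket 5 would also receive keys 1 and 9, which belong to bucket 1 under the new bucket count.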

@Yuhta Yuhta marked this pull request as ready for review June 18, 2024 14:35
@Yuhta Yuhta requested a review from a team as a code owner June 18, 2024 14:35
@aditi-pandit aditi-pandit left a comment

Thanks @Yuhta. Overall looks very good minus some documentation comments.

@aditi-pandit
@Yuhta: I think this PR should have a release note. Could you please add the details to the PR description?

xiaoxmeng previously approved these changes Jun 18, 2024
@xiaoxmeng xiaoxmeng left a comment

@Yuhta thanks for the fix!

aditi-pandit previously approved these changes Jun 18, 2024
@aditi-pandit aditi-pandit left a comment

Thanks @Yuhta

@xiaoxmeng xiaoxmeng merged commit 0168e16 into prestodb:master Jun 19, 2024
58 of 59 checks passed
@tdcmeehan tdcmeehan mentioned this pull request Aug 23, 2024
34 tasks
3 participants