-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Push down predicate involving timestamp_tz in Delta Lake #19874
Push down predicate involving timestamp_tz in Delta Lake #19874
Conversation
86b1abf
to
71003a0
Compare
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/ConstraintExtractor.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/ConstraintExtractor.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/UtcConstraintExtractor.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/UtcConstraintExtractor.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/test/java/io/trino/sql/query/QueryAssertions.java
Outdated
Show resolved
Hide resolved
lib/trino-plugin-toolkit/src/main/java/io/trino/plugin/base/filter/UtcConstraintExtractor.java
Show resolved
Hide resolved
plugin/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/DeltaLakeMetadata.java
Outdated
Show resolved
Hide resolved
plugin/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/DeltaLakeMetadata.java
Outdated
Show resolved
Hide resolved
plugin/trino-delta-lake/src/test/java/io/trino/plugin/deltalake/TestUtcConstraintExtractor.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/TestConstraintExtractor.java
Outdated
Show resolved
Hide resolved
Rename ConstraintExtractor to UtcConstraintExtractor
25e9cee
to
869594b
Compare
* {@link TimestampWithTimeZoneType}. Such unwrap would not be monotonic. Within Iceberg, we know | ||
* Test equivalent of {@code io.trino.sql.planner.iterative.rule.UnwrapCastInComparison} for {@link TimestampWithTimeZoneType}. | ||
* {@code UnwrapCastInComparison} handles {@link DateType} and {@link TimestampType}, but cannot handle | ||
* {@link TimestampWithTimeZoneType}. Such unwrap would not be monotonic. If we know |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* {@link TimestampWithTimeZoneType}. Such unwrap would not be monotonic. If we know | |
* {@link TimestampWithTimeZoneType}. Such unwrap would not be monotonic. We know |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same for the other occurences
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the "if" was removed, it would suggest that this is always true that the time zone is UTC. It can be only guaranteed per connector, and the optimization only applies to those connectors.
long startOfDateUtcEpochMillis = someDate.atStartOfDay().toEpochSecond(UTC) * MILLISECONDS_PER_SECOND; | ||
long startOfDateUtc = timestampTzMillisFromEpochMillis(startOfDateUtcEpochMillis); | ||
long startOfNextDateUtc = timestampTzMillisFromEpochMillis(startOfDateUtcEpochMillis + MILLISECONDS_PER_DAY); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reduce code redundancy with the method testExtractTimestampTzMicrosDateComparison
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When writing this, I experimented with extracting common code for similar test methods. I eventually settled for more repetition, as it improves readability a lot.
.matches("VALUES " + | ||
"(3, TIMESTAMP '2023-11-21 07:19:00.000 UTC')," + | ||
"(4, TIMESTAMP '2005-09-10 13:00:00.000 UTC')"); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
optional: This assertion can be added as well.
assertThat(query("SELECT * FROM " + tableName + " WHERE date_trunc('hour', part) >= TIMESTAMP '2005-09-10 00:00:00.001 +07:00'"))
.isFullyPushedDown();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a similar test case, thanks!
869594b
to
3964149
Compare
@kasiafi and others... do we need to add a pushdown section to the performance section in the Delta Lake connector docs .. some here https://trino.io/docs/current/connector/delta-lake.html#performance Similar to how we do it for JDBC connectors like https://trino.io/docs/current/connector/postgresql.html#pushdown |
I tend to say no - the improvemenst covered by this PR (which btw exist for Iceberg connector as well) are goodies that should just be there - they are "obvious" from user perspective. While we were working on these topics on Iceberg, we've published a blog post about them https://trino.io/blog/2023/04/11/date-predicates.html |
💯 user docs is not the place to brag about various optimization techniques we are using. |
Okay .. thanks @findepi and @findinpath |
Description
Certain expressions involving TIMESTAMP WITH TIME ZONE can be optimized in Delta Lake connector based on the fact that columns of type TIMESTAMP WITH TIME ZONE are represented using the UTC time zone:
date_trunc('...', timeztamp_tz_column) <comparison> constant
CAST(timestamp_tz_column AS DATE) <comparison> constant
year(timestamp_tz_column) <comparison> constant
The optimization is based on similar optimization in the Iceberg connector.
Fixes #18664
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text: