-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Date to Timestamp cast #5140
Date to Timestamp cast #5140
Conversation
@liukun4515 @crepererum @alamb please check this one. The downside for now is additional cast created sometimes for timestamp, to convert |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great -- thank you @comphead
cc @waitingkuo (we are getting quite sophisticated in DataFusion for dates/times)
@@ -1742,5 +1741,55 @@ async fn test_ts_dt_binary_ops() -> Result<()> { | |||
|
|||
assert_batches_eq!(expected, &results); | |||
|
|||
//test cast path timestamp date using literals | |||
let sql = "select '2000-01-01'::timestamp >= '2000-01-01'::date"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might be able to add these tests to sqllogictest if you wanted -- https://github.com/apache/arrow-datafusion/blob/master/datafusion/core/tests/sqllogictests/test_files/timestamps.slt
Not required but I find the sqllogictest framework much easier to work with
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @alamb I will create a separate ticket to move all timestamp tests from timestamp.rs to sqllogictest, all that can be moved of course
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much -- that would be great
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed #5165
Timestamp(TimeUnit::Nanosecond, None) => { | ||
matches!(type_from, Null | Timestamp(_, None)) | ||
Timestamp(TimeUnit::Nanosecond, _) => { | ||
matches!(type_from, Null | Timestamp(_, _) | Date32) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 (I have a draft PR in the works to support coercing from utf8
as well #5117)
break; | ||
} | ||
} | ||
assert_eq!(res, Some("Projection: CAST(now() AS Timestamp(Nanosecond, None)) = CAST(currentdate() AS Timestamp(Nanosecond, None))\n EmptyRelation".to_string())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
cc @crepererum and @liukun4515 who filed the tickets this closes |
I think this is correct (and needed) if the timestamps in question have different timezones. It should not really effect runtime performance as I expect both to be evaluated at plan time (and we can add a constant folding rule if it is not) |
I plan to merge this PR tomorrow unless there are other comments |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you @comphead
left the comments bellow
I think we could consider adding this coercion as well as it works in postgrseql (perhaps in another tickets)
(Utf8, Date)
postgrseql
willy=# select '2000-01-01T00:00:00' = '2000-01-01'::date;
?column?
----------
t
(1 row)
datafusion
❯ select '2000-01-01T00:00:00' = '2000-01-01'::date;
ArrowError(CastError("Cannot cast string '2000-01-01T00:00:00' to value of Date32 type"))
(Timestamp(_, tz), Utf8) | (Utf8, Timestamp(_, tz)) => { | ||
Some(Timestamp(TimeUnit::Nanosecond, tz.clone())) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i tried this, but it didn't work
❯ select '2000-01-01T00:00:00'::timestamp::timestamptz = '2000-01-01T00:00:00';
Internal("The type of Timestamp(Nanosecond, Some(\"+00:00\")) Eq Utf8 of binary physical should be same")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need another cast to add Utf <-> Timestamp #5117 probably addresses this cast. If not I'll create another ticket
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even after #5117 this still fails. I will file a follow on ticket
❯ select '2000-01-01T00:00:00'::timestamp::timestamptz = '2000-01-01T00:00:00';
Internal("The type of Timestamp(Nanosecond, Some("+00:00")) Eq Utf8 of binary physical should be same")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed #5164
(Timestamp(_, None), Date32) | (Date32, Timestamp(_, None)) => { | ||
Some(Timestamp(TimeUnit::Nanosecond, None)) | ||
} | ||
(Timestamp(_, _tz), Date32) | (Date32, Timestamp(_, _tz)) => { | ||
Some(Timestamp(TimeUnit::Nanosecond, None)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does it mean that we convert Timestamptz to Timestamp and then compare?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thats correct, we lose TZ for now, we need some extra efforts to make arrow-rs work with TimestampTZ,planning to do so just after this PR merged.
I agree -- I can do so. |
Benchmark runs are scheduled for baseline = f0c0958 and contender = e41c4df. e41c4df is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Filed #5164 |
Which issue does this PR close?
Closes #4761
Closes #4644 .
Rationale for this change
Introduce casting between dates and timestamps datatypes. Currently DF just panics on such kind of queries
What changes are included in this PR?
This PR includes planner changes to support date to timestamp cast.
Are these changes tested?
Yes. Tests enclosed
Are there any user-facing changes?
No