Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor: Document timestamp with/without cast behavior #5826

Merged
merged 5 commits into from
Jun 3, 2024

Conversation

alamb
Copy link
Contributor

@alamb alamb commented May 31, 2024

Which issue does this PR close?

Closes #.

Rationale for this change

The behavior of casting timestamps to/from timezones is quite subtle and I spent quite some time testing them out in the context of apache/datafusion#10602

Thus I thought it would be good to document this behavior in the arrow crate itself so I don't have to do that next time (and hopefully) others can benefit from it as well.

What changes are included in this PR?

Document, with examples, what happens when one casts timestamp with timezone to/from timestamp without timezone

Are there any user-facing changes?

Documentation. No changes to code

@github-actions github-actions bot added the arrow Changes to the arrow crate label May 31, 2024
/// However, note that when casting from a timestamp with timezone BACK to a
/// timestamp without timezone the cast kernel does not adjust the values.
///
/// Thus round trip casting a timestamp without timezone to a timestamp with
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The behavior that round trip casting the timestamp CHANGES the underlying timestamp I think caused a lot of confusion (at least for me) in the context of apache/datafusion#10602

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I filed #5827 to discuss changing the behavior

@alamb
Copy link
Contributor Author

alamb commented May 31, 2024

CI integration is failing due to #5815

Copy link
Contributor

@Abdullahsab3 Abdullahsab3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for documenting this. Made the behavior much clearer. Minor remark

arrow-cast/src/cast/mod.rs Outdated Show resolved Hide resolved
//! use arrow_array::types::Float64Type;
//! use arrow_array::cast::AsArray;
//!
//! # use arrow_array::*;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this just makes the existing example less verbose

///
/// When casting from a timestamp without timezone to a timestamp with
/// timezone, the cast kernel treats the underlying timestamp values as being in
/// UTC and adjusts them to the provided timezone.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is correct, it interprets the timestamp as being in the destination timezone and then adjusts the value to UTC as required.

From the docs on DataType

One possibility is to assume that the original timestamp values are relative to the epoch of the timezone being set; timestamp values should then adjusted to the Unix epoch (for example, changing the timezone from empty to “Europe/Paris” would require converting the timestamp values from “Europe/Paris” to “UTC”, which seems counter-intuitive but is nevertheless correct).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Than you -- fixed

arrow-cast/src/cast/mod.rs Outdated Show resolved Hide resolved
@tustvold tustvold merged commit 57c4748 into apache:master Jun 3, 2024
24 of 25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants