-
-
Notifications
You must be signed in to change notification settings - Fork 155
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Logical Timestamp #521
Conversation
@@ -49,6 +49,8 @@ public class EndToEndTypeTest : TestBase { | |||
["dateTime local kind"] = (new DataField<DateTime>("dateTime unknown kind"), new DateTime(2020, 06, 10, 11, 12, 13, DateTimeKind.Local)), | |||
["impala date local kind"] = (new DateTimeDataField("dateImpala unknown kind", DateTimeFormat.Impala), new DateTime(2020, 06, 10, 11, 12, 13, DateTimeKind.Local)), | |||
["dateDateAndTime local kind"] = (new DateTimeDataField("dateDateAndTime unknown kind", DateTimeFormat.DateAndTime), new DateTime(2020, 06, 10, 11, 12, 13, DateTimeKind.Local)), | |||
["timestamp utc kind"] = (new DateTimeDataField("timestamp utc kind", DateTimeFormat.Timestamp, true), new DateTime(2020, 06, 10, 11, 12, 13, DateTimeKind.Utc)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need more tests for different resolutions
byte[] raw = BitConverter.GetBytes(unixTime); | ||
destination.Write(raw, 0, raw.Length); | ||
} else if (tse.LogicalType.TIMESTAMP.Unit.NANOS is not null) { | ||
long unixTime = element.ToUtc().ToUnixNanoseconds(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't have been needed but some of the embedded test files seem to have a timeunit of nanosecond
} | ||
#if NET7_0_OR_GREATER | ||
else if(format == DateTimeFormat.DateAndTimeMicros) { | ||
Unit = DateTimeTimeUnit.Micros; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the corect unit for Impala?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Precision is Nanos, Impala uses INT96
https://docs.cloudera.com/cdw-runtime/cloud/impala-sql-reference/topics/impala-timestamp.html
When can we expect that feature to be released? |
Hey sorry for the delay, I'll try to wrap up commits for release next week. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, thank you!
Thanks for merging this @aloneguid, Can I create a docs pr? |
Of course, thanks! |
for(int i = 0; i < longsRead; i++) { | ||
if(tse.LogicalType.TIMESTAMP.Unit.MILLIS is not null) { | ||
DateTime dt = longs[i].AsUnixMillisecondsInDateTime(); | ||
dt = DateTime.SpecifyKind(dt, tse.LogicalType.TIMESTAMP.IsAdjustedToUTC ? DateTimeKind.Utc : DateTimeKind.Local); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting a DateTime
to DateTimeKind.Local
contains time zone information, in that on conversion to DateTimeOffset
the conversion assumes the computer's timezone. Instead DateTimeKind.Unspecified
, which would prevent a conversion to DateTimeOffset
, would probably be more suitable.
Also, for this change in general, how does it affect existing data? Evidentially all my data has been written with a DateTime
in a local/unknown timezone (not necessarily the same timezone). Will reading it back now produce different results?
If I want to remain compatible with past files do I now need to set isAdjustedToUtc = false
? Will it produce identical results?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ngbrown maybe we should change it to unspecified?
https://github.com/dotnet/runtime/blob/v8.0.8/src/libraries/System.Private.CoreLib/src/System/DateTime.cs#L135
The Epoch is in UTC.
We should be able to create a test that checks how this behaves
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created a PR to test
#553
Looks like the dates will come back as UTC
Resolves #492
https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#timestamp