PARQUET-861: Document INT96 timestamps #49

Closed

Conversation

xhochy
Member

@xhochy xhochy commented Feb 3, 2017

No description provided.

@gszadovszky
Contributor

Thanks for adding this. LGTM.

@zivanfi
Contributor

zivanfi commented Feb 6, 2017

I find this addition valuable, but also a little confusing, at least in its current form. As far as I can tell, the Impala timestamp is not a logical type at all, in the sense that it is not something that can be specified in the Parquet schema - it's just an INT96 value. I think this should be explicitly mentioned, because all other logical types on this page are actual logical types of Parquet itself.

@xhochy
Member Author

xhochy commented Feb 6, 2017

Currently the only known use "in the wild" for INT96 is this sort of timestamp. INT96 isn't mentioned with the physical data types, and I thought putting it into this section would be the most helpful, as this is where all timestamps are mentioned.

I will at least add a small section mentioning that INT96 is not a logical type but a physical one.

@zivanfi
Contributor

zivanfi commented Feb 6, 2017

Thanks, that will clarify it in my opinion.

@zivanfi
Contributor

zivanfi commented Feb 6, 2017

LGTM, thanks!

(first 8 bytes) and the Julian day (last 4 bytes). No timezone is attached to this value.
To convert the timestamp into nanoseconds since the Unix epoch, 00:00:00.000000
on 1 January 1970, the following formula can be used:
`(julian_day - 2440588) * (86400 * 1000 * 1000 * 1000) + nanoseconds`.
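This conversion can be sketched in Python (constant and function names are mine, not from the spec):

```python
JULIAN_DAY_OF_EPOCH = 2440588  # Julian day number of 1970-01-01
NANOS_PER_DAY = 86400 * 1000 * 1000 * 1000

def int96_to_unix_nanos(julian_day: int, nanos_in_day: int) -> int:
    """Apply the documented formula: nanoseconds since the Unix epoch."""
    return (julian_day - JULIAN_DAY_OF_EPOCH) * NANOS_PER_DAY + nanos_in_day
```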

The formula looks good to me. Technically, Julian days refer to the noon of a day, so everywhere else you'd find "2440587.5" instead. However, boost uses only the day part of that, so for Impala your formula looks correct.

We might consider changing it to seconds since the epoch, since that seems more commonly used: `(julian_day - 2440588) * 86400 + nanoseconds / (1000 * 1000 * 1000)`
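The seconds-based variant suggested here would look like this in Python (function name is mine; integer division truncates sub-second precision):

```python
def int96_to_unix_seconds(julian_day: int, nanoseconds: int) -> int:
    """Seconds since the Unix epoch, per the suggested alternative formula."""
    return (julian_day - 2440588) * 86400 + nanoseconds // (1000 * 1000 * 1000)
```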

@@ -144,6 +144,18 @@ example, there is no requirement that a large number of days should be
expressed as a mix of months and days because there is not a constant
conversion from days to months.

### INT96 timestamps (also called IMPALA_TIMESTAMP)

_(deprecated)_ Timestamps saved as an `int96` are made up of the nanoseconds in the day

+@zivanfi, who is working on making timestamps more uniform across several query engines.

Hive also writes Parquet files in this format, although Impala seems to have used it first. It would be good if we could include documentation for 12-byte binary timestamps in either #46 or here. Otherwise we'd be deprecating a widely used format with no clear alternative.


agreed, let's add a paragraph here about where it is used.
The deprecation JIRA should link to corresponding Impala and Hive JIRAs.
I think we're signing up for being able to read that type forever for compatibility with existing files, but provide a better alternative for writing timestamps.


Just a suggestion from a random passer-by... I think it might be nice to document the byte order of these timestamps as well, as they appear to be little-endian whereas other primitive ints in Parquet are big-endian. At least that seems to be the issue that's led us to finding this PR in the first place :)


@julienledem julienledem left a comment


LGTM


### INT96 timestamps (also called IMPALA_TIMESTAMP)

_(deprecated)_ Timestamps saved as an `int96` are made up of the nanoseconds in the day
(first 8 bytes) and the Julian day (last 4 bytes). No timezone is attached to this value.

I think we need to specify the signedness - the first 8 bytes are a uint64 and the last 4 bytes are an int32, I believe.


I remember reading somewhere that the signedness is explicitly specified as undefined, because neither the date nor the time component can be so large as to reach the most significant bit, but unfortunately I can't find where I read this. However, this can only be true if we don't allow dates before the epoch of Julian date, because that would make the date part negative.

@rdblue
Contributor

rdblue commented Feb 14, 2017

What is the purpose of adding this to the Parquet spec? I don't expect anyone to attempt to implement it, considering that Impala has been planning on moving to the int64 timestamp for years. Has that changed?

Adding a type to the spec carries with it some expectation that it will be supported by implementations, and I'd rather focus effort on moving to the date/time types already in the spec instead of legitimizing this.

@xhochy
Member Author

xhochy commented Feb 14, 2017

I've actually implemented the read path for this recently in parquet-cpp, and given that e.g. Spark doesn't read TIMESTAMP_MILLIS but rather throws typeNotImplemented (see https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala#L161), we might move on and also implement writing INT96 timestamps in parquet-cpp.

@julienledem
Member

@rdblue as I understand Hive is also using int96?
@xhochy: what's the reason for implementing the write path on the cpp side?

I think we should have JIRAs in each project that writes int96s (Impala, Hive) and have them move to TIMESTAMP_MILLIS. We can support reading int96 for backward compatibility and even possibly write them with a flag turned on if people really need this.

Impala/Hive people: I recall the int96 was because of nanosecond timestamps? Do we need a TIMESTAMP_NANOS logical type backed by a fixed_length_bytearray physical type?

@xhochy
Member Author

xhochy commented Feb 15, 2017

I would implement the write path only to support passing timestamps to systems that only support timestamps as INT96, as Spark does.

@lekv
Contributor

lekv commented Feb 15, 2017

I don't know the historical reasons why Impala implements timestamps the way it does, but we use boost internally to process them and from comments in the source it sounds like we went with that and just swapped the members (compared to boost) to save space. I think Hive followed Impala.

Right now we store timestamps in memory in the same format that we write to parquet. IMPALA-4825 will swap the fields to align the storage layout with Kudu. However there are currently no plans to switch Impala's internal representation, so we would need to add conversion code to and from a new Parquet format, which may impact performance.

Should we go ahead and open JIRAs for Impala and Hive to discuss the feasibility of that?

@rdblue
Contributor

rdblue commented Feb 16, 2017

The history of the int96 timestamp is that Impala had a 12-byte internal representation that was written directly to int96 because the field was 12 bytes and not used for anything else -- it didn't require discussion in the community like adding a logical type.

The type has nanosecond precision because boost did; it wasn't based on any use case, so we probably don't need a TIMESTAMP_NANOS type. It is also bad for users that this type persists, because it uses 4 extra bytes per value before encoding and can't take advantage of integer compression like delta encoding.

@xhochy, I think the solution is to add support for int64 timestamp-micros to Spark, not to add more support for this. I'll add support in Spark and can help with the other engines as well.

@julienledem
Member

@rdblue I agree that we want to deprecate int96 but it is useful to have written down somewhere what the existing files with int96 in them actually contain. We should add a comment that min/max stats are always invalid for int96.

@julienledem
Member

@rdblue unfortunately int96 has already leaked into other projects (Drill, Spark, Hive), so read support for this will have to stay.
I fully agree that we should mark it deprecated for write and use the preferred int64 timestamp-micros alternative. However, the existing format should be documented, including its shortcomings (deprecated, no sort order, etc.).
Possibly just add a more strongly worded comment at the beginning of the int96 description?

@rdblue
Contributor

rdblue commented Jun 8, 2017

I'd rather not do anything to encourage further use of this convention, given how many bugs and problems there have been with it. Databases that have implementations won't benefit, and it would only allow more people to implement it. That said, I won't block it if this is something that everyone else thinks is reasonable.

lekv pushed a commit to lekv/parquet-format that referenced this pull request Jul 31, 2017
@shevek

shevek commented Mar 4, 2018

A note to anyone else who has just lost a working day dealing with the edicts of dead popes and eventually discovers this ticket: The data format is LITTLE ENDIAN, at least in the samples which have been brightening my day. You decode it using JodaTime and Guava thusly:

```java
// Requires Guava (Longs, Ints), Joda-Time (DateTimeUtils), and java.util.concurrent.TimeUnit.
long nanos = Longs.fromBytes(data[7], data[6], data[5], data[4],
                             data[3], data[2], data[1], data[0]);
long julianday = Ints.fromBytes(data[11], data[10], data[9], data[8]);
long epochmillis = DateTimeUtils.fromJulianDay(julianday) + TimeUnit.NANOSECONDS.toMillis(nanos);
```

Given that I'm fairly sure all these machines are Intel, this may be a hardware-dependent statement. I wish you luck.
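For readers not on the JVM, the same little-endian decode can be sketched with Python's standard library (the helper name is mine; this assumes the 12-byte layout described in this thread):

```python
import struct

def decode_int96_millis(data: bytes) -> int:
    """Decode 12 INT96 bytes (little-endian) into milliseconds since the Unix epoch."""
    nanos, julian_day = struct.unpack('<qi', data)  # int64 nanos-of-day, int32 Julian day
    return (julian_day - 2440588) * 86_400_000 + nanos // 1_000_000
```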

@zivanfi
Contributor

zivanfi commented Mar 5, 2018

It's even more complicated than that. INT96 was originally not supposed to store timestamps but int96 numbers, hence the name. When storing int96 numbers, the data must be written using little-endian byte order, but the corresponding statistics are handled by Thrift, which uses big-endian byte order. So the values in the data and in the statistics use different byte orders.

The bytes of a timestamp are stored in the reverse order, so the data becomes big-endian and the statistics become little-endian. However, if a read path already handles the endianness of int96 values as if they were numbers, your code will get the bytes as if they were a number. Because the timestamp byte order is the reverse of the number byte order, you will need to handle those bytes as if they were little-endian, since that reverses them back.

Long story short: This is a nightmare.

@wierzba3

wierzba3 commented Feb 12, 2019

> (quoting @zivanfi's byte-order explanation above)

@zivanfi -- Do you know how I can go about writing this INT96 timestamp type to a parquet file? I cannot find any specification that explains exactly how to encode the value.

Here is a SO question I just posted that provides much more detail...
https://stackoverflow.com/questions/54657496/how-to-write-timestamp-logical-type-int96-to-parquet-using-parquetwriter?noredirect=1#comment96106843_54657496

(I am well aware of the negative sentiment towards this encoding type, but unfortunately I must use this encoding because it is required by our system.)
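Not an endorsement of the format, but for anyone in the same situation, here is a minimal encode sketch of the layout described in this thread (little-endian int64 nanoseconds-of-day followed by little-endian int32 Julian day; the helper name is mine):

```python
import struct

JULIAN_DAY_OF_EPOCH = 2440588        # Julian day number of 1970-01-01
NANOS_PER_DAY = 86_400_000_000_000   # 86400 * 10**9

def encode_int96(unix_nanos: int) -> bytes:
    """Pack nanoseconds since the Unix epoch into the 12-byte INT96 layout."""
    julian_day, nanos_in_day = divmod(unix_nanos, NANOS_PER_DAY)
    return struct.pack('<qi', nanos_in_day, julian_day + JULIAN_DAY_OF_EPOCH)
```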

@zivanfi
Contributor

zivanfi commented Feb 13, 2019

@wierzba3 I answered your question on SO.

@rdblue
Contributor

rdblue commented Feb 13, 2019

@zivanfi, @wierzba3, I noted this on the SO issue: the reason I'm reluctant to add this documentation is that people will use it to write INT96 timestamps. Those are broken and should not be used. It should remain difficult to write them, to discourage it.

Parquet has documented and supported timestamps you should use instead.

@shevek

shevek commented Feb 13, 2019

Marking it as deprecated is all very well, but those of us who have to read legacy data written by older systems STILL NEED DOCUMENTATION.

@rdblue
Contributor

rdblue commented Feb 13, 2019

@shevek, the question was about how to write data into this unsupported format, not read it.

@shevek

shevek commented Feb 24, 2019

@rdblue The title of the ticket is "Document INT96 timestamps", and the ticket has been closed without such documentation. In fact, the argument has been that it should not BE documented, and that's totally unhelpful to me. What would happen if I were to raise another ticket titled "Please document int96 timestamps"? Because I still have to read legacy data written in that format.

It seems fairly evident that, the deprecation argument aside, this PR should be merged.

@timarmstrong
Contributor

It's ultimately up to Parquet committers to decide what's best for the project and community.

I think it would be helpful to have this documented, but we can get creative with how we might discourage people from using it. E.g. put the documentation in a separate file BrokenLogicalTypes.md, add more aggressive disclaimers "Writing this timestamp will anger the Parquet gods and guarantee all of your timestamps are off by many hours", "WARNING: This logical type contains chemicals known to the State of California to cause cancer and birth defects or other reproductive harm.", or whatever.

@rdblue
Contributor

rdblue commented Feb 24, 2019

To me, the question is: are we better off as a community with documentation for this, or without it?

I think we are better off without it.

As we can see from James' request, people are interested in building new code to write these values. Given that there are so many problems with them, the intent must be to work with other systems that already use INT96 timestamps but don't work with the supported timestamp types. Expanding the set of systems that use INT96 timestamps -- read or write -- instead of adding support for the recommended types to the systems that currently use INT96 only perpetuates the use of the bad type. That's bad for the community and bad for end users.

According to CDH 6.1 docs, Impala is still broken and cannot use the supported timestamp types:

Parquet consideration: int96 encoded Parquet timestamps are supported in Impala. int64 timestamps will be supported in a future release.

Considering the amount of pain and effort people spend to make INT96 timestamps work, I think time would be better spent contributing a fix there, so everyone can finally move off of INT96 timestamps.

@zivanfi
Contributor

zivanfi commented Feb 25, 2019

@rdblue We have already added INT64 timestamp read support to Impala and write support is under review as well:

But even with these new developments, we still can't write interoperable INT64 timestamps until parquet-mr 1.11.0 gets released, because Java clients can't read back non-UTC-normalized timestamps (nor nanosecond-precision ones) using older parquet-mr versions. We already have changes in review for this functionality as well, though:

@timarmstrong
Contributor

Yeah, Impala has been moving on it - definitely the way it was done initially created a mess - but unfortunately it will take a while to completely move all users off int96 Impala timestamps, given upgrade cycles, policies about backwards-incompatible changes, etc.

HyukjinKwon pushed a commit to apache/spark that referenced this pull request Oct 20, 2020
### What changes were proposed in this pull request?
1. Add the SQL config `spark.sql.legacy.parquet.int96RebaseModeInWrite` to control timestamp rebasing when saving timestamps as INT96. It supports the same set of values as `spark.sql.legacy.parquet.datetimeRebaseModeInWrite`, but the default value is `LEGACY` to preserve backward compatibility with Spark <= 3.0.
2. Write the metadata key `org.apache.spark.int96NoRebase` to parquet files if the files are saved with `spark.sql.legacy.parquet.int96RebaseModeInWrite` set to something other than `LEGACY`.
3. Add the SQL config `spark.sql.legacy.parquet.int96RebaseModeInRead` to control loading INT96 timestamps when parquet metadata doesn't have enough info (the `org.apache.spark.int96NoRebase` tag) about the parquet writer - whether the INT96 was written by a Proleptic Gregorian system or a Julian one.
4. Modified the vectorized and parquet-mr readers to support loading/saving INT96 timestamps with or without rebasing, depending on the SQL config and the metadata tag:
    - **No rebasing** in testing when the SQL config `spark.test.forceNoRebase` is set to `true`
    - **No rebasing** if parquet metadata contains the tag `org.apache.spark.int96NoRebase`. This is the case when parquet files are saved by Spark >= 3.1 with `spark.sql.legacy.parquet.int96RebaseModeInWrite` set to `CORRECTED`, or saved by other systems with the tag `org.apache.spark.int96NoRebase`.
    - **With rebasing** if parquet files were saved by Spark (any version) without the metadata tag `org.apache.spark.int96NoRebase`.
    - Rebasing depends on the SQL config `spark.sql.legacy.parquet.int96RebaseModeInRead` if there are no metadata tags `org.apache.spark.version` and `org.apache.spark.int96NoRebase`.

New SQL configs are added instead of re-using the existing `spark.sql.legacy.parquet.datetimeRebaseModeInWrite` and `spark.sql.legacy.parquet.datetimeRebaseModeInRead`:
- To allow users to have different modes for INT96 and for TIMESTAMP_MICROS (MILLIS). For example, users might want to save INT96 as LEGACY but TIMESTAMP_MICROS as CORRECTED.
- To have different modes for INT96 and DATE in load (or in save).
- To be backward compatible with Spark 2.4. For now, `spark.sql.legacy.parquet.datetimeRebaseModeInWrite/Read` are set to `EXCEPTION` by default.

### Why are the changes needed?
1. The Parquet spec says that INT96 must be stored as Julian days (see apache/parquet-format#49). This doesn't mean that a reader (or a writer) is based on the Julian calendar, so rebasing from Proleptic Gregorian to the Julian calendar may not be needed.
2. Rebasing from/to the Julian calendar can lose information, because dates in one calendar don't exist in the other. For example, 1582-10-04..1582-10-15 exist in the Proleptic Gregorian calendar but not in the hybrid calendar (Julian + Gregorian), and vice versa, the Julian date 1000-02-29 doesn't exist in the Proleptic Gregorian calendar. We should allow users to save timestamps without losing such dates (rebasing shifts such dates to the next valid date).
3. It would also make Spark compatible with other systems such as Impala and newer versions of Hive that write Proleptic Gregorian based INT96 timestamps.

### Does this PR introduce _any_ user-facing change?
It can, when `spark.sql.legacy.parquet.int96RebaseModeInWrite` is set to a non-default value.

### How was this patch tested?
- Added a test to check the metadata key `org.apache.spark.int96NoRebase`
- By `ParquetIOSuite`

Closes #30056 from MaxGekk/parquet-rebase-int96.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>