Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should Date parsing parse integer as year by default? #1681

Open
kanitw opened this issue Mar 9, 2019 · 8 comments
Open

Should Date parsing parse integer as year by default? #1681

kanitw opened this issue Mar 9, 2019 · 8 comments
Labels
discussion For discussing proposed changes

Comments

@kanitw
Copy link
Member

kanitw commented Mar 9, 2019

Date parsing currently use integer as timestamp for parsing by default.

However, a common problem is that people actually store year as integer way more often than timestamps. While users can adjust the parse to "%Y", most people don't know they have to do so.

Thus, I wonder if we should adjust the default (or be smarter about date type inference in Vega).

cc: @jakevdp

@kanitw kanitw added the discussion For discussing proposed changes label Mar 9, 2019
@jheer
Copy link
Member

jheer commented Mar 9, 2019

A couple thoughts:

  1. If we do this, doesn't it preclude the use of timestamps as inputs? (Yes, we could check the value and use timestamps for larger integers, but this seems arbitrary, brittle, and in some cases potentially infuriating.)

  2. Your suggestion would seem to boil down to changing vega-util's toDate method. FWIW, the JS built-in Date.parse maps a single number (either as a JS number, or a string) to the first day of that year. By extension, I believe Vega should do the same for strings like "2019".

  3. Integers will by default be parsed as number values (not Dates) by Vega. So for number input one would already need to specify a Date type explicitly, right? I'm assuming the primary issue is that in VL/Altair one might indicate a temporal type and have numbers treated as dates. Is that right?

  4. Is there a reason this can't be handled at the VL level?

@kanitw
Copy link
Member Author

kanitw commented Mar 10, 2019

Is there a reason this can't be handled at the VL level?

Vega-Lite doesn't have access to the data, so it only knows that the data is temporal, but doesn't know how the data looks like at all.

Integers will by default be parsed as number values (not Dates) by Vega. So for number input one would already need to specify a Date type explicitly, right? I'm assuming the primary issue is that in VL/Altair one might indicate a temporal type and have numbers treated as dates. Is that right?

Yep

Your suggestion would seem to boil down to changing vega-util's toDate method. FWIW, the JS built-in Date.parse maps a single number (either as a JS number, or a string) to the first day of that year. By extension, I believe Vega should do the same for strings like "2019".

Given that we at least parse "2019" as year, it might make sense to just document that integer = timestamp by default, esp. given that a smarter logic can be too brittle and precludes timestamp inputs.

@kanitw kanitw closed this as completed Mar 10, 2019
@jheer
Copy link
Member

jheer commented Mar 10, 2019

Technically this would require a breaking change, so we should probably shelve it for the time being. That said, in the future we could consider updating the date parsing as you suggest here and require a parsing specifier of date:"%Q" for timestamp support (which I don't think is a particularly common need).

In any case, I just tested and d3.timeParse('%Q')(Date.now()) behaves as one would expect.

@kanitw
Copy link
Member Author

kanitw commented Mar 10, 2019

Should we re-open and put it in 6.0 milestone then?

@joelostblom
Copy link
Contributor

joelostblom commented Sep 29, 2023

I do really like the suggestion of parsing four digit integers and strings as years by default when setting the data format to "temporal". As already mentioned above, I think this is a more common date storage format than integers as timestamps. It would be convenient to be able to simply specify that the data type is temporal in VL and have integers rendered as year. Doing that currently yields some odd results and one must recast the column as a string to get the desired behavior:

image

Would you consider re-opening this and implementing this feature in Vega? We could change this on the Altair side of things and recast ints as str automatically when the temporal data type is used, but would like to avoid being inconsistent with VL here is possible.

We're also discussing this in a few other places:

@domoritz
Copy link
Member

reopened and moved into 6.0

@hydrosquall
Copy link
Member

Noting that Date parsing is handled not just by vega-util's toDate as Jeff mentioned above here, but also by the datetime vega-expression interpreter as @djbarnwall found in #3913 (review) .

This is analogous to the thin wrapper around the new Date constructor used in non-interpreter mode.

Whatever solution is chosen, I think we would want to have datetime and toDate behaving similarly, whether in interpreter (vega-expression) mode or not. The source of the current bug DJ found and fixed relates to how the interpreter mode had a different signature.

My inclination is to ask if we want to keep the Date constructor matching the signature of the JS Date API (which seems to be a guideline when naming and implementing other vega expressions).

  • If so, I'd recommend against changing the default, but perhaps share a warning if the years are small (for example, less than human scale time, e.g. year 10000), with a link to a page encouraging you to cast to string and/or some other solution
  • If no and we are willing to make date parsing drift from the signature of the JS date constructor, we should have a migration path for people who were relying on the prev behavior.

My 2c on whether year or timestamp formatting is common: timestamp formatting would be common for machine generated data (querying an API), but I see the popularity of Year format for data coming from human-authored spreadsheets.

Is there some way we could make this step suggested by @kanitw more discoverable at the Altair or VL level so we can preserve the old behavior, while still giving good ergonomics to people who want to provide years as input?

While users can adjust the parse to "%Y", most people don't know they have to do so.

@lsh
Copy link
Member

lsh commented Mar 21, 2025

@domoritz and I talked about punting this from the 6.0 milestone and evaluating it when we also evaluate #4033.

@lsh lsh removed this from the Vega 6.0 milestone Mar 21, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion For discussing proposed changes
Projects
None yet
Development

No branches or pull requests

6 participants