Monthly data and ordinal encoding create an "off by one" error #1027
Comments
Looks like something to do with Vega-Lite. I've opened an issue there. |
Per the response in vega/vega-lite#4044, it looks like the issue is that JavaScript parses dates in UTC. You can fix this here by using |
I'm not sure I totally get it. If I submit "2018-06-01", how is that parsed by Vega-Lite? What would the Python datetime object come back as? |
Edit: I'm wrong here – see new comment below. The solution is to parse the month and year in UTC. I agree that this is not very satisfactory, but it sounds like it's pretty deeply baked into the vega/vega-lite stack. |
I just learned something about ISO.
…On Thu, Jul 19, 2018, 11:30 AM Jake Vanderplas ***@***.***> wrote:
My understanding is that it is parsed (in UTC) as 00:00 on 2018-06-01,
which is then interpreted (in ISO) as midnight on 2018-05-31, leading to
the issue you see.
The solution is to parse the month and year in UTC.
I agree that this is not very satisfactory, but it sounds like it's pretty
deeply baked into the vega/vega-lite stack.
|
OK, so it turns out I was completely wrong about the reason for this issue when I speculated above. The issue is that the timestamps are being parsed as if they're UTC, and then displayed in local time (i.e. compensating for timezone). Because the west coast is 8 hours behind London, this shifts the time to the previous month for the dates you're using. If you were east of London when running this code, you wouldn't see this issue 😄.
So the best solution is to make sure dates are both parsed and displayed as local time, so no time zone correction is required. Looking at the Vega-Lite docs on UTC time, it looks like (and this seems entirely crazy to me) the way you make sure dates are parsed as local time is to not use ISO format. Altair serializes datetime data in ISO format by default, so timezone corrections will always be applied. But if you change the serialization to be non-ISO compliant, you can make things work as expected:
df['date2'] = df['date'].dt.strftime('%b %d %Y')  # non-ISO serialization
alt.Chart(df).mark_bar().encode(
    x=alt.X("date2:O", timeUnit="yearmonth", axis=alt.Axis(format="%b %y")),
    y="pct_change_rounded:Q"
)
This format-dependent parsing of dates in Vega is more than a bit surprising to me, and I hope that it can be addressed upstream. But if not, I'd propose we change the way we serialize dates in Altair so that they will be parsed the same way they are displayed, without any implicit time-zone conversion. |
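To make the shift described above concrete, here is a minimal sketch using only the Python standard library. It is an illustration of the arithmetic, not what Vega-Lite or Altair actually run. The helper name `displayed_month` is hypothetical, and the fixed UTC-7 and UTC+2 offsets are assumptions standing in for a US west coast viewer and a viewer east of London in June.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-7 offset stands in for US Pacific daylight time in June.
PACIFIC_DAYLIGHT = timezone(timedelta(hours=-7))
# Assumption: a fixed UTC+2 offset stands in for a viewer east of London.
EAST_OF_LONDON = timezone(timedelta(hours=2))


def displayed_month(iso_date: str, display_tz: timezone) -> str:
    """Parse a date-only string as UTC midnight, then report the year-month
    that a viewer in display_tz would see (hypothetical helper)."""
    parsed_utc = datetime.strptime(iso_date, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return parsed_utc.astimezone(display_tz).strftime("%Y-%m")


west = displayed_month("2018-06-01", PACIFIC_DAYLIGHT)
east = displayed_month("2018-06-01", EAST_OF_LONDON)
print(west)  # the first of June lands in the previous month
print(east)  # the month is unchanged
```

Midnight UTC on the first of the month is still the evening of the previous day anywhere west of UTC, which is why only first-of-month data viewed from negative offsets shows the "off by one" label.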
We'd just need to modify this line to use a non-ISO date format that is correctly parsed by the vega stack: |
Update: thanks to some input from @domoritz I understand this a bit better. It looks like if we want to ensure dates are parsed as local times, we need to either use Unix timestamps or fully-qualified ISO-8601 dates. This is not a characteristic of Vega or Vega-Lite, but of JavaScript itself. For example:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Date/parse So if we replace |
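As a rough sketch of the serialization difference discussed in this update: per the MDN page linked above, JavaScript's Date.parse treats a date-only ISO string as UTC, while a date-time ISO string with no offset is treated as local time. The snippet below only shows how the two string forms could be produced on the Python side with the standard library. The helper name `to_full_iso` is hypothetical and this is not Altair's actual serialization code.

```python
from datetime import date, datetime, time


def to_full_iso(d: date) -> str:
    """Serialize a date as a date-time ISO 8601 string with no UTC offset
    (hypothetical helper, not part of Altair)."""
    return datetime.combine(d, time.min).isoformat()


first_of_june = date(2018, 6, 1)

# Date-only form: JavaScript parses this as UTC midnight.
date_only = first_of_june.isoformat()
# Date-time form without an offset: JavaScript parses this as local midnight.
full = to_full_iso(first_of_june)

print(date_only)
print(full)
```

With the second form, the value is both parsed and displayed in local time, so no month shift should appear regardless of the viewer's time zone.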
I'm still not sure I totally get this. But that is almost always the case when it comes to time zones! ;) If you make a patch and want me to test my code again, I'm happy to help. |
I uncovered this "bug" during some data exploration for development of an open-source library tracking the U.S. Consumer Price Index.
The source data is monthly, with the values linked with the first date of each month.
My aim was to format the dates in the x-axis labels in the same manner as the government's sample chart.
I tried to do that with timeUnit and the format option to axis. Here's what I got:
Look closely and you can see that the latest value, June 1, is rendered as May.
What's up with that?