Timezone object of Timestamp vs DatetimeIndex is different for same time / timezone #17572

Closed
jorisvandenbossche opened this issue Sep 18, 2017 · 9 comments

@jorisvandenbossche
Member

Code Sample, a copy-pastable example if possible

When localizing a Timestamp vs timeseries/index, the timezones look similar

In [21]: ts = pd.Timestamp('2012-01-01').tz_localize('US/Eastern')

In [22]: dtidx = pd.to_datetime(['2012-01-01']).tz_localize('US/Eastern')

In [23]: ts
Out[23]: Timestamp('2012-01-01 00:00:00-0500', tz='US/Eastern')

In [24]: dtidx
Out[24]: DatetimeIndex(['2012-01-01 00:00:00-05:00'], dtype='datetime64[ns, US/Eastern]', freq=None)

In [25]: dtidx[0]
Out[25]: Timestamp('2012-01-01 00:00:00-0500', tz='US/Eastern')

but under the hood the tz attribute of the DatetimeIndex is actually different from the tz attribute of the individual Timestamps (even when accessing a single element of the DatetimeIndex as a Timestamp):

In [26]: ts.tz
Out[26]: <DstTzInfo 'US/Eastern' EST-1 day, 19:00:00 STD>

In [27]: dtidx.tz
Out[27]: <DstTzInfo 'US/Eastern' LMT-1 day, 19:04:00 STD>   <-- note the difference here

In [28]: dtidx[0].tz
Out[28]: <DstTzInfo 'US/Eastern' EST-1 day, 19:00:00 STD>

In [29]: ts.tz == dtidx.tz
Out[29]: False

In [30]: ts.tz.zone
Out[30]: 'US/Eastern'

In [31]: dtidx.tz.zone
Out[31]: 'US/Eastern'
@jreback
Contributor

jreback commented Sep 18, 2017

Of course, these couldn't possibly be the same, as in an index you have multiple tzs, while a timestamp has only 1. Not sure what you are reporting here.

@jorisvandenbossche
Member Author

And I don't understand what you are saying :-)

in an index you have multiple tzs

A DatetimeIndex has a single tz attribute. And even then, why would the tz attribute of a DatetimeIndex be different from the tz attribute of the timestamps inside it?

@jorisvandenbossche
Member Author

A slightly different example, where I create two equal DatetimeIndex instances in two different ways and end up with different tz objects:

# create date_range, then localize
In [51]: dtidx1 = pd.date_range("2012-01-01", periods=3).tz_localize('US/Eastern')

# create same date_range from localized start timestamp
In [52]: dtidx2 = pd.date_range(pd.Timestamp("2012-01-01", tz='US/Eastern'), periods=3)

In [53]: dtidx1.equals(dtidx2)
Out[53]: True

In [54]: dtidx1.tz
Out[54]: <DstTzInfo 'US/Eastern' LMT-1 day, 19:04:00 STD>

In [56]: dtidx2.tz
Out[56]: <DstTzInfo 'US/Eastern' EST-1 day, 19:00:00 STD>

In [57]: dtidx1.tz == dtidx2.tz
Out[57]: False

I am not sure whether this is an important issue. I just noticed it, and it seemed odd.

@jreback
Contributor

jreback commented Sep 18, 2017

tz objects themselves often don't compare equal, because they are tied to the specific datetime that is the localizer; but the strings do:

In [2]: dtidx1.tz
Out[2]: <DstTzInfo 'US/Eastern' LMT-1 day, 19:04:00 STD>

In [3]: dtidx2.tz
Out[3]: <DstTzInfo 'US/Eastern' EST-1 day, 19:00:00 STD>

In [4]: str(dtidx1.tz) == str(dtidx2.tz)
Out[4]: True

In [7]: dtidx1.equals(dtidx2)
Out[7]: True

this is just a fact of life w.r.t. timezone objects. you have entered the rabbit hole.

I think this is a duplicate issue as well, pls have a look.
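
A minimal sketch of the name-based comparison described above, using pytz directly rather than pandas. The helper `same_zone` is hypothetical (it is not a pandas or pytz function), and the example assumes pytz is installed:

```python
from datetime import datetime

import pytz


def same_zone(tz_a, tz_b):
    """Hypothetical helper: compare two tzinfo objects by zone name."""
    return str(tz_a) == str(tz_b)


eastern = pytz.timezone("US/Eastern")

# pytz attaches an offset-specific tzinfo instance to each localized datetime.
winter = eastern.localize(datetime(2012, 1, 1))  # standard time (EST)
summer = eastern.localize(datetime(2012, 7, 1))  # daylight time (EDT)

# The two tzinfo objects are distinct instances, so == is False ...
objects_equal = winter.tzinfo == summer.tzinfo
# ... but both still name the same zone.
names_equal = same_zone(winter.tzinfo, summer.tzinfo)

print(objects_equal, names_equal)
```

Comparing by name sidesteps the question of which offset a particular tzinfo instance happens to carry.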

@mroeschke
Member

I was confused about this as well at one point (comment), and this SO post echoes @jreback's explanation of why the reprs of the pytz tz objects are different.

While this should probably be explained better in the pytz docs, it might be worth adding a small clarification to the Pandas timeseries docs.

@jorisvandenbossche
Member Author

Thanks for that link!
Yeah, I had in the meantime also noticed that the same difference can be seen between an initialized pytz timezone, and the timezone of a datetime localized with it.
So I would say it is more a pytz issue.

Not sure whether this is worth adding a note about it to the pandas docs.
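
For reference, a small pytz-only sketch of that observation (assuming pytz is installed; no pandas involved). The two reprs should differ in the same way as the LMT / EST outputs shown earlier in this thread, although the exact repr text may depend on the pytz version:

```python
from datetime import datetime, timedelta

import pytz

eastern = pytz.timezone("US/Eastern")
localized = eastern.localize(datetime(2012, 1, 1))

# The object returned by pytz.timezone() and the tzinfo attached to the
# localized datetime are distinct instances describing the same zone.
print(repr(eastern))
print(repr(localized.tzinfo))

same_object = localized.tzinfo is eastern
same_name = localized.tzinfo.zone == eastern.zone
offset = localized.utcoffset()

print(same_object, same_name, offset == timedelta(hours=-5))
```

The localized datetime carries the offset that applies on that date, while the freshly initialized timezone object is not tied to any date.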

@gfyoung added the Timezones label Sep 18, 2017
@jreback added this to the Next Major Release milestone Sep 20, 2017
@jreback
Contributor

jreback commented Sep 20, 2017

I think this could be worth a small .. note:: section in timeseries/timezones

@AdamShamlian
Contributor

I'll take this quickie. Anything to make handling time smoother. I'm thinking something along the lines of mentioning that the .tz attributes are not obligated to be the same between a Timestamp and an "equivalent" DatetimeIndex, and linking to both the SO post and the other issue?

@jreback
Contributor

jreback commented Dec 1, 2017

closing in favor of #18595 which is the underlying reason/cause.
