df.Truncate now fails when arguments have different tzinfo #29856

Open
JrtPec opened this issue Nov 26, 2019 · 4 comments
Labels
Bug Timezones Timezone data dtype

Comments

JrtPec commented Nov 26, 2019

Code Sample

import pandas as pd

df = pd.DataFrame()
date1 = pd.Timestamp('20190101', tz='Europe/Brussels')
date2 = pd.Timestamp('20190201T00:00:00+01:00')
df.truncate(before=date1, after=date2)

File "/python3.7/site-packages/pandas/core/indexes/base.py", line 5244, in slice_locs
    raise ValueError("Both dates must have the " "same UTC offset")
ValueError: Both dates must have the same UTC offset

Problem description

In #25263, these lines were added (line 4881):

if not tz_compare(ts_start.tzinfo, ts_end.tzinfo):
    raise ValueError("Both dates must have the same UTC offset")

  1. The timestamps in the example do have the same UTC offset. The difference is that one has an actual timezone, while the other has a fixed offset. So a simple comparison of tzinfo is insufficient here.
  2. It is unclear to me why you shouldn't be able to truncate with timestamps that have different offsets.
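
To illustrate point 1 without depending on pandas or pytz, here is a minimal standard-library sketch. `SeasonalZone` is a hypothetical stand-in for a DST-observing zone such as Europe/Brussels (its DST rule is deliberately simplified and is an assumption, not the real rule). It shows two timestamps whose UTC offsets are equal while their `tzinfo` objects are not:

```python
from datetime import datetime, timedelta, timezone, tzinfo


class SeasonalZone(tzinfo):
    """Hypothetical stand-in for a DST-observing zone.

    Simplified assumption: UTC+2 from April through September, UTC+1 otherwise.
    """

    def dst(self, dt):
        if dt is not None and 4 <= dt.month <= 9:
            return timedelta(hours=1)
        return timedelta(0)

    def utcoffset(self, dt):
        return timedelta(hours=1) + self.dst(dt)

    def tzname(self, dt):
        return "CEST" if self.dst(dt) else "CET"


fixed = timezone(timedelta(hours=1))  # fixed offset, like "+01:00"

date1 = datetime(2019, 1, 1, tzinfo=SeasonalZone())
date2 = datetime(2019, 2, 1, tzinfo=fixed)

# Both timestamps fall in winter, so their offsets agree...
same_offset = date1.utcoffset() == date2.utcoffset()
# ...but the tzinfo objects themselves are different.
same_tzinfo = date1.tzinfo == date2.tzinfo
# In summer the seasonal zone moves away from the fixed offset.
summer_offset = datetime(2019, 7, 1, tzinfo=SeasonalZone()).utcoffset()

print(same_offset, same_tzinfo, summer_offset)
```

A check based on `utcoffset()` would accept this pair, whereas a check based on `tzinfo` rejects it, which matches the behaviour reported above.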

Output of pd.show_versions()

INSTALLED VERSIONS

commit : None
python : 3.7.3.final.0
python-bits : 64
OS : Darwin
OS-release : 18.6.0
machine : x86_64
processor : i386
byteorder : little
LC_ALL : en_US.UTF-8
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 0.25.3
numpy : 1.16.4
pytz : 2019.1
dateutil : 2.8.0
pip : 19.2.1
setuptools : 28.8.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.3.3
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : 4.7.1
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : 4.3.3
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
s3fs : None
scipy : None
sqlalchemy : 1.3.3
tables : None
xarray : None
xlrd : 1.2.0
xlwt : None
xlsxwriter : None

@MarcoGorelli
Member

Europe/Brussels and FixedOffset(60) aren't the same, though: the former observes daylight saving time.

>>> import datetime as dt
>>> import pytz
>>> pytz.timezone('Europe/Brussels').localize(dt.datetime(2002, 9, 27, 6, 0, 0))
datetime.datetime(2002, 9, 27, 6, 0, tzinfo=<DstTzInfo 'Europe/Brussels' CEST+2:00:00 DST>)

>>> pytz.FixedOffset(60).localize(dt.datetime(2002, 9, 27, 6, 0, 0))
datetime.datetime(2002, 9, 27, 6, 0, tzinfo=pytz.FixedOffset(60))

JrtPec commented Jan 9, 2020

Your comment is correct.
Yet in my example, the UTC offsets of both timestamps are identical, and the error says they aren't.

My current workaround is this:

# df = df.truncate(before=before, after=after) ERROR
df = df.truncate(before=before)
df = df.truncate(after=after)

So I'm still clueless as to why the one-liner that worked in all previous versions of pandas now suddenly doesn't.
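
A possible alternative to calling `truncate` twice, not taken from this thread, is to convert both bounds to one shared `tzinfo` before the call. The sketch below uses only the standard library; `to_common_utc` is a hypothetical helper name, and whether the result behaves as desired with `df.truncate` on a given index is an assumption that should be checked against your pandas version:

```python
from datetime import datetime, timedelta, timezone


def to_common_utc(before, after):
    """Convert two timezone-aware datetimes to UTC so both share one tzinfo."""
    if before.tzinfo is None or after.tzinfo is None:
        raise ValueError("both bounds must be timezone-aware")
    return before.astimezone(timezone.utc), after.astimezone(timezone.utc)


# Two bounds with the same offset but separately constructed tzinfo objects.
before = datetime(2019, 1, 1, tzinfo=timezone(timedelta(hours=1)))
after = datetime(2019, 2, 1, tzinfo=timezone(timedelta(hours=1), "CET"))

before_utc, after_utc = to_common_utc(before, after)

# The instants are unchanged; only the tzinfo representation differs.
print(before_utc.isoformat(), after_utc.isoformat())

# Assumed usage (requires pandas, not run here):
# df = df.truncate(before=before_utc, after=after_utc)
```

Converting with `astimezone` keeps each bound at the same instant, so the selected range should not change; only the `tzinfo` attached to the bounds does.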

jbrockmendel added the Bug and Timezones (Timezone data dtype) labels on Jun 5, 2020

1kastner commented Aug 28, 2020

@JrtPec see #16785 and the issues referenced there for a lengthy discussion of why. It is technically very difficult, and the previous behavior was unpredictable (pandas sometimes even behaved incorrectly) because the timezone information was sometimes accidentally ignored.

In conclusion, it is a feature, not a bug ;-)

@akhorshidi

> 1. The timestamps in the example do have the same UTC offset. The difference is that one has an actual timezone, while the other has a fixed offset. So a simple comparison of `tzinfo` is insufficient here.

I've faced the same issue, but with datetime objects:

start_date = datetime(start_date, tzinfo=timezone.utc)
end_date = datetime(end_date, tzinfo=pytz.utc)

My workaround was to change the tzinfo argument of the first datetime object from timezone.utc to pytz.utc.
