-
-
Notifications
You must be signed in to change notification settings - Fork 18.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: fix json segfaults #12802
BUG: fix json segfaults #12802
Conversation
non-ndarray values are the
|
for tz (that's on 0.18.0), but add as a test for your PR.
|
welp, ok I'll take another look. |
@Komnomnomnom thanks! |
pls update when you can |
46110d9
to
6cb26fe
Compare
The json code assumed in several places that the I added a helper function to handle this a bit better. It initially attempts to use the
In [35]: df
Out[35]:
A
0 2013-01-01 00:00:00-05:00
1 2013-01-02 00:00:00-05:00
2 2013-01-03 00:00:00-05:00
In [36]: df.values
Out[36]:
DatetimeIndex(['2013-01-01 00:00:00-05:00', '2013-01-02 00:00:00-05:00',
'2013-01-03 00:00:00-05:00'],
dtype='datetime64[ns, US/Eastern]', freq='D')
In [37]: df.get_values()
Out[37]:
DatetimeIndex(['2013-01-01 00:00:00-05:00', '2013-01-02 00:00:00-05:00',
'2013-01-03 00:00:00-05:00'],
dtype='datetime64[ns, US/Eastern]', freq='D')
In [38]: df._data.blocks[0].values
Out[38]:
DatetimeIndex(['2013-01-01 00:00:00-05:00', '2013-01-02 00:00:00-05:00',
'2013-01-03 00:00:00-05:00'],
dtype='datetime64[ns, US/Eastern]', freq='D')
In [39]: df._data.blocks[0].get_values()
Out[39]:
DatetimeIndex(['2013-01-01 00:00:00-05:00', '2013-01-02 00:00:00-05:00',
'2013-01-03 00:00:00-05:00'],
dtype='datetime64[ns, US/Eastern]', freq='D')
In [40]: type(df.values)
Out[40]: pandas.tseries.index.DatetimeIndex
In [41]: df.dtypes
Out[41]:
A datetime64[ns, US/Eastern]
dtype: object
In [42]: type(df._data.blocks[0])
Out[42]: pandas.core.internals.DatetimeTZBlock
In [43]: df._data.blocks[0].dtype
Out[43]: datetime64[ns, US/Eastern] |
6cb26fe
to
fc7d458
Compare
@Komnomnomnom no that is all as expected. Internally the How do you think we should serialize this? In general converting to a ndarray is not a great thing as these are now UTC times (don't be fooled as to the repr, they have been converted).
converting to a list will preserve everything
This converts to string, not sure how useful this is to you (we use this for csv writing).
This is probably the most general thing you can do (and will coerce all things correctly). So if you find a block that doesn' thave
|
lmk when you are ready for review |
@Komnomnomnom hows this coming |
@jreback should have some time this evening |
(on UK time) |
@Komnomnomnom great thanks! |
fc7d458
to
e3e6cf8
Compare
@jreback TZ block is supported now. Note atm the JSON handler always converts to UTC when encoding and this seems to be the case for any time object. It would be nice if it honoured TZ timestamps (see #12997) but I've added some regression tests to ensure this behaviour remains consistent in the meantime. So the behaviour now is to repeatedly call
I just want to give it a run through vbench, otherwise I'm happy with it. Let me know if you have any comments. Am I missing something regarding the TZ behaviour? |
@Komnomnomnom no problem with coercing to UTC, that's what we do anyone when transforming to any non-tz supported. note that this is numpy 1.11, under older ones you would get a differernt DISPLAY (but the actual values are correct).
|
@@ -321,6 +322,9 @@ Deprecations | |||
|
|||
|
|||
|
|||
- Potential segfault in ``DataFrame.to_json`` when serialising ``datetime.time`` (:issue:`11473`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add the issue for the datetime w/tz fixes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
e3e6cf8
to
af006a4
Compare
@jreback is there CI support for windows? If not is there any chance you could run the JSON tests on your windows box? |
yes, you can sign up for http://www.appveyor.com/. we run this on master, but you can run on a personal (free) accct. I will run now (on my local vm). |
@Komnomnomnom passes! |
side issue. for conformity I was going to rename: ``pandas\io\tests\test_json` to |
or I can do it in a followup. |
Great! np I'll rename it now |
@Komnomnomnom this looks fine. ping me when your travis run is green (no need to wait for master) |
also move the |
@Komnomnomnom looks good. ping when green. |
Some unrelated msgpack and pickle failures (during version comparison?), not sure what's going on there. Otherwise clean: |
@Komnomnomnom thanks. yeah those happen when it doesn't get the tag information, not really sure why it doesnt' always work. |
thanks for the fixes! |
closes #11473
closes #10778
closes #11299
git diff upstream/master | flake8 --diff
This fixes several potential segfaults in the json code:
goto INVALID
) pandas.io.tests.test_json.test_ujson.NumpyJSONTests testOdArray segfaults under python2.7/3.5 in 0.17.0 #11299Fixing #10778 means non-ndarray blocks are now supported (although I think at present
category
is the only one?).Not tested on windows yet. Seeing some unrelated travis failures on my fork (msgpack), is that normal?