ARROW-11417: Add integration files for buffer compression #56
I'm not sure what the naming scheme should be. I chose "1.0.0" for the format version, but should it instead be the Arrow library version that generated the files (in which case it would be "2.0.0")?
The JSON files are refused by the Java Arrow integration test:
It looks like the problem is that it contains an empty "children" field for primitive batches, and that displeases the Java reader (which for some unknown reason uses a low-level token-by-token parsing technique).
Ok, I edited the JSON files by hand to remove the offending field and now Java manages to read them (cringe), though it fails to instantiate the compression:
Strange, I wonder why the main integration tests aren't broken by the JSON issue. Regarding versioning, I used the library version that generated them. It seems like a bug that Java can't even read uncompressed buffers? Or maybe the Java error message is bad. Either way, it's something for Java contributors to figure out. Thanks for adding these.
LGTM, assuming the files are renamed to use the library version that generated them.
The files were generated using PyArrow 2.0.0. Then they had to be edited by hand to make them compatible with the Java JSON reader (because of ARROW-11483).
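For reference, the hand edit described above could be scripted roughly as follows. This is only a sketch, not what was actually run for this PR: it assumes the integration JSON has a top-level "batches" key whose column objects may carry an empty "children" list, and that the schema section should be left untouched. The helper names `strip_empty_children` and `fix_integration_json` are hypothetical.

```python
import json


def strip_empty_children(node):
    """Recursively drop every "children" key whose value is an empty list."""
    if isinstance(node, dict):
        return {
            key: strip_empty_children(value)
            for key, value in node.items()
            if not (key == "children" and value == [])
        }
    if isinstance(node, list):
        return [strip_empty_children(item) for item in node]
    return node


def fix_integration_json(doc):
    """Return a copy of doc with empty "children" removed from batches only.

    Assumption: only the record batch columns trip up the Java reader, so
    the "schema" section is copied through unchanged.
    """
    fixed = dict(doc)
    if "batches" in doc:
        fixed["batches"] = strip_empty_children(doc["batches"])
    return fixed


# Tiny made-up document shaped like an integration file (illustrative only).
example = {
    "schema": {"fields": [{"name": "f0", "nullable": True, "children": []}]},
    "batches": [
        {
            "count": 2,
            "columns": [
                {"name": "f0", "count": 2, "VALIDITY": [1, 1],
                 "DATA": [1, 2], "children": []},
            ],
        }
    ],
}

fixed_example = fix_integration_json(example)
print(json.dumps(fixed_example, indent=2))
```

Non-empty "children" lists (nested types) are kept, and only the empty ones inside them are dropped, so the function should be safe to run on files that mix primitive and nested columns.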
Force-pushed from c00e0f1 to c4f2e00.
Because the JSON files that are used in the integration tests are generated by the Python datagen harness (in |
I renamed to "2.0.0-compression", will merge.
@pitrou Hi, how did you generate the compression file?
@liukun4515 The compression file was generated using Arrow C++ IIRC.