feat: Add dictionary support in integration test utility (#342)
This PR implements dictionary support in the integration test utility and fixes a few problems identified with integration testing to ensure that it actually works end-to-end (via apache/arrow#39302). The changes are:

- Batches that contain dictionaries can now be read, written, and validated using integration testing JSON
- Fixed an issue in the integration test library (anything other than the first batch previously segfaulted)
- Improved const correctness of nanoarrow.hpp (because dictionaries required a `std::unordered_map<>` with a `UniqueSchema` and a few const overloads were missing)
- Fixed the nullability of the top-level batch to match Arrow C++ output
- Fixed the null count of exported arrays (previously they were all exported as having zero nulls)

It can now be tested with `archery` (after checking out apache/arrow#39302):

```
export ARROW_CPP_EXE_PATH=/Users/deweydunnington/.r-arrow-dev-build/build/debug
export ARROW_NANOARROW_PATH=/path/to/arrow-nanoarrow/build
archery integration --with-cpp=true --with-nanoarrow=true --run-c-data
```

The current failures are limited to the remaining unimplemented types (datetime types and decimal).

And for future me or anybody who has to/wants to launch a debugger with a segfaulting integration test in VSCode, it can be done with this launch.json:

```
{
    "type": "lldb",
    "request": "launch",
    "name": "Debug Integration Tests",
    "program": "${workspaceFolder}/.venv/bin/python",
    "args": ["-m", "archery.cli", "integration", "--with-cpp=true", "--with-nanoarrow=true", "--run-c-data"],
    "cwd": "${workspaceFolder}",
    "env": {
        "ARROW_CPP_EXE_PATH": "/Users/deweydunnington/.r-arrow-dev-build/build/debug",
        "ARROW_NANOARROW_PATH": "${workspaceFolder}/out/build/user-local"
    }
}
```
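To illustrate the null-count fix listed among the changes, here is a minimal Python sketch of how a null count could be derived from one column of integration testing JSON. It assumes the column carries a `VALIDITY` list of 0/1 flags, as in the Arrow integration JSON format. The `null_count` helper is hypothetical and is not part of nanoarrow or archery; the actual fix lives in the C/C++ code touched by this PR.

```python
def null_count(column: dict) -> int:
    """Count nulls in one integration-JSON column (hypothetical helper).

    Assumes "VALIDITY" is a list of 0/1 flags where 0 marks a null slot.
    A column without a "VALIDITY" key is treated as having no nulls.
    """
    validity = column.get("VALIDITY")
    if validity is None:
        return 0
    return sum(1 for flag in validity if flag == 0)


# Illustrative column; the values are made up for this example.
column = {
    "name": "col",
    "count": 4,
    "VALIDITY": [1, 0, 1, 0],
    "DATA": [1, 0, 3, 0],
}
print(null_count(column))  # prints 2
```

An exporter that always reported zero nulls would disagree with this count for any column containing a 0 in `VALIDITY`, which is the kind of mismatch that validation against another implementation's output can surface.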