Arrow Velox zero copy conversion
Wenlei Xie edited this page Nov 24, 2021 · 1 revision
As pointed out in the Arrow C Data Interface documentation (https://arrow.apache.org/docs/format/CDataInterface.html):
"For non-C/C++ languages and runtimes, it should be almost as easy to translate the C definitions into the corresponding C FFI declarations."
PyArrow CFFI: https://github.com/apache/arrow/blob/8e43f23dcc6a9e630516228f110c48b64d13cec6/python/pyarrow/cffi.py
CFFI test in pyarrow. See
- https://github.com/apache/arrow/blob/5ead37593472c42f61c76396dde7dcb8954bde70/python/pyarrow/tests/test_cffi.py#L162-L163
- https://github.com/apache/arrow/blob/5ead37593472c42f61c76396dde7dcb8954bde70/python/pyarrow/tests/test_cffi.py#L176
- Note that ptr_array is essentially an ArrowArray* cast to uintptr_t (represented as a Python int)
CFFI doc: https://cffi.readthedocs.io/en/latest/
Example notebook: https://gist.github.com/wesm/d48908018c4b7a0d9789a31d10caf525
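To make the "pointer as a Python int" idea concrete, here is a minimal sketch that uses only the standard-library ctypes module instead of CFFI (a substitution made purely for illustration). The ArrowArray field layout follows the struct definition in the Arrow C Data Interface specification. The values written into the struct are made up, and no real Arrow buffers are attached.

```python
import ctypes


class ArrowArray(ctypes.Structure):
    """ctypes translation of `struct ArrowArray` from the Arrow C Data Interface."""


# Fields are assigned after the class exists because the struct refers to itself.
ArrowArray._fields_ = [
    ("length", ctypes.c_int64),
    ("null_count", ctypes.c_int64),
    ("offset", ctypes.c_int64),
    ("n_buffers", ctypes.c_int64),
    ("n_children", ctypes.c_int64),
    ("buffers", ctypes.POINTER(ctypes.c_void_p)),
    ("children", ctypes.POINTER(ctypes.POINTER(ArrowArray))),
    ("dictionary", ctypes.POINTER(ArrowArray)),
    ("release", ctypes.CFUNCTYPE(None, ctypes.POINTER(ArrowArray))),
    ("private_data", ctypes.c_void_p),
]

# Illustrative values only; a real producer would also set buffers and release.
c_array = ArrowArray()
c_array.length = 3
c_array.null_count = 1

# The address of the struct is a plain Python int, playing the role of ptr_array.
ptr_array = ctypes.addressof(c_array)

# Casting the int back to a pointer gives a view onto the same memory, not a copy.
view = ctypes.cast(ptr_array, ctypes.POINTER(ArrowArray)).contents
view.offset = 7  # written through the view, visible through c_array
```

Because only the address crosses the boundary, the consumer reads the very struct the producer filled in. The owner of c_array must keep it alive for as long as the integer address is in use.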
Taking IColumn.to_arrow() as an example:
- (C++) Calls the Velox Vector -> ArrowArray* conversion
- (C++) Converts the ArrowArray* into a uintptr_t
- (C++ -> Python) Returns the C++ uintptr_t to Python as an int
- (Python) Calls pa.Array._import_from_c(ptr_array, typ) to get the pyarrow Array