
Arrow Velox zero copy conversion

Wenlei Xie edited this page Nov 24, 2021 · 1 revision

Background

As the Arrow C Data Interface documentation (https://arrow.apache.org/docs/format/CDataInterface.html) points out:

For non-C/C++ languages and runtimes, it should be almost as easy to translate the C definitions into the corresponding C FFI declarations.

PyArrow CFFI: https://github.com/apache/arrow/blob/8e43f23dcc6a9e630516228f110c48b64d13cec6/python/pyarrow/cffi.py
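To illustrate what such a translation can look like, here is a minimal sketch that declares the two C Data Interface structs with Python's standard ctypes module. This is not the CFFI-based approach PyArrow uses in the file linked above; ctypes is swapped in only to keep the sketch dependency-free. The field order follows `struct ArrowSchema` and `struct ArrowArray` from the specification.

```python
import ctypes


class ArrowSchema(ctypes.Structure):
    """ctypes mirror of `struct ArrowSchema` from the Arrow C Data Interface."""


class ArrowArray(ctypes.Structure):
    """ctypes mirror of `struct ArrowArray` from the Arrow C Data Interface."""


# Fields are assigned after the class statements because both structs
# contain pointers to their own type.
ArrowSchema._fields_ = [
    ("format", ctypes.c_char_p),
    ("name", ctypes.c_char_p),
    ("metadata", ctypes.c_void_p),  # binary-encoded, so not treated as a C string
    ("flags", ctypes.c_int64),
    ("n_children", ctypes.c_int64),
    ("children", ctypes.POINTER(ctypes.POINTER(ArrowSchema))),
    ("dictionary", ctypes.POINTER(ArrowSchema)),
    ("release", ctypes.CFUNCTYPE(None, ctypes.POINTER(ArrowSchema))),
    ("private_data", ctypes.c_void_p),
]

ArrowArray._fields_ = [
    ("length", ctypes.c_int64),
    ("null_count", ctypes.c_int64),
    ("offset", ctypes.c_int64),
    ("n_buffers", ctypes.c_int64),
    ("n_children", ctypes.c_int64),
    ("buffers", ctypes.POINTER(ctypes.c_void_p)),
    ("children", ctypes.POINTER(ctypes.POINTER(ArrowArray))),
    ("dictionary", ctypes.POINTER(ArrowArray)),
    ("release", ctypes.CFUNCTYPE(None, ctypes.POINTER(ArrowArray))),
    ("private_data", ctypes.c_void_p),
]

# A struct's address is an ordinary Python int, and the same memory can be
# viewed again from that int without copying.
arr = ArrowArray()
arr.length = 3
ptr = ctypes.addressof(arr)
view = ArrowArray.from_address(ptr)
```

Here `view` and `arr` share the same memory, which is the property the zero-copy handoff relies on.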

Example CFFI Code in PyArrow

CFFI test in pyarrow. See

CFFI doc: https://cffi.readthedocs.io/en/latest/

Example notebook: https://gist.github.com/wesm/d48908018c4b7a0d9789a31d10caf525

Example Workflow

(Taking IColumn.to_arrow() as an example.)

  • (C++) Calls the Velox Vector -> ArrowArray* conversion
  • (C++) Converts the ArrowArray* into a uintptr_t
  • (C++ -> Python) Returns the C++ uintptr_t to Python as an int
  • (Python) Calls pa.Array._import_from_c(ptr_array, typ) to obtain the pyarrow Array
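The Python end of this workflow can be sketched as follows. Since the Velox C++ side is not shown on this page, the hypothetical helper `produce_array_pointer` stands in for steps 1 to 3 by letting pyarrow itself fill an `ArrowArray` struct via `_export_to_c`; only the final `pa.Array._import_from_c(ptr_array, typ)` call is taken from the steps above. The sketch assumes a 64-bit platform and skips the round trip if pyarrow is not installed.

```python
import ctypes

try:
    import pyarrow as pa  # optional: only needed for the export/import calls
except ImportError:
    pa = None

# struct ArrowArray holds five int64 fields and five pointers, so ten 8-byte
# slots are enough on a 64-bit platform (an assumption of this sketch).
ARROW_ARRAY_SLOTS = 10


def produce_array_pointer(values, typ):
    """Hypothetical stand-in for steps 1-3: fill an ArrowArray struct and
    return its address as a Python int, as the C++ side would."""
    storage = (ctypes.c_int64 * ARROW_ARRAY_SLOTS)()  # zero-initialized struct memory
    ptr_array = ctypes.addressof(storage)  # plays the role of the uintptr_t
    pa.array(values, type=typ)._export_to_c(ptr_array)
    return ptr_array, storage  # storage must stay alive until the import


def round_trip(values):
    """Export values to a raw pointer, then import them back (step 4)."""
    if pa is None:
        return None  # pyarrow not available; nothing to demonstrate
    typ = pa.int64()
    ptr_array, storage = produce_array_pointer(values, typ)
    # Step 4: the importer takes ownership of the struct contents and
    # wraps the existing buffers without copying them.
    return pa.Array._import_from_c(ptr_array, typ)


result = round_trip([1, 2, 3])
```

In the real workflow the struct memory is owned by the C++ side, which has to keep it valid until Python has called `_import_from_c`.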