Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add to_arrow_device function to cudf interop using nanoarrow #15047

Merged

Conversation

zeroshade
Copy link
Contributor

Description

Introduce new to_arrow_device and to_arrow_schema functions to utilize the ArrowDeviceArray structure for zero-copy passing of libcudf::table.

Add nanoarrow as a vendored lib and a script to update it.

Initial step towards addressing #14926

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

Copy link

copy-pr-bot bot commented Feb 13, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions github-actions bot added libcudf Affects libcudf (C++/CUDA) code. CMake CMake build issue labels Feb 13, 2024
Copy link

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few nanoarrow notes to take or leave!

cpp/src/interop/to_arrow_device.cu Outdated Show resolved Hide resolved
cpp/src/interop/to_arrow_device.cu Outdated Show resolved Hide resolved
cpp/src/interop/to_arrow_device.cu Outdated Show resolved Hide resolved
Comment on lines 534 to 536
ArrowArrayInitFromType(tmp.get(), NANOARROW_TYPE_STRUCT);

ArrowArrayAllocateChildren(tmp.get(), table.num_columns());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may also want to use ArrowArrayInitFromSchema(), which will initialize the whole tree of arrays up front. This might simplify some later code and lets ArrowArrayFinish() with a validation level do its thing. I might be slower (I haven't checked).

cpp/src/interop/to_arrow_device.cu Outdated Show resolved Hide resolved
@vyasr
Copy link
Contributor

vyasr commented Feb 14, 2024

JFYI I'll be on vacation for the next week or so. Feel free to proceed without my input, just letting y'all know that I will catch up when I get back.

@GregoryKimball
Copy link
Contributor

@zeroshade thank you for building out this proposal. We will review and provide feedback on the draft. 🙏

@zeroshade
Copy link
Contributor Author

Would it be possible to get some more reviews here? I've been attempting to keep this up to date but I haven't dug too much further into building this until I get some confirmation that everyone is okay with this approach and that it is worthwhile to continue in this pattern versus refactoring or changing the approach.

@GregoryKimball
Copy link
Contributor

Thank you @zeroshade for keeping this up to date. At the moment we are juggling GTC-related activities so please excuse the delay in reviews.

One piece of feedback from the team is that we would prefer to pull in the nanoarrow headers at build time rather than adding the file to the source tree.

Copy link
Contributor

@vyasr vyasr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the slow response. I came back from vacation to a long backlog and too much to juggle, and I didn't want to give this PR a cursory review. I spent a decent chunk of my review reading the nanoarrow API docs to make sure I understood the code here so I may have missed some things in the code, but overall I think this PR is a great start! The approach looks good and I think we should move forward.

I have a couple of bigger picture questions:

  1. I'm wondering how we want this to interoperate with the existing arrow interop in libcudf. In order to get libcudf working more directly off the C Data interface, do you think it would be possible to rewrite our existing to_arrow as basically (pseudocode) cudaMemcpy(to_arrow_device(table), DtoH)?
  2. Possibly also a question for @paleolimbot: is there any plan for working with scalars? Scalars are a bit of thorn in our side right now, especially in the Python/C++ interop layer because of how many edge cases run into issues with different pyarrow APIs. Do you foresee a way to rewrite our scalar interop in terms of the C Data interface/nanoarrow functions coming at some point? I would absolutely settle for a solution that boils down to "python scalar->python array->C++ array->C++ scalar" and in reverse (since I don't imagine there are imminent plans to add scalars directly to the interface), but a big part of why I wrote Enable direct ingestion and production of Arrow scalars #14121 was because even that first step "python scalar->python array" was lossy and sometimes raised errors in pyarrow. I know there are some nascent nanoarrow Python bindings, so maybe that's what we want to move towards in the long term.

As per Greg's comment, I think it would be good to get the nanoarrow files at build time. We typically do that with CPM.cmake via rapids-cmake. I'm not sure how familiar you are with that approach, but there are some good examples in rapids-cmake. The rmm and gtest are good starting points that don't have too many additional complexities baked in. That said, if that CMake is unfamiliar to you don't worry about it, we can change the vendoring strategies at a later point once the PR is more mature.

cpp/include/cudf/interop.hpp Outdated Show resolved Hide resolved
cpp/include/cudf/interop.hpp Outdated Show resolved Hide resolved
cpp/include/cudf/interop.hpp Outdated Show resolved Hide resolved
cpp/src/interop/to_arrow_device.cu Show resolved Hide resolved
cpp/src/interop/to_arrow_device.cu Outdated Show resolved Hide resolved
cpp/src/interop/to_arrow_device.cu Outdated Show resolved Hide resolved
cpp/src/interop/to_arrow_device.cu Outdated Show resolved Hide resolved
cpp/src/interop/to_arrow_device.cu Outdated Show resolved Hide resolved
cpp/src/interop/to_arrow_device.cu Outdated Show resolved Hide resolved
cpp/src/interop/to_arrow_device.cu Outdated Show resolved Hide resolved
@vyasr
Copy link
Contributor

vyasr commented Mar 27, 2024

Yup unrelated. We pushed new CI images containing a conda update that removed a CLI option we relied on. We've temporarily downgraded and rebuilt our images to unblock things, but also the PR David linked is the long-term fix so that we can upgrade again.

Hmm, if we can no longer pass rvalues to the API, what actually is the benefit of using the span over a const vector by reference?

That's a fair question. When the contained type is itself easily brace-initialized like in this case I believe you can get the same rvalue-like behavior by adding a second set of braces, which I believe will invoke the initializer list constructor of a std::array and then convert that to a span (the array's lifetime should persist for the length of the function call). Here's a sample. In general the benefit of a span is that you can use any container you want instead of being constrained to use a vector, but I assume you know that and you're asking for this specific API whether it's realistically important for users to be able to use other containers, and whether or not the typical usage pattern will just be via rvalues. I don't have a great answer for this API; it may not really be that important in practice and becomes just a matter of C++ hygiene.

@vyasr
Copy link
Contributor

vyasr commented Mar 27, 2024

/ok to test

Copy link
Contributor

@vyasr vyasr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for the hard work here @zeroshade! I'm approving assuming you don't plan to touch from_arrow_device in this PR (which I'm totally on board with, I think this PR is large and complicated enough as is).

@zeroshade
Copy link
Contributor Author

@vyasr yea, I plan on trying to tackle from_arrow_device in a follow-up PR

@vyasr
Copy link
Contributor

vyasr commented Mar 28, 2024

Just a heads-up, NVIDIA employees in the US are off today and tomorrow so you probably won't see the next review until Monday.

cpp/include/cudf/interop.hpp Outdated Show resolved Hide resolved
cpp/include/cudf/interop.hpp Outdated Show resolved Hide resolved
cpp/include/cudf/interop.hpp Outdated Show resolved Hide resolved
cpp/tests/interop/to_arrow_device_test.cpp Outdated Show resolved Hide resolved
zeroshade and others added 6 commits April 1, 2024 10:38
Co-authored-by: David Wendt <45795991+davidwendt@users.noreply.github.com>
Co-authored-by: David Wendt <45795991+davidwendt@users.noreply.github.com>
Co-authored-by: David Wendt <45795991+davidwendt@users.noreply.github.com>
Co-authored-by: David Wendt <45795991+davidwendt@users.noreply.github.com>
@vyasr
Copy link
Contributor

vyasr commented Apr 1, 2024

/ok to test

@zeroshade zeroshade requested a review from davidwendt April 1, 2024 18:05
Co-authored-by: David Wendt <45795991+davidwendt@users.noreply.github.com>
@davidwendt
Copy link
Contributor

/ok to test

@vyasr
Copy link
Contributor

vyasr commented Apr 1, 2024

/merge

@rapids-bot rapids-bot bot merged commit 268996a into rapidsai:branch-24.06 Apr 1, 2024
73 checks passed
@vyasr
Copy link
Contributor

vyasr commented Apr 1, 2024

Thanks @zeroshade!

@jlowe
Copy link
Contributor

jlowe commented Apr 2, 2024

This appears to have broken libcudf static library builds. See #15434

@zeroshade zeroshade deleted the arrow-device-array-nanoarrow branch April 2, 2024 15:55
rapids-bot bot pushed a commit that referenced this pull request Apr 5, 2024
Fixes debug build failures resulting from changes from #15047. Here are some of the errors reported by the compiler:
```
Building CXX object tests/CMakeFiles/INTEROP_TEST.dir/interop/to_arrow_device_test.cpp.o
FAILED: tests/CMakeFiles/INTEROP_TEST.dir/interop/to_arrow_device_test.cpp.o 
/usr/local/bin/g++ -DFMT_HEADER_ONLY=1 -DLIBCUDACXX_ENABLE_EXPERIMENTAL_MEMORY_RESOURCE -DNANOARROW_DEBUG -DSPDLOG_ACTIVE_LEVEL=SPDLOG_LEVEL_INFO -DSPDLOG_FMT_EXTERNAL -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_CUDA -DTHRUST_DISABLE_ABI_NAMESPACE -DTHRUST_HOST_SYSTEM=THRUST_HOST_SYSTEM_CPP -DTHRUST_IGNORE_ABI_NAMESPACE_ERROR -I/cudf/cpp -I/cudf/cpp/src -I/cudf/cpp/build/_deps/dlpack-src/include -I/cudf/cpp/build/_deps/jitify-src -I/cudf/cpp/include -I/cudf/cpp/build/include -I/cudf/cpp/build/_deps/cccl-src/thrust/thrust/cmake/../.. -I/cudf/cpp/build/_deps/cccl-src/libcudacxx/lib/cmake/libcudacxx/../../../include -I/cudf/cpp/build/_deps/cccl-src/cub/cub/cmake/../.. -I/cudf/cpp/build/_deps/nvtx3-src/c/include -I/cudf/cpp/build/_deps/nanoarrow-src/src -I/cudf/cpp/build/_deps/nanoarrow-build/generated -fvisibility-inlines-hidden -fmessage-length=0 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /conda/envs/rapids/include -fdiagnostics-color=always  -I/conda/envs/rapids/targets/x86_64-linux/include  -L/conda/envs/rapids/targets/x86_64-linux/lib -L/conda/envs/rapids/targets/x86_64-linux/lib/stubs -g -std=gnu++17 -fPIE -Wall -Werror -Wno-unknown-pragmas -Wno-error=deprecated-declarations -pthread -MD -MT tests/CMakeFiles/INTEROP_TEST.dir/interop/to_arrow_device_test.cpp.o -MF tests/CMakeFiles/INTEROP_TEST.dir/interop/to_arrow_device_test.cpp.o.d -o tests/CMakeFiles/INTEROP_TEST.dir/interop/to_arrow_device_test.cpp.o -c /cudf/cpp/tests/interop/to_arrow_device_test.cpp
In file included from /cudf/cpp/tests/interop/to_arrow_device_test.cpp:17:
/cudf/cpp/tests/interop/nanoarrow_utils.hpp: In function 'void populate_list_from_col(ArrowArray*, cudf::lists_column_view)':
/cudf/cpp/tests/interop/nanoarrow_utils.hpp:220:26: error: ignoring return value of 'ArrowErrorCode ArrowBufferSetAllocator(ArrowBuffer*, ArrowBufferAllocator)' declared with attribute 'warn_unused_result' [-Werror=unused-result]
  220 |   ArrowBufferSetAllocator(ArrowArrayBuffer(arr, 0), noop_alloc);
      |   ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/cudf/cpp/tests/interop/nanoarrow_utils.hpp:224:26: error: ignoring return value of 'ArrowErrorCode ArrowBufferSetAllocator(ArrowBuffer*, ArrowBufferAllocator)' declared with attribute 'warn_unused_result' [-Werror=unused-result]
  224 |   ArrowBufferSetAllocator(ArrowArrayBuffer(arr, 1), noop_alloc);
      |   ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/cudf/cpp/tests/interop/to_arrow_device_test.cpp: In member function 'void BaseArrowFixture::compare_arrays(const ArrowSchema*, const ArrowArray*, const ArrowArray*)':
/cudf/cpp/tests/interop/to_arrow_device_test.cpp:268:24: error: ignoring return value of 'ArrowErrorCode cudfArrowSchemaViewInit(ArrowSchemaView*, const ArrowSchema*, ArrowError*)' declared with attribute 'warn_unused_result' [-Werror=unused-result]
  268 |     ArrowSchemaViewInit(&schema_view, schema, nullptr);
/cudf/cpp/tests/interop/to_arrow_device_test.cpp: In member function 'virtual void ToArrowDeviceTest_DateTimeTable_Test::TestBody()':
/cudf/cpp/tests/interop/to_arrow_device_test.cpp:353:27: error: ignoring return value of 'ArrowErrorCode cudfArrowSchemaSetTypeStruct(ArrowSchema*, int64_t)' declared with attribute 'warn_unused_result' [-Werror=unused-result]
  353 |   ArrowSchemaSetTypeStruct(expected_schema.get(), 1);
/cudf/cpp/tests/interop/to_arrow_device_test.cpp:355:29: error: ignoring return value of 'ArrowErrorCode cudfArrowSchemaSetTypeDateTime(ArrowSchema*, ArrowType, ArrowTimeUnit, const char*)' declared with attribute 'warn_unused_result' [-Werror=unused-result]

(many more)
```
Warning are turned into errors so the build fails.
Fix simply adds the `NANOARROW_THROW_NOT_OK` to the offending calls.

Authors:
  - David Wendt (https://github.com/davidwendt)

Approvers:
  - Vukasin Milovanovic (https://github.com/vuule)
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Mike Wilson (https://github.com/hyperbolic2346)

URL: #15463
rapids-bot bot pushed a commit that referenced this pull request Apr 23, 2024
)

Adding a corresponding `from_arrow_device` function following up from #15047. This continues the work towards addressing #14926.

Authors:
  - Matt Topol (https://github.com/zeroshade)
  - Vyas Ramasubramani (https://github.com/vyasr)
  - David Wendt (https://github.com/davidwendt)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - David Wendt (https://github.com/davidwendt)

URL: #15458
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CMake CMake build issue feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants