[BUG] cudf::make_column_from_scalar on list overflow throws OOM not overflow #13833

Closed
revans2 opened this issue Aug 8, 2023 · 0 comments · Fixed by #13841
revans2 commented Aug 8, 2023

Describe the bug
#12885 and #12925 indicate that an overflow exception should be thrown, but when I overflow on a LIST instead of a STRING of the same size I get an out of memory error.

  using FCW = cudf::test::fixed_width_column_wrapper<int8_t>;

  // 10 elements per row x 214748365 rows exceeds INT32_MAX (2147483647)
  auto s   = cudf::make_list_scalar(FCW({1, 2, 3, 4, 5, 6, 7, 8, 9, 10}));
  auto col = cudf::make_column_from_scalar(*s, 214748365);

The scary part is that the OOM happens because the size calculation was done as an int and overflowed, so we tried to allocate a very large negative size. If I overflow twice, the allocation succeeds, but I end up writing data outside of what was actually allocated.

  using FCW = cudf::test::fixed_width_column_wrapper<int8_t>;

  // 20 elements per row doubles the total, so a 32-bit size wraps around to a small positive value
  auto s   = cudf::make_list_scalar(FCW({1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20}));
  auto col = cudf::make_column_from_scalar(*s, 214748365);
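
The arithmetic behind the two reproducers can be checked without cudf. The sketch below assumes the child size is computed in a 32-bit signed integer (`cudf::size_type` is `int32_t`) and that narrowing wraps in two's complement, which is guaranteed from C++20 and is what mainstream compilers do before that. `exact_total` and `wrapped_total` are hypothetical helper names for illustration, not cudf functions.

```cpp
#include <cstdint>

// Hypothetical helper (not part of cudf): the true number of child elements,
// computed in 64 bits so it cannot wrap for these inputs.
constexpr std::int64_t exact_total(std::int64_t list_length, std::int64_t rows)
{
  return list_length * rows;
}

// Hypothetical helper (not part of cudf): the value a 32-bit signed size would
// end up holding, assuming two's-complement wraparound on narrowing.
constexpr std::int32_t wrapped_total(std::int64_t list_length, std::int64_t rows)
{
  return static_cast<std::int32_t>(
    static_cast<std::uint32_t>(exact_total(list_length, rows)));
}
```

With 10 elements per row, `exact_total(10, 214748365)` is 2147483650, which wraps to -2147483646 and would explain the failed allocation. With 20 elements per row, the exact total is 4294967300, which wraps to 4, so an allocation sized from that value succeeds while the writes still cover the full range.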
revans2 added the bug, Needs Triage, and Spark labels on Aug 8, 2023
davidwendt self-assigned this on Aug 9, 2023
rapids-bot bot pushed a commit that referenced this issue Aug 14, 2023
… utility (#13841)

Internal lists functions `make_lists_column_from_scalar` (used by `make_column_from_scalar`) and `generate_list_offsets_and_validities` (used by `concatenate_list_elements`) are updated to use the `make_offsets_child_column` utility to build the offsets from sizes. This utility handles `size_type` overflow when computing an offsets column in a consistent way (i.e. throwing `std::overflow_error` appropriately).
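
As a rough illustration of the idea described above (building offsets from sizes and throwing `std::overflow_error` when the total exceeds `size_type`), here is a host-only sketch. It is not the cudf implementation of `make_offsets_child_column`, which operates on device data; `offsets_from_sizes` and the local `size_type` alias are assumptions made for this example.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

using size_type = std::int32_t;  // assumption: mirrors cudf::size_type

// Hypothetical host-side helper, not a cudf API: turns per-row sizes into an
// offsets vector with one extra leading zero, checking the running total in
// 64 bits before narrowing it back to size_type.
std::vector<size_type> offsets_from_sizes(std::vector<size_type> const& sizes)
{
  std::vector<size_type> offsets;
  offsets.reserve(sizes.size() + 1);
  offsets.push_back(0);

  std::int64_t running = 0;  // wide accumulator, so the check itself cannot wrap
  for (auto const s : sizes) {
    running += s;
    if (running > std::numeric_limits<size_type>::max()) {
      throw std::overflow_error("total size exceeds size_type limit");
    }
    offsets.push_back(static_cast<size_type>(running));
  }
  return offsets;
}
```

For example, sizes `{3, 0, 2}` give offsets `{0, 3, 3, 5}`, while sizes `{2000000000, 2000000000}` throw instead of producing a wrapped offset.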

Closes #13833

Authors:
  - David Wendt (https://github.com/davidwendt)

Approvers:
  - Nghia Truong (https://github.com/ttnghia)
  - Bradley Dice (https://github.com/bdice)
  - Robert (Bobby) Evans (https://github.com/revans2)

URL: #13841
bdice removed the Needs Triage label on Mar 4, 2024