GH-38738: [C++] Check variadic buffer counts in bounds #38740

Merged 3 commits on Nov 27, 2023
15 changes: 15 additions & 0 deletions cpp/CMakePresets.json
@@ -430,6 +430,21 @@
],
"displayName": "Benchmarking build with with everything enabled",
"cacheVariables": {}
+ },
+ {
+ "name": "fuzzing",
+ "inherits": "base",
+ "displayName": "Debug build with IPC and Parquet fuzzing targets",
+ "cacheVariables": {
+ "CMAKE_BUILD_TYPE": "Debug",
+ "CMAKE_C_COMPILER": "clang",
+ "CMAKE_CXX_COMPILER": "clang++",
+ "ARROW_USE_ASAN": "ON",
+ "ARROW_USE_UBSAN": "ON",
+ "ARROW_IPC": "ON",
+ "ARROW_PARQUET": "ON",
+ "ARROW_FUZZING": "ON"
+ }
}
]
}
13 changes: 9 additions & 4 deletions cpp/src/arrow/ipc/reader.cc
@@ -254,7 +254,12 @@ class ArrayLoader {
if (i >= static_cast<int>(variadic_counts->size())) {
return Status::IOError("variadic_count_index out of range.");
}
- return static_cast<size_t>(variadic_counts->Get(i));
+ int64_t count = variadic_counts->Get(i);
+ if (count < 0 || count > std::numeric_limits<int32_t>::max()) {
+ return Status::IOError(
+ "variadic_count must be representable as a positive int32_t, got ", count, ".");
+ }
+ return static_cast<size_t>(count);
}

Status GetFieldMetadata(int field_index, ArrayData* out) {
@@ -372,10 +377,10 @@ class ArrayLoader {
RETURN_NOT_OK(LoadCommon(type.id()));
RETURN_NOT_OK(GetBuffer(buffer_index_++, &out_->buffers[1]));

- ARROW_ASSIGN_OR_RAISE(auto character_buffer_count,
+ ARROW_ASSIGN_OR_RAISE(auto data_buffer_count,
GetVariadicCount(variadic_count_index_++));
- out_->buffers.resize(character_buffer_count + 2);
- for (size_t i = 0; i < character_buffer_count; ++i) {
+ out_->buffers.resize(data_buffer_count + 2);
+ for (size_t i = 0; i < data_buffer_count; ++i) {
RETURN_NOT_OK(GetBuffer(buffer_index_++, &out_->buffers[i + 2]));
}
return Status::OK();
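
The range check added to `GetVariadicCount` matters because the returned value feeds straight into `buffers.resize(count + 2)`: a negative `int64_t` cast to `size_t` becomes an enormous value. The following is a minimal standalone sketch of the same validation, not the Arrow implementation. `CountResult` and `CheckedVariadicCount` are hypothetical names standing in for `arrow::Result<size_t>` and the loader method, and a plain `std::vector` stands in for the Flatbuffers vector.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for arrow::Result<size_t>: a value or an error message.
struct CountResult {
  std::optional<std::size_t> value;
  std::string error;
  bool ok() const { return value.has_value(); }
};

// Validate the i-th count read from untrusted input before it is used as a size.
CountResult CheckedVariadicCount(const std::vector<std::int64_t>& counts, int i) {
  // Reject an index outside the vector.
  if (i < 0 || static_cast<std::size_t>(i) >= counts.size()) {
    return {std::nullopt, "variadic_count_index out of range."};
  }
  const std::int64_t count = counts[static_cast<std::size_t>(i)];
  // Reject negative counts and counts that do not fit in int32_t;
  // zero is accepted, matching the `count < 0` test in the diff.
  if (count < 0 || count > std::numeric_limits<std::int32_t>::max()) {
    return {std::nullopt,
            "variadic_count out of int32_t range, got " + std::to_string(count) + "."};
  }
  // The cast is safe only because the range was checked above.
  return {static_cast<std::size_t>(count), ""};
}
```

A caller would test `ok()` before resizing any buffer vector, so a corrupted or malicious count surfaces as an error rather than as an oversized allocation.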
24 changes: 16 additions & 8 deletions docs/source/developers/cpp/fuzzing.rst
@@ -36,9 +36,9 @@ areas ingesting potentially invalid or malicious data.
Fuzz Targets and Utilities
==========================

- By passing the ``-DARROW_FUZZING=ON`` CMake option, you will build
- the fuzz targets corresponding to the aforementioned Arrow features, as well
- as additional related utilities.
+ By passing the ``-DARROW_FUZZING=ON`` CMake option (or equivalently, using
+ the ``fuzzing`` preset), you will build the fuzz targets corresponding to
+ the aforementioned Arrow features, as well as additional related utilities.

Generating the seed corpus
--------------------------
@@ -85,11 +85,7 @@ various sanitizer checks enabled.

.. code-block::

- $ cmake .. -GNinja \
- -DCMAKE_BUILD_TYPE=Debug \
- -DARROW_USE_ASAN=on \
- -DARROW_USE_UBSAN=on \
- -DARROW_FUZZING=on
+ $ cmake .. --preset=fuzzing

Then, assuming you have downloaded the crashing data file (let's call it
``testcase-arrow-ipc-file-fuzz-123465``), you can reproduce the crash
@@ -101,3 +97,15 @@ by running the affected fuzz target on that file:

(you may want to run that command under a debugger so as to inspect the
program state more closely)
+
+ Using conda
+ -----------
+
+ The fuzzing executables must be compiled with clang and linked to libraries
+ which provide a fuzzing runtime. If you are using conda to provide your
+ dependencies, you may need to install these before building the fuzz targets:
+
+ .. code-block::
+
+ $ conda install clang clangxx compiler-rt
+ $ cmake .. --preset=fuzzing
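
For context on why a fuzzing runtime is needed: a libFuzzer-style target defines only an entry point, and the runtime linked in by clang supplies `main` and generates the inputs. The sketch below is a hypothetical target, not one of Arrow's fuzz targets. `ParseUntrusted` is an invented stand-in for the code under test, and it assumes the input starts with an 8-byte count in host byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical code under test: reads a count from the start of the input and
// rejects malformed data instead of crashing. Returns true if the input is accepted.
bool ParseUntrusted(const std::uint8_t* data, std::size_t size) {
  std::int64_t count = 0;
  if (size < sizeof(count)) return false;    // too short to hold a count
  std::memcpy(&count, data, sizeof(count));  // memcpy avoids unaligned reads
  return count >= 0 && count <= std::numeric_limits<std::int32_t>::max();
}

// Entry point that the fuzzing runtime calls repeatedly with generated inputs.
extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* data, std::size_t size) {
  // The result is ignored: only crashes and sanitizer reports count as findings.
  ParseUntrusted(data, size);
  return 0;
}
```

Built standalone, a file like this would typically be compiled with something like `clang++ -g -fsanitize=fuzzer,address`; the Arrow build wires up the equivalent settings through the `fuzzing` preset.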