Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Soundness: Creating GenericStringArray from ArrayData does not perform bound checks on reading offsets #776

Closed
jorgecarleitao opened this issue Sep 14, 2021 · 2 comments

Comments

@jorgecarleitao
Copy link
Member

jorgecarleitao commented Sep 14, 2021

Equivalent to #773 for the GenericStringArray.

use arrow::array::*;
use arrow::buffer::*;
use arrow::datatypes::*;

fn main() {
    let data = ArrayData::new(
        DataType::Utf8,
        10,
        None,
        None,
        0,
        vec![
            Buffer::from_slice_ref(&[0i32, 1]),
            Buffer::from_slice_ref(&[0b00100100u8]),
        ],
        vec![],
    );
    let a = StringArray::from(data);
    let b = a.value(1);
    println!("{}", b);
}
error: Undefined Behavior: using uninitialized data, but this operation requires initialized memory
   --> /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/arith.rs:213:1
    |
213 | sub_impl! { usize u8 u16 u32 u64 u128 isize i8 i16 i32 i64 i128 f32 f64 }
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ using uninitialized data, but this operation requires initialized memory
    |
    = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
    = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
            
    = note: inside `<i32 as std::ops::Sub>::sub` at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/arith.rs:206:45
    = note: inside `arrow::array::GenericStringArray::<i32>::value` at /home/azureuser/projects/arrow-rs/arrow/src/array/array_string.rs:121:17
note: inside `main` at arrow/examples/unsafe.rs:19:13
   --> arrow/examples/unsafe.rs:19:13
    |
19  |     let b = a.value(1);
    |             ^^^^^^^^^^
    = note: inside `<fn() as std::ops::FnOnce<()>>::call_once - shim(fn())` at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:227:5
    = note: inside `std::sys_common::backtrace::__rust_begin_short_backtrace::<fn(), ()>` at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:125:18
    = note: inside closure at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:63:18
    = note: inside `std::ops::function::impls::<impl std::ops::FnOnce<()> for &dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe>::call_once` at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:259:13
    = note: inside `std::panicking::r#try::do_call::<&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe, i32>` at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:401:40
    = note: inside `std::panicking::r#try::<i32, &dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe>` at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:365:19
    = note: inside `std::panic::catch_unwind::<&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe, i32>` at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:434:14
    = note: inside closure at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:45:48
    = note: inside `std::panicking::r#try::do_call::<[closure@std::rt::lang_start_internal::{closure#2}], isize>` at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:401:40
    = note: inside `std::panicking::r#try::<isize, [closure@std::rt::lang_start_internal::{closure#2}]>` at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:365:19
    = note: inside `std::panic::catch_unwind::<[closure@std::rt::lang_start_internal::{closure#2}], isize>` at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:434:14
    = note: inside `std::rt::lang_start_internal` at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:45:20
    = note: inside `std::rt::lang_start::<()>` at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:62:5
    = note: this error originates in the macro `sub_impl` (in Nightly builds, run with -Z macro-backtrace for more info)

error: aborting due to previous error
@jorgecarleitao jorgecarleitao changed the title Soundness: GenericUtf8Array does not perform bound checks on reading offsets Soundness: GenericStringArray does not perform bound checks on reading offsets Sep 29, 2021
@alamb alamb changed the title Soundness: GenericStringArray does not perform bound checks on reading offsets Soundness: Creating GenericStringArray from ArrayData does not perform bound checks on reading offsets Sep 29, 2021
@alamb
Copy link
Contributor

alamb commented Sep 29, 2021

Updating title of the ticket to make it clear this affects misusing the lower level APIs, as described in https://github.com/apache/arrow-rs/tree/master/arrow#safety

@alamb
Copy link
Contributor

alamb commented Oct 29, 2021

This is a specific case of the general issue described in #817, so closing this one as a duplicate in favor of the more general ticket.

@alamb alamb closed this as completed Oct 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants