Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Copying inappropriately aligned buffer in ipc reader #2883

Merged
merged 10 commits into from
Oct 16, 2022

Conversation

viirya
Copy link
Member

@viirya viirya commented Oct 16, 2022

Which issue does this PR close?

Closes #2882.

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

@github-actions github-actions bot added the arrow Changes to the arrow crate label Oct 16, 2022
@tustvold
Copy link
Contributor

I don't think this is correct, we rely on the memory within PrimitiveArray being correctly aligned?

.unwrap();
let array = Int32Array::from(array_data);
assert_eq!(array.len(), 1);
assert_eq!(array.value(0), 0);
Copy link
Contributor

@tustvold tustvold Oct 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now UB, as this violates the safety requirement in PrimitiveArray::values.

I'm not sure why MIRI isn't catching this...

Edit: array.value doesn't call array.values so this test isn't UB. If you add a call to array.values() in this test, MIRI will fail

@viirya
Copy link
Member Author

viirya commented Oct 16, 2022

I don't think this is correct, we rely on the memory within PrimitiveArray being correctly aligned?

As described in the ticket, for IPC reader where we share same memory and slice with offset and length for individual arrays. The alignment check applies on the entire memory allocation not slices. There will be unexpected alignment check error.

Copy link
Contributor

@tustvold tustvold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As written this PR is unsound, as PrimitiveArray::values requires the values pointer to be correctly aligned.

@@ -878,6 +878,7 @@ mod tests {
assert_eq!(array.len(), 2);
assert_eq!(array.value(0), 0);
assert_eq!(array.value(1), 0);
assert_eq!(array.values(), &[0, 0]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to call array.values() on the unaligned PrimitiveArray

@viirya
Copy link
Member Author

viirya commented Oct 16, 2022

As written this PR is unsound, as PrimitiveArray::values requires the values pointer to be correctly aligned.

The alignment soundness in PrimitiveArray::values is guaranteed by is_aligned_and_not_null. This is how it is implemented:

pub(crate) fn is_aligned_and_not_null<T>(ptr: *const T) -> bool {
    !ptr.is_null() && ptr.addr() % mem::align_of::<T>() == 0
}

It only checks if the ptr address is aligned with type T.

@tustvold
Copy link
Contributor

tustvold commented Oct 16, 2022

The alignment check applies on the entire memory allocation not slices.

The alignment of the entire memory allocation is irrelevant, we only care that that the buffers within it are correctly aligned. My memory is a bit fuzzy, but I seem to remember that the arrow specification goes to great lengths to document how data should be padded to guarantee alignment. This should mean that if the original allocation is aligned (something we should double-check we are actually guaranteeing), and the padding is correct, data can be zero-copy sliced - otherwise we have to copy.

let array = Int32Array::from(array_data);
assert_eq!(array.len(), 1);
assert_eq!(array.value(0), 0);
assert_eq!(array.values(), &[0]);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@viirya
Copy link
Member Author

viirya commented Oct 16, 2022

The alignment of the entire memory allocation is irrelevant, we only care that that the buffers within it are correctly aligned.

align_offset actually checks the entire memory allocation. That said we allocate a memory allocation in IPC reader for all buffers data. Then we slice it for offset/length for each buffers without copying. When we construct PrimitiveArray from one (sliced) buffer, align_offset checks if the entire memory allocation aligns with type T. And it is obviously not guaranteed to be.

@tustvold
Copy link
Contributor

tustvold commented Oct 16, 2022

And it is obviously not guaranteed to be.

It is supposed to be - https://arrow.apache.org/docs/format/Columnar.html#buffer-alignment-and-padding by ensuring the buffers are padded correctly, we can ensure that correctly aligning the entire memory allocation is sufficient to align the child allocations. When the writer has not done this correctly, we will have to copy the buffers or return an error.

The alignment soundness in PrimitiveArray::values is guaranteed by is_aligned_and_not_null.

Where is this called, I think I am being blind. It looks like this PR removes the alignment check in PrimitiveArray?

@viirya
Copy link
Member Author

viirya commented Oct 16, 2022

Where is this called, I think I am being blind. It looks like this PR removes the alignment check in PrimitiveArray?

Hmm? No, this PR doesn't remove it.

Let me quote what I saw for now:

pub fn values(&self) -> &[T::Native] {
        // Soundness
        //     raw_values alignment & location is ensured by fn from(ArrayDataRef)
        //     buffer bounds/offset is ensured by the ArrayData instance.
        unsafe {
            std::slice::from_raw_parts(
                self.raw_values.as_ptr().add(self.data.offset()),
                self.len(),
            )
        }
    }
pub const unsafe fn from_raw_parts<'a, T>(data: *const T, len: usize) -> &'a [T] {
    // SAFETY: the caller must uphold the safety contract for `from_raw_parts`.
    unsafe {
        assert_unsafe_precondition!(
            is_aligned_and_not_null(data)
                && crate::mem::size_of::<T>().saturating_mul(len) <= isize::MAX as usize
        );
        &*ptr::slice_from_raw_parts(data, len)
    }
}
pub(crate) fn is_aligned_and_not_null<T>(ptr: *const T) -> bool {
    !ptr.is_null() && ptr.addr() % mem::align_of::<T>() == 0
}

BTW, I'm on a M1 Macbook so the toolchain is stable-aarch64-apple-darwin. Maybe you will see something different?

@tustvold
Copy link
Contributor

assert_unsafe_precondition is only checked in debug builds, although something is off as I would expect this to fire with your test.

My understanding of the test is you are explicitly creating a raw_values pointer that is not aligned to T, which should then trigger the assert and MIRI to fail?

@@ -861,16 +861,42 @@ mod tests {
}

#[test]
#[should_panic(expected = "memory is not aligned")]
#[should_panic(expected = "Need at least 8 bytes in buffers[0]")]
Copy link
Contributor

@tustvold tustvold Oct 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is why MIRI isn't failing and the debug assertion isn't firing, the test panics before it does anything interesting... 🤦

@viirya
Copy link
Member Author

viirya commented Oct 16, 2022

It is supposed to be - https://arrow.apache.org/docs/format/Columnar.html#buffer-alignment-and-padding by ensuring the buffers are padded correctly, we can ensure that correctly aligning the entire memory allocation is sufficient to align the child allocations. When the writer has not done this correctly, we will have to copy the buffers or return an error.

How it can be?

That's said we allocate a 80 bytes memory allocation for all buffers containing 1 Int32Array (40 bytes) and and 1 Decimal128Array (32 bytes) and 1 Int32Array (8 bytes).

When we try to create array for the Decimal128Array, it takes the slice buffer offset by 40 to 80. Then it will fails at align_offset as it is not aligned with i128 (16 bytes).

@tustvold
Copy link
Contributor

tustvold commented Oct 16, 2022

How it can be?

Because the allocation containing all the buffers is aligned to at least 8 bytes, and all the contained buffers are padded to a multiple of 8 bytes in length, each buffer starts and ends at an 8 byte boundary.

As mentioned on the ticket, the issue appears to be that arm requires 16-byte alignment for i128 types, which isn't guaranteed by the standard which only mandates padding up to 8 bytes. As such we will need to copy the buffer to a new correctly aligned allocation in such a case. We could/should probably do this in general where the buffer is not sufficiently aligned for its type

@viirya
Copy link
Member Author

viirya commented Oct 16, 2022

Because the allocation containing all the buffers is aligned to at least 8 bytes, and all the contained buffers are padded to a multiple of 8 bytes in length, each buffer starts and ends at an 8 byte boundary.

Currently padding only guarantees 8 bytes alignment. Any larger alignment requirement can just fail the check.

I'm not sure if the official doc explicitly asks for a 8 bytes alignment. For example,

Implementations are recommended to allocate memory on aligned addresses (multiple of 8- or 64-bytes) and pad (overallocate) to a length that is a multiple of 8 or 64 bytes.

and for IPC,

The body, a flat sequence of memory buffers written end-to-end with appropriate padding to ensure a minimum of 8-byte alignment

It sounds like it can be any alignment larger than 8 bytes. Maybe we can change to 16 bytes alignment?

As mentioned on the ticket, the issue appears to be that arm requires 16-byte alignment for i128 types, which isn't guaranteed by the standard which only mandates padding up to 8 bytes. As such we will need to copy the buffer to a new correctly aligned allocation in such a case. We could/should probably do this in general where the allocation is not sufficiently aligned.

Hmm, for now this sounds like a special case (DecimalArray + arm). I'm okay for the copying approach. I will modify this.

@tustvold
Copy link
Contributor

tustvold commented Oct 16, 2022

It sounds like it can be any alignment larger than 8 bytes.

Correct

Maybe we can change to 16 bytes alignment?

For buffers allocated by arrow-rs we use larger alignments (32 bytes on arm, 128 bytes on x86) - see https://github.com/apache/arrow-rs/blob/master/arrow-buffer/src/alloc/alignment.rs. I presume this carries across to IPC files we write, but I have not verified this.

The issue is whatever wrote the test file in the ticket was only using the minimum 8 byte padding, and so we need to copy in such cases

Edit: It would appear we also only write with an alignment of 8 bytes, we should change this.

Hmm, for now this sounds like a special case (DecimalArray + arm)

I think it will also impact IntervalMonthDayNanoType which also uses i128

I'm okay for the copying approach

TBC we should only copy as a fallback for when the buffer is not sufficiently aligned. We could probably do this in the general case, it is better than panicking.

tustvold added a commit to tustvold/arrow-rs that referenced this pull request Oct 16, 2022
@viirya viirya changed the title Skip memory alignment check when constructing PrimitiveArray from an array data reference Copying inappropriately aligned buffer in ipc reader Oct 16, 2022
tustvold added a commit that referenced this pull request Oct 16, 2022
* Increase default IPC alignment to 64 (#2883)

* Update test

/// Calculate byte boundary and return the number of bytes needed to pad to `align_req` bytes
#[inline]
fn padding(len: usize, align_req: usize) -> usize {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already handled for you by MutableBuffer, you shouldn't need to add explicit padding

Comment on lines 513 to 518
let len_in_bytes = length * std::mem::size_of::<i128>();
let pad_len = padding(len_in_bytes, align_req);
let mut aligned_buffer = MutableBuffer::with_capacity(len_in_bytes + pad_len);
aligned_buffer.extend_from_slice(&buffer.as_slice()[0..len_in_bytes]);
aligned_buffer.extend_from_slice(&vec![0u8; pad_len]);
aligned_buffer.into()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let len_in_bytes = length * std::mem::size_of::<i128>();
let pad_len = padding(len_in_bytes, align_req);
let mut aligned_buffer = MutableBuffer::with_capacity(len_in_bytes + pad_len);
aligned_buffer.extend_from_slice(&buffer.as_slice()[0..len_in_bytes]);
aligned_buffer.extend_from_slice(&vec![0u8; pad_len]);
aligned_buffer.into()
Buffer::from_slice_ref(buffer.as_slice)

Comment on lines 513 to 514
let len_in_bytes = length * std::mem::size_of::<i128>();
let slice = &buffer.as_slice()[0..len_in_bytes];
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only need copying the range of current buffer (i.e., length).

// e.g. 8 bytes, but on some platform (e.g. ARM) i128 requires 16 bytes alignment.
// We need to copy the buffer as fallback.
if align_offset != 0 {
let len_in_bytes = length * std::mem::size_of::<i128>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is incorrect for the Decimal256 case? Perhaps we could make this method generic on the native type, to ensure the correct size and alignment is used?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alignment req for i256 is also 16. But sounds better to make it generic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was actually the length I was concerned about

.len(length)
.add_buffer(buffers[1].clone())
.null_bit_buffer(null_buffer)
.build()
.unwrap(),
Decimal128(_, _) | Decimal256(_, _) => {
Interval(IntervalUnit::MonthDayNano) | Decimal128(_, _) | Decimal256(_, _) => {
let buffer = if matches!(data_type, &DataType::Decimal256(_, _)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not lift into parent match?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Save a few lines? :) I lifted it now.

// We need to copy the buffer as fallback.
if align_offset != 0 {
let len_in_bytes = length * std::mem::size_of::<T>();
let slice = &buffer.as_slice()[0..len_in_bytes];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit, this will panic for invalid data where previously we would error. Should just be a case of taking the minimum of expected and actual length

@viirya viirya merged commit bfd87bd into apache:master Oct 16, 2022
@viirya
Copy link
Member Author

viirya commented Oct 16, 2022

Thanks @tustvold !

@ursabot
Copy link

ursabot commented Oct 16, 2022

Benchmark runs are scheduled for baseline = a3effc1 and contender = bfd87bd. bfd87bd is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Memory alignment error in RawPtrBox::new
3 participants