Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use MAX_PREALLOCATION consistently #605

Merged
merged 3 commits into from
Jul 19, 2024

Conversation

serban300
Copy link
Contributor

@serban300 serban300 commented Jun 19, 2024

Related to #609

Use MAX_PREALLOCATION both when reading a vec from bytes and when decoding each element.

@serban300 serban300 self-assigned this Jun 19, 2024
@serban300 serban300 marked this pull request as draft June 19, 2024 07:15
@serban300 serban300 changed the title Use MAX_PREALLOCATION consistently [WIP] Use MAX_PREALLOCATION consistently Jun 19, 2024
@serban300 serban300 changed the title [WIP] Use MAX_PREALLOCATION consistently Use MAX_PREALLOCATION consistently Jun 19, 2024
@serban300 serban300 marked this pull request as ready for review June 19, 2024 07:34
@serban300 serban300 requested review from bkchr and koute July 17, 2024 12:15
@serban300
Copy link
Contributor Author

@bkchr @koute could you PTAL on this PR since it's somewhat related to #609 ? Even though it's not a prerequisite, I think it would be nice to have.

Use `MAX_PREALLOCATION` both when reading a vec from bytes and when
decoding each element.
Increase MAX_PREALLOCATION in order to avoid calling realloc too often
@serban300 serban300 merged commit 36baa4f into paritytech:master Jul 19, 2024
16 of 17 checks passed
// If there is input len and it cannot be pre-allocated then return directly.
if input_len.map(|l| l < byte_len).unwrap_or(false) {
return Err("Not enough data to decode vector".into());
num_undecoded_items = num_undecoded_items.saturating_sub(chunk_len);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This saturating_sub's completely unnecessary here since impossible to have chunk_len > num_undecoded_items due to the min.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in #615

Comment on lines +1129 to +1133
if let Some(input_len) = input.remaining_len()? {
if input_len < len {
return Err("Not enough data to decode vector".into());
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't correct as deserializing T might take any number of bytes (including even zero bytes, e.g. ()).

What we should do here is to have a serialized_size_hint() method (or, more specifically, probably an associated const so that it can be checked statically to fit within MAX_PREALLOCATION) or something like that on T which would return a value that could allow this check. (We already have encoded_fixed_size there, but that returns an exact number of bytes; it could be used here, but technically that's too strict and we can do better here by using the minimum.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should just drop this check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, alternatively we can drop it. Although having it here can have one benefit - if we end up not having enough data then this will return an early error instead of wasting time trying to deserialize it. Nice to have, but not strictly necessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To implement this, we would need to write quite a lot of code. For example for an enum we would need to know the variant that requires the least amount of bytes. However, it could then still fail at decoding because we try to decode always the enum variant that uses much more bytes etc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, well, would it be that much code? I implemented this in my serialization crate and it's mostly fine; with enums you essentially just autogenerate a (min(variant1, variant2, ..), max(variant1, variant2, ..)) in your impl. Of course this is just an optimization (in some cases it would make incomplete deserializations fail early, and in some cases it would allow the compiler to remove per-element size checks), and as you've said it can still fail at decoding depending on what you're decoding.

Anyway, I'm fine with going with your suggestion to just delete the check.

Copy link
Contributor Author

@serban300 serban300 Jul 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed it for the moment: #615

where
I: Input,
T: ToMutByteSlice + Default + Clone,
F: FnMut(&mut Vec<T>, usize) -> Result<(), Error>,
{
debug_assert!(MAX_PREALLOCATION >= mem::size_of::<T>(), "Invalid precondition");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should make this into a static assert and check it at compile time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't manage to do this so far. I tried something like

const _: () = {
    assert!(MAX_PREALLOCATION >= mem::size_of::<T>())
}

inside decode_vec_chunked()

But I'm getting an error: can't use generic parameters from outer item.

Any suggestion would be helpful

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to define a constant; since Rust 1.79 you can use a const {} block to force const evaluation of an expression.

Copy link
Contributor Author

@serban300 serban300 Jul 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this works, thanks ! PTAL on #615

But the CI fails, because the CI image uses rust 1.73.0 . We can try to release a paritytech/ci-unified:bullseye-1.79.0 image. Will check how this can be done.

@serban300
Copy link
Contributor Author

serban300 commented Jul 22, 2024

Thanks for the review ! Will address the comments in a new PR today.

LE: Here is the PR: #615

@serban300 serban300 mentioned this pull request Jul 22, 2024
serban300 added a commit that referenced this pull request Jul 23, 2024
* Address #605 code review comments

* Check MAX_PREALLOCATION >= mem::size_of::<T> statically

* Update CI image to paritytech/ci-unified:bullseye-1.79.0

This reverts commit c54689d.
@niklasad1 niklasad1 mentioned this pull request Oct 24, 2024
@jsdw jsdw mentioned this pull request Oct 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Status: Audited
Development

Successfully merging this pull request may close these issues.

3 participants