Use `MAX_PREALLOCATION` consistently #605

Conversation
Use `MAX_PREALLOCATION` both when reading a vec from bytes and when decoding each element.
Increase MAX_PREALLOCATION in order to avoid calling realloc too often
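The approach described above can be sketched roughly as follows. This is an illustrative toy, not the crate's actual code: the constant value, the function name `decode_u8_vec_chunked`, and the `u8`-only element type are all made up for the example.

```rust
// Illustrative value only; the real crate's MAX_PREALLOCATION differs.
const MAX_PREALLOCATION: usize = 128;

// Decode a vec of `len` bytes in chunks, so that no single reservation
// exceeds MAX_PREALLOCATION bytes even for attacker-controlled lengths.
fn decode_u8_vec_chunked(input: &[u8], len: usize) -> Result<Vec<u8>, &'static str> {
    // Early check: the claimed length must not exceed the available input.
    if input.len() < len {
        return Err("Not enough data to decode vector");
    }
    let mut out = Vec::new();
    let mut decoded = 0;
    while decoded < len {
        // Cap each reservation at MAX_PREALLOCATION bytes.
        let chunk_len = (len - decoded).min(MAX_PREALLOCATION);
        out.reserve(chunk_len);
        out.extend_from_slice(&input[decoded..decoded + chunk_len]);
        decoded += chunk_len;
    }
    Ok(out)
}
```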
```rust
// If there is input len and it cannot be pre-allocated then return directly.
if input_len.map(|l| l < byte_len).unwrap_or(false) {
    return Err("Not enough data to decode vector".into());
}
```

```rust
num_undecoded_items = num_undecoded_items.saturating_sub(chunk_len);
```
This `saturating_sub` is completely unnecessary here, since it's impossible to have `chunk_len > num_undecoded_items` due to the `min`.
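The point above can be shown with a minimal sketch (names and the chunk constant are illustrative, not the crate's actual code): because `chunk_len` is computed with `min`, it can never exceed the number of remaining items, so a plain subtraction cannot underflow.

```rust
// Illustrative chunk cap, not the crate's real value.
const MAX_CHUNK: usize = 64;

// Count how many chunks are needed to consume `total_items` items.
fn count_chunks(total_items: usize) -> usize {
    let mut num_undecoded_items = total_items;
    let mut chunks = 0;
    while num_undecoded_items > 0 {
        let chunk_len = num_undecoded_items.min(MAX_CHUNK);
        // chunk_len <= num_undecoded_items by construction, so this plain
        // subtraction cannot underflow; saturating_sub would be redundant.
        num_undecoded_items -= chunk_len;
        chunks += 1;
    }
    chunks
}
```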
Addressed in #615
```rust
if let Some(input_len) = input.remaining_len()? {
    if input_len < len {
        return Err("Not enough data to decode vector".into());
    }
}
```
This isn't correct, as deserializing `T` might take any number of bytes (including even zero bytes, e.g. `()`).

What we should do here is have a `serialized_size_hint()` method (or, more specifically, probably an associated const, so that it can be checked statically to fit within `MAX_PREALLOCATION`) or something like that on `T`, which would return a value that could allow this check. (We already have `encoded_fixed_size` there, but that returns an exact number of bytes; it could be used here, but technically that's too strict, and we can do better here by using the minimum.)
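A minimal sketch of this suggestion, assuming a hand-written `MinEncodedSize` trait (the trait, its constant, and `check_len` are hypothetical, not part of the crate's API): a per-type lower bound on the encoded size makes the length check sound even for zero-sized encodings.

```rust
// Hypothetical trait providing a lower bound on a value's encoded size.
trait MinEncodedSize {
    const MIN_ENCODED_SIZE: usize;
}

impl MinEncodedSize for u32 {
    const MIN_ENCODED_SIZE: usize = 4; // fixed-size encoding
}

impl MinEncodedSize for () {
    const MIN_ENCODED_SIZE: usize = 0; // zero bytes, the case noted above
}

// Reject inputs that are provably too short before decoding any element.
fn check_len<T: MinEncodedSize>(remaining: usize, len: usize) -> Result<(), &'static str> {
    // `len * MIN_ENCODED_SIZE` is a valid lower bound, unlike `len` alone.
    if remaining < len.saturating_mul(T::MIN_ENCODED_SIZE) {
        return Err("Not enough data to decode vector");
    }
    Ok(())
}
```

Note that for `()` the bound is zero, so the check correctly never rejects, whereas comparing `remaining < len` directly would spuriously fail.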
We should just drop this check.
Yeah, alternatively we can drop it. Although having it here has one benefit: if we end up not having enough data, this will return an early error instead of wasting time trying to deserialize. Nice to have, but not strictly necessary.
To implement this, we would need to write quite a lot of code. For example, for an `enum` we would need to know the variant that requires the least amount of bytes. However, decoding could then still fail anyway, because the input may consist entirely of the variant that uses far more bytes than that minimum, etc.
Hm, well, would it be that much code? I implemented this in my serialization crate and it's mostly fine; with `enum`s you essentially just autogenerate a `(min(variant1, variant2, ..), max(variant1, variant2, ..))` in your impl. Of course this is just an optimization (in some cases it would make incomplete deserializations fail early, and in some cases it would allow the compiler to remove per-element size checks), and as you've said it can still fail at decoding depending on what you're decoding.
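The min/max autogeneration described above could look roughly like this sketch (the enum, the `SIZE_HINT` name, and the assumed 1-byte variant tag with fixed-size payloads are all illustrative assumptions, not the crate's actual derive output):

```rust
// const-fn helpers, since std's min/max aren't const over usize here.
const fn min2(a: usize, b: usize) -> usize { if a < b { a } else { b } }
const fn max2(a: usize, b: usize) -> usize { if a > b { a } else { b } }

#[allow(dead_code)]
enum Example {
    A(u8),  // 1-byte payload
    B(u32), // 4-byte payload
    C,      // 0-byte payload
}

impl Example {
    // (min, max) encoded size in bytes: the min/max over all variants'
    // payload sizes, plus a 1-byte variant tag.
    const SIZE_HINT: (usize, usize) = (
        1 + min2(min2(1, 4), 0),
        1 + max2(max2(1, 4), 0),
    );
}
```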
Anyway, I'm fine with going with your suggestion to just delete the check.
Removed it for the moment: #615
```rust
where
    I: Input,
    T: ToMutByteSlice + Default + Clone,
    F: FnMut(&mut Vec<T>, usize) -> Result<(), Error>,
{
    debug_assert!(MAX_PREALLOCATION >= mem::size_of::<T>(), "Invalid precondition");
```
We should make this into a static assert and check it at compile time.
I couldn't manage to do this so far. I tried something like

```rust
const _: () = {
    assert!(MAX_PREALLOCATION >= mem::size_of::<T>())
};
```

inside `decode_vec_chunked()`, but I'm getting an error: `can't use generic parameters from outer item`.

Any suggestion would be helpful.
You don't need to define a constant; since Rust 1.79 you can use a `const {}` block to force const evaluation of an expression.
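The suggestion above can be sketched as follows (illustrative only: the constant's value and the function name are made up, and this requires Rust 1.79+). Unlike a standalone `const _: ()` item, an inline `const {}` block is evaluated per monomorphization and may therefore reference the enclosing function's generic parameters.

```rust
// Illustrative value; the real crate's constant differs.
const MAX_PREALLOCATION: usize = 4096;

fn checked_elem_size<T>() -> usize {
    // Evaluated at compile time for each concrete T; compilation fails
    // for any T whose size exceeds MAX_PREALLOCATION.
    const { assert!(MAX_PREALLOCATION >= std::mem::size_of::<T>()) };
    std::mem::size_of::<T>()
}
```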
Yes, this works, thanks! PTAL on #615.

But the CI fails, because the CI image uses Rust 1.73.0. We can try to release a `paritytech/ci-unified:bullseye-1.79.0` image. Will check how this can be done.

Thanks for the review! Will address the comments in a new PR today. LE: Here is the PR: #615
Related to #609