limits to data_header.size when combining shreds' payloads #16708

Merged (1 commit) on Apr 27, 2021
2 changes: 2 additions & 0 deletions core/src/replay_stage.rs
@@ -2948,6 +2948,8 @@ pub(crate) mod tests {
         let gibberish = [0xa5u8; PACKET_DATA_SIZE];
         let mut data_header = DataShredHeader::default();
         data_header.flags |= DATA_COMPLETE_SHRED;
+        // Need to provide the right size for Shredder::deshred.
+        data_header.size = SIZE_OF_DATA_SHRED_PAYLOAD as u16;
         let mut shred = Shred::new_empty_from_header(
             ShredCommonHeader::default(),
             data_header,
55 changes: 27 additions & 28 deletions ledger/src/shred.rs
@@ -786,7 +786,7 @@ impl Shredder {
             .iter()
             .map(|shred| &shred.payload[..PAYLOAD_ENCODE_SIZE])
             .collect();
-        let mut parity = vec![vec![0; PAYLOAD_ENCODE_SIZE]; num_coding];
+        let mut parity = vec![vec![0u8; PAYLOAD_ENCODE_SIZE]; num_coding];
         Session::new(num_data, num_coding)
             .unwrap()
             .encode(&data, &mut parity[..])
@@ -938,37 +938,36 @@ impl Shredder {

     /// Combines all shreds to recreate the original buffer
     pub fn deshred(shreds: &[Shred]) -> std::result::Result<Vec<u8>, reed_solomon_erasure::Error> {
-        let num_data = shreds.len();
+        use reed_solomon_erasure::Error::TooFewDataShards;
+        const SHRED_DATA_OFFSET: usize = SIZE_OF_COMMON_SHRED_HEADER + SIZE_OF_DATA_SHRED_HEADER;
         Self::verify_consistent_shred_payload_sizes(&"deshred()", shreds)?;
-        let data_shred_bufs = {
-            let first_index = shreds.first().unwrap().index() as usize;
-            let last_shred = shreds.last().unwrap();
-            let last_index = if last_shred.data_complete() || last_shred.last_in_slot() {
-                last_shred.index() as usize
-            } else {
-                0
-            };
-
-            if num_data.saturating_add(first_index) != last_index.saturating_add(1) {
-                return Err(reed_solomon_erasure::Error::TooFewDataShards);
-            }
-
-            shreds.iter().map(|shred| &shred.payload).collect()
+        let index = shreds.first().ok_or(TooFewDataShards)?.index();
+        let aligned = shreds.iter().zip(index..).all(|(s, i)| s.index() == i);

Contributor commented:

If the indexes don't align, that would imply something is disastrously wrong with the blockstore insertion code, or that corruption has occurred. In these cases, can we hard panic?

Contributor Author replied:

You are right, but since this is old code and a public function, I would rather not introduce a hard panic for now (at least in this commit), until all current call paths are vetted.
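
For reference, shreds.iter().zip(index..) pairs each shred with the expected consecutive index starting from the first shred's index, so aligned is true only when there are no gaps or reorderings. A minimal standalone sketch of the same pattern, using hypothetical index values that are not from the PR:

fn main() {
    // Hypothetical shred indices as they might appear in a complete window.
    let ids: Vec<u32> = vec![7, 8, 9, 10];
    let first = ids[0];
    // Pair each id with the expected sequence 7, 8, 9, ... and require equality.
    let aligned = ids.iter().zip(first..).all(|(&id, expected)| id == expected);
    assert!(aligned);
    // A gap, as in [7, 8, 10], breaks alignment; deshred returns
    // TooFewDataShards in that case instead of panicking.
    let gappy: Vec<u32> = vec![7, 8, 10];
    let aligned = gappy.iter().zip(7u32..).all(|(&id, expected)| id == expected);
    assert!(!aligned);
}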

+        let data_complete = {
+            let shred = shreds.last().unwrap();
+            shred.data_complete() || shred.last_in_slot()
         };
 
-        Ok(Self::reassemble_payload(num_data, data_shred_bufs))
-    }
-
-    fn reassemble_payload(num_data: usize, data_shred_bufs: Vec<&Vec<u8>>) -> Vec<u8> {
-        let valid_data_len = SHRED_PAYLOAD_SIZE - SIZE_OF_CODING_SHRED_HEADERS;
-        data_shred_bufs[..num_data]
+        if !data_complete || !aligned {
+            return Err(TooFewDataShards);
+        }
+        let data: Vec<_> = shreds
             .iter()
-            .flat_map(|data| {
-                let offset = SIZE_OF_COMMON_SHRED_HEADER + SIZE_OF_DATA_SHRED_HEADER;
-                data[offset..valid_data_len].iter()
+            .flat_map(|shred| {
+                let size = shred.data_header.size as usize;
+                let size = shred.payload.len().min(size);
+                let offset = SHRED_DATA_OFFSET.min(size);
+                shred.payload[offset..size].iter()
             })
-            .cloned()
-            .collect()
+            .copied()
+            .collect();
+        if data.is_empty() {
+            // For backward compatibility. This is needed when the data shred
+            // payload is None, so that deserializing to Vec<Entry> results in
+            // an empty vector.
+            Ok(vec![0u8; SIZE_OF_DATA_SHRED_PAYLOAD])

carllin (Contributor) commented on Apr 27, 2021:

We know the input shreds aren't empty, since otherwise verify_consistent_shred_payload_sizes() above would have errored out with TooFewShardsPresent. So, regarding this comment:

"This is needed when the data shred payload is None"

Does this mean the case where all of the shred.payload[offset..size] slices above come back empty? If so, can we add a test case for this?

"For backward compatibility..."

This is probably something subtle, but I think I'm missing the equivalent case in the current code 😄. Is there a scenario where we're returning vec![0u8; SIZE_OF_DATA_SHRED_PAYLOAD]?

Contributor Author replied:

This is needed in the case where the data payload is None (as mentioned in the comment):
https://github.com/solana-labs/solana/blob/68d5aec55/ledger/src/shred.rs#L252

You can have one shred with data == None, and so data_header.size == 0, which results in an empty output, but existing code assumes that we get a vector of zeros.

For example, in broadcast:
https://github.com/solana-labs/solana/blob/68d5aec55/core/src/broadcast_stage/standard_broadcast_run.rs#L79

Without the code branch above, this test fails:
https://github.com/solana-labs/solana/blob/68d5aec55/core/src/broadcast_stage/standard_broadcast_run.rs#L580

As commented, this is just to maintain backward compatibility with existing callers and not break them.
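
The test case requested above could look roughly like the following minimal sketch, reusing the constructor from the replay_stage.rs test at the top of this diff (the third argument is assumed to be CodingShredHeader::default(); this test is an illustration, not part of the PR):

#[test]
fn test_deshred_zero_size_payload() {
    // One data shred whose header claims zero bytes of data, i.e. the
    // data == None case described above.
    let mut data_header = DataShredHeader::default();
    data_header.flags |= DATA_COMPLETE_SHRED;
    data_header.size = 0;
    let shred = Shred::new_empty_from_header(
        ShredCommonHeader::default(),
        data_header,
        CodingShredHeader::default(), // assumed third argument
    );
    // payload[offset..size] is empty for this shred, so data.is_empty()
    // holds and the backward-compatibility branch returns a zeroed buffer
    // that deserializes to an empty Vec<Entry>.
    let data = Shredder::deshred(&[shred]).unwrap();
    assert_eq!(data, vec![0u8; SIZE_OF_DATA_SHRED_PAYLOAD]);
}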

+        } else {
+            Ok(data)
+        }
     }
 
     fn verify_consistent_shred_payload_sizes(
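
The core of the fix is the clamping inside the flat_map above: size is limited to the actual payload length, and offset is limited to size, so a corrupt or adversarial data_header.size can no longer produce an out-of-bounds slice. A standalone sketch of the same guard, with a hypothetical offset constant standing in for the real SHRED_DATA_OFFSET:

// Hypothetical stand-in for SIZE_OF_COMMON_SHRED_HEADER + SIZE_OF_DATA_SHRED_HEADER.
const SHRED_DATA_OFFSET: usize = 88;

// Returns the data bytes of one payload, trusting the header's size field
// only as far as the payload actually extends.
fn shred_data(payload: &[u8], header_size: usize) -> &[u8] {
    let size = payload.len().min(header_size); // never past the end of the buffer
    let offset = SHRED_DATA_OFFSET.min(size); // never past `size`
    &payload[offset..size] // offset <= size <= payload.len(), so this cannot panic
}

fn main() {
    let payload = vec![0xa5u8; 100];
    // Well-formed header: data spans bytes 88..95 of the payload.
    assert_eq!(shred_data(&payload, 95).len(), 7);
    // Corrupt header claiming more data than the buffer holds: clamped to the end.
    assert_eq!(shred_data(&payload, 10_000).len(), 100 - SHRED_DATA_OFFSET);
    // Header size smaller than the data offset: empty slice rather than a panic.
    assert!(shred_data(&payload, 10).is_empty());
}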