Skip to content
This repository has been archived by the owner on Jan 22, 2025. It is now read-only.

records hash of timed-out pull responses #17263

Merged
merged 1 commit into from
May 22, 2021

Conversation

behzadnouri
Copy link
Contributor

Problem

Gossip should record hash of pull responses which are timed out and
fail to insert:
https://github.com/solana-labs/solana/blob/ed51cde37/core/src/crds_gossip_pull.rs#L397-L400

so that they are excluded from the next pull request:
https://github.com/solana-labs/solana/blob/ed51cde37/core/src/crds_gossip_pull.rs#L486-L491

otherwise the next pull request will likely include the same timed out
values and waste bandwidth.

Summary of Changes

record hash of timed-out pull responses

@behzadnouri behzadnouri marked this pull request as ready for review May 16, 2021 18:51
@behzadnouri behzadnouri force-pushed the record-timedout branch 2 times, most recently from 96c0150 to 3fd6f5f Compare May 17, 2021 15:28
@behzadnouri behzadnouri requested review from sakridge and carllin May 17, 2021 15:28
@codecov
Copy link

codecov bot commented May 17, 2021

Codecov Report

Merging #17263 (46e4b65) into master (32ec834) will increase coverage by 0.0%.
The diff coverage is 84.6%.

@@           Coverage Diff           @@
##           master   #17263   +/-   ##
=======================================
  Coverage    82.6%    82.7%           
=======================================
  Files         423      423           
  Lines      118111   118111           
=======================================
+ Hits        97663    97690   +27     
+ Misses      20448    20421   -27     

Gossip should record hash of pull responses which are timed out and
fail to insert:
https://github.com/solana-labs/solana/blob/ed51cde37/core/src/crds_gossip_pull.rs#L397-L400

so that they are excluded from the next pull request:
https://github.com/solana-labs/solana/blob/ed51cde37/core/src/crds_gossip_pull.rs#L486-L491

otherwise the next pull request will likely include the same timed out
values and waste bandwidth.
Comment on lines +398 to +402
let failed_inserts = responses
.into_iter()
.filter_map(upsert)
.map(|resp| hash(&bincode::serialize(&resp).unwrap()))
.collect();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my understanding, these values eventually make it into the CrdsGossipPull::failed_inserts here: https://github.com/solana-labs/solana/blob/master/core/src/crds_gossip_pull.rs#L446-L447. Could this vector now grow large/potentially have duplicates for the same hash response (if we got the same outdate response from multiple validators for instance) because we removed the crds.upserts(&response) check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are 3 if branches here:
https://github.com/solana-labs/solana/blob/5e6b00fe9/core/src/crds_gossip_pull.rs#L390-L399
The first 2, push the value to failed_inserts if crds.upsert(&response) returns false.
The 3rd one, basically discards a pull-response, but does not record its hash any where. This change is updating the 3rd branch to record the hash of discarded value on this branch as well to exclude it from the next pull-request.

So, for the first 2 branches noting has changed; Only the 3rd branch now also records hash of discarded value like the first 2 branches.

Could this vector now grow large/potentially have duplicates for the same hash response

That is a good point. Ideally if the hash value is recorded, then it is excluded from the next pull-request (by adding it to the bloom filters), and so it should not appear in the pull responses again (until it is purged from failed_inserts vec-deque). So as long as the failed_inserts vector is updated before the next pull request, then they should be excluded from later pull responses, and we wouldn't expect to get duplicates.

But still, I agree that potentially this vector can get too large, and duplicates may creep in somehow (e.g. if pull-responses arrive very late). I took a note to address these, perhaps add an upper bound for the size of failed_inserts in a follow up change.

because we removed the crds.upserts(&response) check?

I do not think I removed it. It is still there in filter_map closure.

@behzadnouri behzadnouri merged commit a7870cd into solana-labs:master May 22, 2021
@behzadnouri behzadnouri deleted the record-timedout branch May 22, 2021 17:02
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants