Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Genesis milestone 12 - Chain Sync Jumping #1033

Merged
merged 20 commits into from
May 29, 2024

Conversation

nbacquey
Copy link
Contributor

@nbacquey nbacquey commented Apr 2, 2024

The main contribution of this PR is the implementation of ChainSync Jumping (0e265c4), a mechanism to avoid overloading the honest peers in the network when one or more peers connect to them for syncing. The details of ChainSync Jumping are discussed in the Jumping module.

A range of supporting commits are provided in addition.

More tests

  • The time limited leashing attack test is configured and enabled in 6c25310. This test is aimed at the GDD governor implemented in Implement the GDD governor #1015.
  • A test to show that a node can resume syncing after being disconnected for a while has been implemented in 5d3c200.
  • There is a test checking that the GDD governor doesn't regret disconnecting nodes as more headers are known to it (dbbc3ab ). This was included in the previous milestone but we discovered later it wasn't checking as much as we intended.
  • In da7a6db and 3220e7d there is a test for ChainSync Jumping that ensures that a syncing node downloads headers from at most one peer when there is no disagreement between the peers.
  • There are various fixes to tests and documentation in d6f6ddf, b0ed1c8, d70e604 , cee466f, and dbbc3ab.

GDD refinement

The ability of the GDD governor to disconnect nodes has increased in dcb20f5. With respect to #1015, the GDD governor can now disconnect peers in the following scenarios:

  1. when a peer has a chain that can't be denser than another chain served by another peer (before, we would disconnect the peer only if there was a denser chain); or
  2. when a peer has a chain of density 0 and it claims to have more headers after the genesis window; or
  3. when a peer sent a header for the last slot of the genesis window or later, and loses the density comparison to another peer (before, we would disconnect the peer only after it loses the density comparison to a peer that sent more than k headers in the genesis window).

These changes help ChainSync Jumping make progress while downloading headers from only two disagreeing peers (even if both are adversarial).

Other changes

  • A version of shrinking of honest schedules of messages has been implemented in 9794292.
  • 9572256 contains a fix to the implementation of Limit on Patience.
  • 915238e implements a recording tracer that doesn't affect scheduling in IOSim.

@nbacquey nbacquey added the Genesis PRs related to Genesis testing and implementation label Apr 2, 2024
@amesgen amesgen force-pushed the genesis/milestone-11-gdd-governor-squashed branch from 696280a to cae9d98 Compare April 15, 2024 15:06
Base automatically changed from genesis/milestone-11-gdd-governor-squashed to main April 16, 2024 10:46
@facundominguez facundominguez force-pushed the genesis/milestone-12-chain-sync-jumping branch from c0ec3b6 to a2bfae0 Compare April 24, 2024 23:02
@facundominguez facundominguez changed the title [WIP] Genesis milestone 12 - Chain Sync Jumping Genesis milestone 12 - Chain Sync Jumping Apr 25, 2024
@tek tek force-pushed the genesis/milestone-12-chain-sync-jumping branch 2 times, most recently from 664722d to 8400afc Compare April 26, 2024 12:25
@facundominguez facundominguez force-pushed the genesis/milestone-12-chain-sync-jumping branch 3 times, most recently from b34ada7 to 3164620 Compare April 26, 2024 20:28
@Niols Niols force-pushed the genesis/milestone-12-chain-sync-jumping branch 2 times, most recently from d57042c to 4d8c607 Compare April 30, 2024 18:12
@facundominguez facundominguez force-pushed the genesis/milestone-12-chain-sync-jumping branch 2 times, most recently from 3dc8fe2 to ed5e5af Compare May 2, 2024 14:17
@Niols Niols force-pushed the genesis/milestone-12-chain-sync-jumping branch from 84fce19 to 5db13db Compare May 2, 2024 16:56
@facundominguez facundominguez force-pushed the genesis/milestone-12-chain-sync-jumping branch from 5db13db to f6764e7 Compare May 2, 2024 17:16
@facundominguez facundominguez marked this pull request as ready for review May 2, 2024 18:30
@facundominguez facundominguez requested a review from a team as a code owner May 2, 2024 18:30
@jasagredo jasagredo self-requested a review May 16, 2024 13:57
Comment on lines 105 to 109
then "All headers may be downloaded twice"
else "There exist headers that have to be downloaded exactly once"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the logic the other way round? i.e. either all headers have to be downloaded once or there exist some headers that could be downloaded twice?

Or maybe I'm misunderstanding this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The meaning of headerDownloadEvents may require a comment.

all headers have to be downloaded once

We always expect some headers to be downloaded more than once. The first time is when the first dynamo serves them. It reaches the end of the chain, and it is disengaged. The next time will be when the next dynamo serves the headers since its last jump as a jumper. It reaches the end of the chain, and it is disengaged. And so on ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to clarify the meaning of headerDownloadEvents in 980c35a.

Comment on lines 105 to 109
then "All headers may be downloaded twice"
else "There exist headers that have to be downloaded exactly once"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The meaning of headerDownloadEvents may require a comment.

all headers have to be downloaded once

We always expect some headers to be downloaded more than once. The first time is when the first dynamo serves them. It reaches the end of the chain, and it is disengaged. The next time will be when the next dynamo serves the headers since its last jump as a jumper. It reaches the end of the chain, and it is disengaged. And so on ...

Copy link
Contributor

@jasagredo jasagredo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no objections about merging this.

nbacquey and others added 3 commits May 29, 2024 14:40
* Use real bucket parameters to compute time bounds
* Add classifiers for empty/trivial adversarial schedules
* Allow fetching of the same block from many peers
* Time limited leashing attack: Extend the schedule and disable timeouts
* Change BlockFetch initialization to use FetchModeDeadline

Co-authored-by: Alexander Esgen <alexander.esgen@iohk.io>
Co-authored-by: Nicolas BACQUEY <nicolas.bacquey@tweag.io>
 * Disconnect peers with 0 density

   This affects ChainSync jumping, where genesis windows with no headers
   prevent jumps from happening.

 * Have GDD disconnect peers with disagreeing chains of equal density

   Before this change, the GDD governor would tolerate different chains
   having the same density. This is problematic to ChainSync jumping if
   the only active peers (dynamo and objector) disagree with chains of
   the same density, and GDD cannot disconnect any of them.

 * Don't mix idling in the computation of the upper bound

 * Simplify the condition to disconnect 0-density peers

 * Note the redundancy when checking for disagrement in chains

 * Peers with less than k+1 headers can disconnect peers which sent all headers in the genesis window

 * Rename fragment to clippedFragment in GDD
facundominguez and others added 17 commits May 29, 2024 14:40
The previous implementation would not pause the bucket while waiting for the
time translation horizon to advance.
See the documentation in
ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/MiniProtocol/ChainSync/Client/Jumping.hs
for an overview.
Co-authored-by: Nicolas “Niols” Jeannerod <nicolas.jeannerod@tweag.io>
Co-authored-by: Nicolas “Niols” Jeannerod <nicolas.jeannerod@tweag.io>
Co-authored-by: Torsten Schmits <git@tryp.io>
The test logic failed to ever update the `killed` flag that indicated
that a peer was selected for disconnection before, essentially making
the test tautological.

This revealed the need for another fix – the test must be terminated
when the LOE fragment advances beyond the honest chain (because a branch
forks off the last block).
Co-authored-by: Javier Sagredo <javier.sagredo@iohk.io>
@nbacquey nbacquey force-pushed the genesis/milestone-12-chain-sync-jumping branch from 980c35a to ce3b2c6 Compare May 29, 2024 15:37
@nbacquey nbacquey enabled auto-merge May 29, 2024 15:37
@nbacquey nbacquey added this pull request to the merge queue May 29, 2024
Merged via the queue into main with commit 18f43bd May 29, 2024
18 checks passed
@nbacquey nbacquey deleted the genesis/milestone-12-chain-sync-jumping branch May 29, 2024 18:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Genesis PRs related to Genesis testing and implementation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants