Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use the initial peer success count for the fanout limit #2182

Closed

Conversation

teor2345
Copy link
Contributor

Motivation

Zebra is using the total number of seed peers for the fanout limit.

The limit will be too high if any of the initial peer connections fail.
(Typically, lots of those connections fail.)

Solution

  • Use the initial peer success count for the fanout limit
  • Make sure that the crawler starts after the first successful peer (or after all the initial handshake attempts have failed)

The code in this pull request has:

  • Documentation Comments
  • Integration Tests

Review

Whoever reviews #2181 should also review this task. (@dconnolly or @jvff)

Related Issues

Part of #2163

Based on PR #2181
Split off PR #2160

Follow Up Work

Rest of #2160

@teor2345 teor2345 added C-bug Category: This is a bug A-rust Area: Updates to Rust code P-Medium I-hang A Zebra component stops responding to requests I-slow Problems with performance or responsiveness A-network Area: Network protocol updates or fixes labels May 21, 2021
@teor2345 teor2345 requested review from dconnolly and jvff May 21, 2021 04:02
@teor2345 teor2345 self-assigned this May 21, 2021
@teor2345
Copy link
Contributor Author

I've put this PR in draft until #2181 merges to main.

Previously, Zebra was using the total number of seed peers, even if many
of those peers had failed.

This change also makes sure that the crawler starts after the first
successful peer. (Or after all the initial handshake attempts have
finished.)
@teor2345 teor2345 force-pushed the initial-handshake-success-count branch from 7511d05 to 4c49a2e Compare May 21, 2021 07:47
Comment on lines +174 to +180
info!("waiting for a successful initial peer connection");
// it doesn't matter if the sender has been dropped
let _ = initial_success_count_rx.changed().await;
info!(initial_successes = ?*initial_success_count_rx.borrow(),
"asking initial peers for new peers");
let _ = candidates
.update_initial(initial_peer_count_rx.await.expect("value sent before drop"))
.update_initial(*initial_success_count_rx.borrow())
Copy link
Contributor Author

@teor2345 teor2345 May 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the initial peer success count for the fanout limit, Part 2

  • Wait for the first success count update
  • Start the crawler using that success count

Comment on lines +299 to +301
// if the receiver has been dropped, we still want to process
// the handshakes
let _ = success_count_tx.send(success_count);
Copy link
Contributor Author

@teor2345 teor2345 May 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the initial peer success count for the fanout limit, Part 1a

  • Send the success count whenever we have a success
  • This also starts the crawler

@teor2345
Copy link
Contributor Author

This bug is fixed in #2160, but since #2160 integrates the initial handshakes into the crawler, the code is quite different.

I'll open a new PR for it.

@teor2345 teor2345 closed this May 25, 2021
@teor2345 teor2345 deleted the initial-handshake-success-count branch March 21, 2022 21:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-network Area: Network protocol updates or fixes A-rust Area: Updates to Rust code C-bug Category: This is a bug I-hang A Zebra component stops responding to requests I-slow Problems with performance or responsiveness
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants