Skip to content
This repository has been archived by the owner on Jan 22, 2025. It is now read-only.

retains one node-instance per pubkey #17187

Merged
merged 1 commit into from
May 13, 2021

Conversation

behzadnouri
Copy link
Contributor

Problem

crds table retains up to 32 node-instance values per each pubkey. This
is so because if there are multiple running instances of the same node,
then we want gossip to propagate node-instance values associated with
both instances, therefore the corresponding label/key includes the
randomly generated token in addition to the pubkey:
https://github.com/solana-labs/solana/blob/9c42a89a4/core/src/crds_value.rs#L448
#14037

As a result, the number of such values per pubkey are effectively
unbounded, requiring custom mitigations implemented in: #14467
but still taking redundant extra memory and bandwidth.

Summary of Changes

This commit instead only retains one node-instance per pubkey by
extending crds values override logic. If a crds value is of type
node-instance, it will always override an existing one with the same key
if it has more recent starting timestamp (not wallclock). As a result,
gossip will always propagate the node-instance with more recent
timestamp. Since the check_duplicate logic will stop the node with older
timestamp, this change should preserve existing functionality.

@codecov
Copy link

codecov bot commented May 12, 2021

Codecov Report

Merging #17187 (f299b7e) into master (597373f) will increase coverage by 0.0%.
The diff coverage is 97.3%.

@@           Coverage Diff           @@
##           master   #17187   +/-   ##
=======================================
  Coverage    82.7%    82.7%           
=======================================
  Files         421      421           
  Lines      117926   117976   +50     
=======================================
+ Hits        97574    97640   +66     
+ Misses      20352    20336   -16     

crds table retains up to 32 node-instance values per each pubkey. This
is so because if there are multiple running instances of the same node,
then we want gossip to propagate node-instance values associated with
both instances, therefore the corresponding label/key includes the
randomly generated token in addition to the pubkey:
https://github.com/solana-labs/solana/blob/9c42a89a4/core/src/crds_value.rs#L448
solana-labs#14037

As a result, the number of such values per pubkey are effectively
unbounded, requiring custom mitigations implemented in:
solana-labs#14467
but still taking redundant extra memory and bandwidth.

This commit instead retains only one node-instance per pubkey by
extending crds values override logic. If a crds value is of type
node-instance, it will always override an existing one with the same key
if it has more recent starting timestamp (not wallclock). As a result,
gossip will always propagate the node-instance with more recent
timestamp. Since the check_duplicate logic will stop the node with older
timestamp, this change should preserve existing functionality.
@behzadnouri behzadnouri force-pushed the override-node-instance branch from 181b383 to f299b7e Compare May 12, 2021 19:51
@behzadnouri behzadnouri marked this pull request as ready for review May 12, 2021 19:51
@behzadnouri behzadnouri requested review from carllin and sakridge May 12, 2021 19:53
carllin
carllin approved these changes May 13, 2021
@behzadnouri behzadnouri merged commit 0aa7824 into solana-labs:master May 13, 2021
@behzadnouri behzadnouri deleted the override-node-instance branch May 13, 2021 13:35
mergify bot pushed a commit that referenced this pull request May 21, 2021
crds table retains up to 32 node-instance values per each pubkey. This
is so because if there are multiple running instances of the same node,
then we want gossip to propagate node-instance values associated with
both instances, therefore the corresponding label/key includes the
randomly generated token in addition to the pubkey:
https://github.com/solana-labs/solana/blob/9c42a89a4/core/src/crds_value.rs#L448
#14037

As a result, the number of such values per pubkey are effectively
unbounded, requiring custom mitigations implemented in:
#14467
but still taking redundant extra memory and bandwidth.

This commit instead retains only one node-instance per pubkey by
extending crds values override logic. If a crds value is of type
node-instance, it will always override an existing one with the same key
if it has more recent starting timestamp (not wallclock). As a result,
gossip will always propagate the node-instance with more recent
timestamp. Since the check_duplicate logic will stop the node with older
timestamp, this change should preserve existing functionality.

(cherry picked from commit 0aa7824)
mergify bot added a commit that referenced this pull request May 21, 2021
crds table retains up to 32 node-instance values per each pubkey. This
is so because if there are multiple running instances of the same node,
then we want gossip to propagate node-instance values associated with
both instances, therefore the corresponding label/key includes the
randomly generated token in addition to the pubkey:
https://github.com/solana-labs/solana/blob/9c42a89a4/core/src/crds_value.rs#L448
#14037

As a result, the number of such values per pubkey are effectively
unbounded, requiring custom mitigations implemented in:
#14467
but still taking redundant extra memory and bandwidth.

This commit instead retains only one node-instance per pubkey by
extending crds values override logic. If a crds value is of type
node-instance, it will always override an existing one with the same key
if it has more recent starting timestamp (not wallclock). As a result,
gossip will always propagate the node-instance with more recent
timestamp. Since the check_duplicate logic will stop the node with older
timestamp, this change should preserve existing functionality.

(cherry picked from commit 0aa7824)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
@brooksprumo brooksprumo mentioned this pull request Aug 23, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants