GossipSub: Limit flood publishing #911
Conversation
Codecov Report
Additional details and impacted files

@@            Coverage Diff             @@
##           unstable     #911      +/-   ##
============================================
- Coverage     83.51%   83.46%   -0.05%
============================================
  Files            91       91
  Lines         15144    15145       +1
============================================
- Hits          12647    12641       -6
- Misses         2497     2504       +7
msgSize = data.len
bandwidth = 25_000_000 #TODO replace with bandwidth estimate
msToTransmit = max(msgSize div (bandwidth div 1000), 1)
maxFloodPublish =
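The last assignment is cut off in this view. As a rough sketch of how maxFloodPublish could plausibly be derived from the values above (assuming a heartbeat interval in milliseconds, e.g. the 700 ms mentioned in the description - this is not necessarily the exact merged code):

```nim
# Sketch only: limit flood publishing to roughly one heartbeat of transmission.
let
  data = newSeq[byte](1_000_000)   # example: a 1 MB message
  heartbeatMs = 700                # assumed heartbeat interval, in ms
  msgSize = data.len
  bandwidth = 25_000_000           # bytes/s, hardcoded until a real estimate exists
  msToTransmit = max(msgSize div (bandwidth div 1000), 1)   # ~40 ms for 1 MB
  maxFloodPublish = heartbeatMs div msToTransmit            # 700 div 40 = 17 peers
echo maxFloodPublish
```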
no need for a min?
If the messages are that big (2 MB), it will effectively disable flood publish.
The idea of this PR is that flood publishing shouldn't last longer than one heartbeat, since after that we will be busy responding to IWANT requests.
consider the case where we're not subscribed to the topic - sending to 0 peers seems a bit harsh then...
Right, that won't happen before >50 MB, but it's still good to cover it
Thought a bit more about this, and I think we want to put the minimum at dmin at least.
It feels very risky to send to only one peer - if that peer is slow, it'll delay the message for all peers by a heartbeat, and it relies a bit too heavily on the IHAVE/IWANT mechanism to recover, ie there's no redundancy...
I know there's a cost here for sending big messages from slow peers, but I think the risk for normal/high-bandwidth peers is more real and costly.
The minimum is currently dLow (it will be caught by the `if peers.len < g.parameters.dLow:` below)
aren't fanout peers populated only when we're not subscribed? also as such, there might not be enough of them in the fanout table either?
We also use them when we can't publish to enough peers (ie bad mesh), and they are replenished when < dLow
> We also use them when we can't publish to enough peers (ie bad mesh), and they are replenished when < dLow

So sending to `g.gossipsub[topic]` as a last resort wouldn't be appropriate?
This should be the same as using the fanout - the fanout is basically only a small cache of gossipsub, to have more stable diffusion routes.
After 1 heartbeat, should we send an IHAVE to all peers we didn't flood it to?
Not sure that's wise; most likely they will all reply with IWANTs, and we will be back to square one.
Hm, what might make sense here is an exponential increase per heartbeat: if we send the message to 4 peers, we send an IHAVE to 4 peers at the next heartbeat, then 8, and so on for each heartbeat that passes - that should help avoid eclipsing while at the same time allowing the message to spread before.
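A tiny sketch of that exponential-spread suggestion (the proc name and the doubling schedule are illustrative only, not an implemented mechanism):

```nim
# Sketch of the suggestion above: after flood-publishing to `initialPeers`,
# announce via IHAVE to exponentially more peers on each subsequent heartbeat.
proc ihavesAtHeartbeat(initialPeers, heartbeatsElapsed: int): int =
  ## Peers to send an IHAVE to at a given heartbeat (1-based) after publishing.
  initialPeers * (1 shl (heartbeatsElapsed - 1))

when isMainModule:
  for h in 1 .. 3:
    echo ihavesAtHeartbeat(4, h)   # 4, 8, 16
```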
if g.parameters.floodPublish:
  let
    msgSize = data.len
    bandwidth = 25_000_000 #TODO replace with bandwidth estimate
I suspect we need to make this a (debug) parameter until we have something better - ie in nimbus-eth2, this would be a hidden command-line parameter that we pull out in case of weird behavior - hidden, because obviously it's something that should go away at some point
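For illustration, a minimal sketch of what such a (debug) parameter could look like; the field name `bandwidthEstimateBps` and its exposure as a hidden CLI flag are assumptions, not an existing nim-libp2p or nimbus-eth2 option:

```nim
# Sketch only: make the hardcoded bandwidth a tunable (debug) parameter until a
# real estimate exists. `bandwidthEstimateBps` is an assumed name, not the API.
type DebugGossipParams = object
  bandwidthEstimateBps: int   # estimated uplink bandwidth, in bytes per second

proc effectiveBandwidth(p: DebugGossipParams): int =
  # fall back to the current hardcoded value when the parameter is unset
  if p.bandwidthEstimateBps > 0: p.bandwidthEstimateBps
  else: 25_000_000

when isMainModule:
  echo effectiveBandwidth(DebugGossipParams())                                   # 25000000
  echo effectiveBandwidth(DebugGossipParams(bandwidthEstimateBps: 12_500_000))   # 12500000
```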
@diegomrsantos not sure I like the refactor you did, the splits seem a bit random, and the flow became harder to follow IMO. Not sure; something like:

if g.parameters.floodPublish:
  addFloodPublishPeers(g, topic, data)
if peers.len < g.parameters.dLow:
  # not subscribed, or bad mesh, send to fanout peers
  addFanoutPeers(g, topic, peers)
@Menduist the original code was confusing to me, but feel free to remove or modify the changes.
Part of #854
This PR adds a limit to flood publishing, so that we spend at most 1 heartbeat flood publishing.
Of course, this requires knowing our bandwidth, which is currently hardcoded to 100 mbps, but this will be replaced by an actual estimate later.
With a 700 ms heartbeat and a 1 MB message size, we will send it to a maximum of 17 peers (including the mesh).
The "1 heartbeat" period is used because after that, we will probably be responding to IWANTs.
The next step for flood publishing will be to integrate #850 (staggered sending), which will not only give us the bandwidth estimate, but also send first to the mesh and then to the flood-publish peers.
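A rough, self-contained sketch of the flow described above (not the actual nim-libp2p code; the proc name, string peer IDs, and fallback details are assumptions for illustration):

```nim
# Rough sketch of the publish flow described above, not the actual nim-libp2p
# code: flood-publish to at most one heartbeat's worth of peers, then top up
# from fanout/gossipsub peers if we are still below dLow.
proc selectPublishPeers(floodCandidates, fanoutPeers: seq[string];
                        msgSize, bandwidth, heartbeatMs, dLow: int): seq[string] =
  let
    msToTransmit = max(msgSize div (bandwidth div 1000), 1)
    maxFloodPublish = heartbeatMs div msToTransmit
  result = floodCandidates[0 ..< min(maxFloodPublish, floodCandidates.len)]
  if result.len < dLow:
    # not subscribed, or bad mesh: fall back to fanout peers
    for p in fanoutPeers:
      if result.len >= dLow: break
      if p notin result:
        result.add p

when isMainModule:
  # 1 MB message, 25_000_000 B/s, 700 ms heartbeat -> up to 17 flood targets
  echo selectPublishPeers(@["a", "b", "c"], @["d", "e"],
                          1_000_000, 25_000_000, 700, 4)   # @["a", "b", "c", "d"]
```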