-
Notifications
You must be signed in to change notification settings - Fork 766
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reputation updates are handled inefficiently #2257
Comments
We should probably re-evaluate how critical reputation is. It's way easier to track negatives than positives. Atomic increment operations sound faster than anything complex like Mutex here. |
I agree that we should re-evaluate it because right now for example banning is not working as effectively as it could. But I believe this could be an issue short-term so we should have some fix for it, even if it's not perfect or is meant only as a temporary workaround before the system is redesigned. I didn't understand your point about atomic increments though. Reputations are updated in multiple different protocols for different peers so the set of connected peers has to be synchronized between the protocols somehow. |
Indeed. In parachain consensus we have peers in the reserved set. I wondered for quite some time how much point reputations really have and whether we should not just ensure that we are processing all (validator) connections fairly and be done with it. E.g. if a validator is flooding us, but we make sure that it only really affects our abilities to process messages from this very validator, then let him flood - who cares? |
Atomic increments need only a |
Afaik it's premature to rip out reputation. It'd just be better if we really understood what reputation does. Also, we know cheaper flavors which maybe suffice:
|
I agree and I'm not suggesting we alter the behavior before we understand what we want from it. Like I mentioned in OP, if we wish to keep the current system, our protocols should be refactored to not update the reputation on every message. For example, the transaction protocol sends two reputation updates per transaction: one to initially decrease the reputation and one to refund/reward after the TX has been validated. I think on the Polkadot side reputations updates are already aggregated to some extent which helps with this situation. I think we should find uses cases for reputations if we want to have that system. Banning is useful but other than that we only use the reputations for outbound connections for non-reserved peers. Maybe that's enough use for reputations? |
* allow delivery confirmations on closed outbound lanes * fix benchmarks compilation
* allow delivery confirmations on closed outbound lanes * fix benchmarks compilation
* allow delivery confirmations on closed outbound lanes * fix benchmarks compilation
* allow delivery confirmations on closed outbound lanes * fix benchmarks compilation
* allow delivery confirmations on closed outbound lanes * fix benchmarks compilation
* allow delivery confirmations on closed outbound lanes * fix benchmarks compilation
* allow delivery confirmations on closed outbound lanes * fix benchmarks compilation
* allow delivery confirmations on closed outbound lanes * fix benchmarks compilation
There is no channel after #2392 was merged, is this still a practical issue or can this be closed now? |
* allow delivery confirmations on closed outbound lanes * fix benchmarks compilation
* allow delivery confirmations on closed outbound lanes * fix benchmarks compilation
* allow delivery confirmations on closed outbound lanes * fix benchmarks compilation
* allow delivery confirmations on closed outbound lanes * fix benchmarks compilation
* allow delivery confirmations on closed outbound lanes * fix benchmarks compilation
* allow delivery confirmations on closed outbound lanes * fix benchmarks compilation
* allow delivery confirmations on closed outbound lanes * fix benchmarks compilation
* allow delivery confirmations on closed outbound lanes * fix benchmarks compilation
* allow delivery confirmations on closed outbound lanes * fix benchmarks compilation
* allow delivery confirmations on closed outbound lanes * fix benchmarks compilation
* allow delivery confirmations on closed outbound lanes * fix benchmarks compilation
* allow delivery confirmations on closed outbound lanes * fix benchmarks compilation
While debugging something else, I looked into how reputation updates are handled in
sc-network
. Currently each protocol reports reputation updates overNetworkPeers::report_peer()
. This sends a message over an unbounded channel toNetworkWorker
which updates the reputation of the peer.With 200 connections, the channel length peaked at 7000+ messages, most of them reputation updates coming from GRANDPA/transactions. This puts a lot of pressure on
NetworkWorker
which is already busy with notifications/requests. ConvertingNetworkService
to update the reputation by directly callingPeerStore
was also not optimal because there is a lot of contention on the mutex protectingPeerStore
which caused the node to become visibly laggier as both GRANDPA and transactions are busy updating the reputation.Code that reports peers should be refactored to be more aware of the cost of updating the reputation and it should try batching the updates if possible. Alternatively, introducing per-protocol reputations (side-stepping contention on the global mutex) which would be combined into one reputation by
PeerStore
at periodic intervals could help with the issue.The text was updated successfully, but these errors were encountered: