kad: Failed to register opened substream to protocol #300

Closed
lexnv opened this issue Dec 10, 2024 · 0 comments · Fixed by #301

lexnv commented Dec 10, 2024

A Kusama validator is logging a high number of `failed to register opened substream to protocol` errors, and these are cascading into other warnings.

The affected validator, kusama-validator-bhs5-0, has been running for approximately 14 days without interruption. Other validators do not seem to be impacted at the moment.

| Repo | Count | Level | Triage report |
|------|-------|-------|---------------|
| https://github.com/paritytech/polkadot-sdk/ | 35638 | warn_if_frequent | `Some network error occurred when fetching erasure chunk` |
| https://github.com/paritytech/litep2p/ | 18736 | error | `failed to register opened substream to protocol` |
| https://github.com/paritytech/polkadot-sdk/ | 475 | warn | `Data unavailable for candidate .*` |
| https://github.com/paritytech/polkadot-sdk/ | 475 | warn | `Recovery of available data failed.` |
| https://github.com/paritytech/polkadot-sdk/ | 9 | warn | `fetch_pov_job` |
| https://github.com/paritytech/polkadot-sdk/ | 9 | warn | `Cluster has too many pending statements, something wrong with our connection to our group peers` |
| https://github.com/paritytech/polkadot-sdk/ | 3 | warn | `Report .*: .* to .*. Reason: .*. Banned, disconnecting. ( Same block request multiple times. Banned, disconnecting.)` |
| https://github.com/paritytech/polkadot-sdk/ | 2 | warn | `Report .*: .* to .*. Reason: .*. Banned, disconnecting. ( A collator provided a collation for the wrong para. Banned, disconnecting.)` |
| https://github.com/paritytech/litep2p/ | 1 | error | `failed to register substream open failure to protocol` |
| https://github.com/paritytech/polkadot-sdk/ | 1 | warn | `.*: .* is already a reserved peer` |
| https://github.com/paritytech/litep2p/ | 1 | error | `failed to send mdns query` |

In this case, the errors come from the TCP connection failing to propagate opened substreams to the Kademlia protocol:

2024-12-10 16:00:00.096 ERROR tokio-runtime-worker litep2p::tcp::connection: failed to register opened substream to protocol protocol=Allocated("/b0a8d493285c2df73290dfb7e61f870f17b41801197a149ca93654499ea3dafe/kad") peer=PeerId("12D3KooWSKhrnPPYmAAfpA8TzE6NLkDooMzm7bApLYk86TNBmfcp") endpoint=Listener { address: "/ip6/2a02:1210:821b:7f00:ce28:aaff:fe0f:2762/tcp/37594", connection_id: ConnectionId(31845931) } error=ConnectionClosed
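
To make the failure mode in the log line above concrete, here is a toy model of handing an opened substream to a protocol over a channel. It is only a sketch and not litep2p's code: `Substream`, `RegisterError`, `register_opened_substream`, and `demo` are hypothetical names, and it assumes the hand-off fails because the receiving side has already gone away, which is one plausible reading of `error=ConnectionClosed`.

```rust
use std::sync::mpsc::{channel, Sender};

/// Hypothetical stand-in for an opened substream; not a litep2p type.
#[derive(Debug)]
struct Substream {
    id: u64,
}

/// Hypothetical error type mirroring the `error=ConnectionClosed` field in the log.
#[derive(Debug, PartialEq)]
enum RegisterError {
    ConnectionClosed,
}

/// Hand an opened substream to the protocol's event channel.
/// `send` on a std channel fails only when the receiving half has been dropped.
fn register_opened_substream(
    to_protocol: &Sender<Substream>,
    substream: Substream,
) -> Result<(), RegisterError> {
    to_protocol
        .send(substream)
        .map_err(|_| RegisterError::ConnectionClosed)
}

/// Exercise both cases: receiver alive, then receiver dropped.
/// Returns (result while alive, id seen by the receiver, result after drop).
fn demo() -> (Result<(), RegisterError>, u64, Result<(), RegisterError>) {
    let (tx, rx) = channel();

    // The protocol side is still listening, so the hand-off succeeds.
    let alive = register_opened_substream(&tx, Substream { id: 1 });
    let received_id = rx.recv().expect("substream was just sent").id;

    // The protocol side is gone, so the hand-off can no longer succeed.
    drop(rx);
    let closed = register_opened_substream(&tx, Substream { id: 2 });

    (alive, received_id, closed)
}
```

In this model the substream itself opened fine; the error only says that nobody was left to receive it, which would be consistent with the connection being torn down between the substream opening and the hand-off.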

We are also seeing an unexpected `failed to send mdns query` error. The mDNS query is sent to the multicast address 224.0.0.251, and the operating system reported the destination as NetworkUnreachable. This might point to an issue with the network interfaces of the instance:

tokio-runtime-worker litep2p::mdns: failed to send mdns query error=IoError(NetworkUnreachable)
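
A quick way to check whether the host can reach the mDNS multicast group at all, independent of litep2p, is to send a single UDP datagram to it with the standard library. This is a minimal sketch under a few assumptions: the host is Linux (where `ENETUNREACH` is errno 101), the standard mDNS port 5353 is used, and `probe_mdns_multicast`, `diagnose`, and `report` are hypothetical helper names. The payload is an empty 12-byte DNS header, since its content does not matter for the routing check.

```rust
use std::io;
use std::net::UdpSocket;

/// IPv4 mDNS multicast group and port (RFC 6762).
const MDNS_GROUP: &str = "224.0.0.251:5353";

/// Linux errno for "Network is unreachable" (ENETUNREACH).
/// Assumption: the probe runs on a Linux host.
const ENETUNREACH_LINUX: i32 = 101;

/// Try to send one datagram to the mDNS multicast group.
/// Returns the number of bytes sent, or the OS error from `send_to`.
fn probe_mdns_multicast(payload: &[u8]) -> io::Result<usize> {
    let socket = UdpSocket::bind("0.0.0.0:0")?;
    socket.send_to(payload, MDNS_GROUP)
}

/// Map a send error to a short diagnosis string.
fn diagnose(err: &io::Error) -> &'static str {
    match err.raw_os_error() {
        Some(ENETUNREACH_LINUX) => "network unreachable: no route to the multicast group",
        _ => "other I/O error",
    }
}

/// Run the probe once and describe the outcome.
fn report() -> String {
    match probe_mdns_multicast(&[0u8; 12]) {
        Ok(n) => format!("sent {n} bytes to {MDNS_GROUP}"),
        Err(e) => format!("send failed: {} ({e})", diagnose(&e)),
    }
}
```

If `report()` also fails with the network-unreachable diagnosis when run on the instance, the problem likely sits in the host's routing or interface configuration rather than in the mDNS code.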

cc @paritytech/networking

lexnv added a commit that referenced this issue Dec 11, 2024
This PR replaces the identify `FuturesUnordered` with `FuturesStream`.
This effectively fixes delays in processing outbound events.
- ensure that identify warns if the transport service is closed
(produces no events).
- identify no longer exits on pending outbound events

Related to:
- #287
- #300

cc @paritytech/networking

---------

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Co-authored-by: Dmitry Markin <dmitry@markin.tech>
lexnv closed this as completed in ef495b8 on Dec 12, 2024