Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

yamux: Switch to upstream implementation while keeping the controller API #320

Merged
merged 9 commits into from
Jan 29, 2025

Conversation

lexnv
Copy link
Collaborator

@lexnv lexnv commented Jan 23, 2025

This PR relies on the libp2p-yamux crate for the core functionality of our multiplexer.
The main goal is to bring complete compatibility between libp2p and litep2p on the yamux layer, and remove 90% of the yamux code in favor of the upstream implementation while keeping the controller API in place.

The upstream crate brings in multiple fixes for substreams while minimally impacting our dependency tree.

The downside of the upstream implementation is that the controller API has been removed. Adjusting to the new API would be a massive breaking change for all transport layers. Therefore, we keep the controller API which integrates seamlessly with the upstream yamux. No other changes were present to the controller API in the upstream implementation.

Yamux changelog

The changelog includes the fixes from the upstream since the moment we have inlined the crate in litep2p:

# 0.13.4

- Fix sending pending frames after closing. See [PR 194](https://github.com/libp2p/rust-yamux/pull/194).

# 0.13.3

- Wake up readers after setting the state to RecvClosed to not miss EOF.
  See [PR 190](https://github.com/libp2p/rust-yamux/pull/190).

- Use `web-time` instead of `instant`.
  See [PR 191](https://github.com/libp2p/rust-yamux/pull/191).

# 0.13.2

- Bound `Active`'s `pending_frames` to enforce backpressure. 
  See [460baf2](https://github.com/libp2p/rust-yamux/commit/460baf2ccb7d5982b266cb3cb9c0bdf75b4fb779)
  
# 0.13.1

- Fix WASM support using `instant::{Duration, Instant}` instead of `std::time::{Duration, Instant}`.
  See [PR 179](https://github.com/libp2p/rust-yamux/pull/179).

# 0.13.0

- Introduce dynamic stream receive window auto-tuning.
  While low-resourced deployments maintain the benefit of small buffers, high resource deployments eventually end-up with a window of roughly the bandwidth-delay-product (ideal) and are thus able to use the entire available bandwidth.
  See [PR 176](https://github.com/libp2p/rust-yamux/pull/176) for performance results and details on the implementation.
- Remove `WindowUpdateMode`.
  Behavior will always be `WindowUpdateMode::OnRead`, thus enabling flow-control and enforcing backpressure.
  See [PR 178](https://github.com/libp2p/rust-yamux/pull/178).

# 0.12.1

- Deprecate `WindowUpdateMode::OnReceive`.
  It does not enforce flow-control, i.e. breaks backpressure.
  Use `WindowUpdateMode::OnRead` instead.
  See [PR #177](https://github.com/libp2p/rust-yamux/pull/177).

# 0.12.0

- Remove `Control` and `ControlledConnection`.
  Users have to move to the `poll_` functions of `Connection`.
  See [PR #164](https://github.com/libp2p/rust-yamux/pull/164).

- Fix a bug where `Stream`s would not be dropped until their corresponding `Connection` was dropped.
  See [PR #167](https://github.com/libp2p/rust-yamux/pull/167).

Next Steps

  • deployment in versi-net and monitor metrics / CPU impact (extensively test this)

cc @paritytech/networking

lexnv added 6 commits January 23, 2025 11:39
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Copy link
Collaborator

@dmitry-markin dmitry-markin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! This simplifies things a lot! 🎉

@lexnv lexnv merged commit 42dbb9c into master Jan 29, 2025
8 checks passed
@lexnv lexnv deleted the lexnv/update-upstream-yamux branch January 29, 2025 12:21
@altonen
Copy link
Contributor

altonen commented Jan 29, 2025

@lexnv

Did enabling auto-tuning improve performance in any significant way?

@lexnv
Copy link
Collaborator Author

lexnv commented Jan 30, 2025

Hi, I've only noticed an improvement in our synthetic benchmarks for the notification protocol on litep2p/with_backpressure/256KB

test notifications_protocol/libp2p/with_backpressure/64KB ... bench:    44405699 ns/iter (+/- 268459)
test notifications_protocol/libp2p/with_backpressure/256KB ... bench:   353299153 ns/iter (+/- 2417926)
 
test notifications_protocol/litep2p/with_backpressure/64KB ... bench:    47341171 ns/iter (+/- 398339)
test notifications_protocol/litep2p/with_backpressure/256KB ... bench:   342768895 ns/iter (+/- 3704768)

Before this PR we saw the following:

1k messages
notifications_protocol/libp2p/with_backpressure/64KB:    44421256 ns/iter (+/- 309026)
notifications_protocol/libp2p/with_backpressure/256KB:   359526580 ns/iter (+/- 2611825)
notifications_protocol/litep2p/with_backpressure/64KB:    46879477 ns/iter (+/- 327248)
notifications_protocol/litep2p/with_backpressure/256KB:   413041795 ns/iter (+/- 4181491)

This reflects a ~20% improvement on this benchmark (before and after). However, I'm not entirely sure about the stability of the data we get out of this. I've not seen anything else in versi-network. CPU-wise, we look to be in the same ballpark. I would still like to run some tests.

The update's main goal was to filter out compatibility issues between libp2p <-> litep2p.
We have discovered on Kusama validators that sometimes libp2p -> litep2p requests would generate a SubstreamUpgrade::Io error, and this was triggering an ugly issue in libp2p where libp2p was not taking into account the protocol timeout

@altonen
Copy link
Contributor

altonen commented Jan 30, 2025

Thanks for the info!

I'm surprised that libp2p is faster, considering I spent a quite a bit of time optimizing the notification paths in litep2p and have a recollection of it being faster, albeit in microbenchmarks. Granted the architectures are very different so could be that this is the price litep2p pays for all the message passing it does or my benchmarks were incorrect. AFAIR the receiver side is quite copy-heavy so if these numbers are unacceptable, I think there are still ways to optimize further.

lexnv added a commit that referenced this pull request Feb 20, 2025
## [0.9.1] - 2025-01-19

This release enhances compatibility between litep2p and libp2p by
backporting the latest Yamux updates. Additionally, it includes various
improvements and fixes to boost the stability and performance of the
WebSocket stream and the multistream-select protocol.

### Changed

- yamux: Switch to upstream implementation while keeping the controller
API ([#320](#320))
- req-resp: Replace SubstreamSet with FuturesStream
([#321](#321))
- cargo: Bring up to date multiple dependencies
([#324](#324))
- build(deps): bump hickory-proto from 0.24.1 to 0.24.3
([#323](#323))
- build(deps): bump openssl from 0.10.66 to 0.10.70
([#322](#322))

### Fixed

- websocket/stream: Fix unexpected EOF on `Poll::Pending` state
poisoning ([#327](#327))
- websocket/stream: Avoid memory allocations on flushing
([#325](#325))
- multistream-select: Enforce `io::error` instead of empty protocols
([#318](#318))
- multistream: Do not wait for negotiation in poll_close
([#319](#319))

cc @paritytech/networking

---------

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
github-merge-queue bot pushed a commit to paritytech/polkadot-sdk that referenced this pull request Feb 20, 2025
This PR updates litep2p to version 0.9.1. The yamux config is entirely
removed to mirror the libp2p yamux upstream version.
While at it, I had to bump indexmap and URL as well. 


## [0.9.1] - 2025-01-19

This release enhances compatibility between litep2p and libp2p by using
the latest Yamux upstream version. Additionally, it includes various
improvements and fixes to boost the stability and performance of the
WebSocket stream and the multistream-select protocol.

### Changed

- yamux: Switch to upstream implementation while keeping the controller
API ([#320](paritytech/litep2p#320))
- req-resp: Replace SubstreamSet with FuturesStream
([#321](paritytech/litep2p#321))
- cargo: Bring up to date multiple dependencies
([#324](paritytech/litep2p#324))
- build(deps): bump hickory-proto from 0.24.1 to 0.24.3
([#323](paritytech/litep2p#323))
- build(deps): bump openssl from 0.10.66 to 0.10.70
([#322](paritytech/litep2p#322))

### Fixed

- websocket/stream: Fix unexpected EOF on `Poll::Pending` state
poisoning ([#327](paritytech/litep2p#327))
- websocket/stream: Avoid memory allocations on flushing
([#325](paritytech/litep2p#325))
- multistream-select: Enforce `io::error` instead of empty protocols
([#318](paritytech/litep2p#318))
- multistream: Do not wait for negotiation in poll_close
([#319](paritytech/litep2p#319))

cc @paritytech/networking

---------

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
github-actions bot pushed a commit to paritytech/polkadot-sdk that referenced this pull request Feb 20, 2025
This PR updates litep2p to version 0.9.1. The yamux config is entirely
removed to mirror the libp2p yamux upstream version.
While at it, I had to bump indexmap and URL as well.

## [0.9.1] - 2025-01-19

This release enhances compatibility between litep2p and libp2p by using
the latest Yamux upstream version. Additionally, it includes various
improvements and fixes to boost the stability and performance of the
WebSocket stream and the multistream-select protocol.

### Changed

- yamux: Switch to upstream implementation while keeping the controller
API ([#320](paritytech/litep2p#320))
- req-resp: Replace SubstreamSet with FuturesStream
([#321](paritytech/litep2p#321))
- cargo: Bring up to date multiple dependencies
([#324](paritytech/litep2p#324))
- build(deps): bump hickory-proto from 0.24.1 to 0.24.3
([#323](paritytech/litep2p#323))
- build(deps): bump openssl from 0.10.66 to 0.10.70
([#322](paritytech/litep2p#322))

### Fixed

- websocket/stream: Fix unexpected EOF on `Poll::Pending` state
poisoning ([#327](paritytech/litep2p#327))
- websocket/stream: Avoid memory allocations on flushing
([#325](paritytech/litep2p#325))
- multistream-select: Enforce `io::error` instead of empty protocols
([#318](paritytech/litep2p#318))
- multistream: Do not wait for negotiation in poll_close
([#319](paritytech/litep2p#319))

cc @paritytech/networking

---------

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
(cherry picked from commit 42e9de7)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants