Improve safekeeper to pageserver protocol #5543
Labels
a/performance
Area: relates to performance of the system
a/scalability
Area: related to scalability
c/storage/safekeeper
Component: storage: safekeeper
Motivation
Currently we use the standard Postgres way to download WAL: `START_REPLICATION`, described in the Postgres docs at https://www.postgresql.org/docs/current/protocol-replication.html
The main problem with it is scalability, because each replication stream uses a separate TCP connection. TCP connections are not cheap to establish and maintain, and the number of concurrent ps<->sk TCP connections is limited by the number of available ports.
The less important problem is protocol extensibility, which can also be improved, but may be omitted in the first iteration since it is not currently a bottleneck.
We have already seen issues with network overloading, which manifested as DNS resolution failures. These issues were reproduced around pageserver/safekeeper restarts in the presence of 10k+ active timelines.
DoD
TCP connections between pageservers and safekeepers should be multiplexed, and there should be `O(1)` real network connections between any specific pageserver and safekeeper.
Implementation ideas
https://neondb.slack.com/archives/C039YKBRZB4/p1683125758173469?thread_ts=1683097406.177109&cid=C039YKBRZB4
Use gRPC. It runs over HTTP/2, which multiplexes many streams within a single TCP connection. It also provides convenient protocol extensibility.
I also think that if we have a gRPC connection between every safekeeper and pageserver, we can use it to bypass the broker when delivering timeline updates from safekeepers to pageservers.
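To make the multiplexing idea concrete, here is a minimal std-only Rust sketch of carrying several logical WAL streams over one byte stream. It is only an illustration: the `Frame` type, the `stream_id` field (standing in for a timeline), and the 8-byte header layout are all hypothetical. This is neither HTTP/2's real framing nor anything from the existing safekeeper or pageserver code; with gRPC, the HTTP/2 layer would do this for us.

```rust
use std::collections::HashMap;

/// Hypothetical frame: a chunk of WAL bytes tagged with a logical stream id
/// (e.g. one id per timeline). Not an existing type in the codebase.
#[derive(Debug, Clone, PartialEq)]
pub struct Frame {
    pub stream_id: u32,
    pub payload: Vec<u8>,
}

/// Append one frame to the shared connection buffer, using an assumed layout:
/// [stream_id: u32 big-endian][payload length: u32 big-endian][payload bytes].
pub fn encode_frame(frame: &Frame, out: &mut Vec<u8>) {
    out.extend_from_slice(&frame.stream_id.to_be_bytes());
    out.extend_from_slice(&(frame.payload.len() as u32).to_be_bytes());
    out.extend_from_slice(&frame.payload);
}

/// Try to decode one frame from the front of `buf`.
/// Returns the frame and the number of bytes consumed, or `None` if the
/// buffer does not yet hold a complete frame.
pub fn decode_frame(buf: &[u8]) -> Option<(Frame, usize)> {
    if buf.len() < 8 {
        return None;
    }
    let mut id = [0u8; 4];
    id.copy_from_slice(&buf[0..4]);
    let mut len = [0u8; 4];
    len.copy_from_slice(&buf[4..8]);
    let end = 8usize.checked_add(u32::from_be_bytes(len) as usize)?;
    if buf.len() < end {
        return None;
    }
    let frame = Frame {
        stream_id: u32::from_be_bytes(id),
        payload: buf[8..end].to_vec(),
    };
    Some((frame, end))
}

/// Demultiplex: split the bytes received on the single shared connection
/// back into per-stream payloads, preserving order within each stream.
pub fn demux(mut buf: &[u8]) -> HashMap<u32, Vec<u8>> {
    let mut streams: HashMap<u32, Vec<u8>> = HashMap::new();
    while let Some((frame, used)) = decode_frame(buf) {
        streams
            .entry(frame.stream_id)
            .or_default()
            .extend_from_slice(&frame.payload);
        buf = &buf[used..];
    }
    streams
}
```

The sender interleaves frames from any number of timelines into one buffer with `encode_frame`, and the receiver recovers each timeline's bytes with `demux`, so the number of real TCP connections stays independent of the number of timelines. A real protocol would also need per-stream flow control and stream open/close signalling, which HTTP/2 already provides.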