schemeboard: pass describe-result as an opaque payload #2391

Conversation

ijon
Collaborator

@ijon ijon commented Mar 1, 2024

Cherry-pick 3819aed from main (#2083).

Changelog entry

Make schemeboard replicas consume less CPU, especially when processing rapid updates for tables with a huge number of partitions.

Changelog category

  • Improvement

Additional information

Schema information about a path exists in the form of a DescribeSchemeResult object: schemeshard generates these objects and publishes them to the schemeboard, and the schemeboard notifies the scheme-caches on the nodes about path info changes. So schemeshard generates DescribeSchemeResult and scheme-cache serves DescribeSchemeResult to the consumers, but the schemeboard components in between do not require the full content of a TEvDescribeSchemeResult to operate efficiently.

This update enables the schemeboard to pass DescribeSchemeResult through as an opaque payload rather than as a fully typed protobuf object, which avoids unnecessary memory management and serialization/deserialization overhead.

Cherry-pick 3819aed from main (ydb-platform#2083).

Change type of `{TEvUpdate,TEvNotify}.DescribeSchemeResult` from transparent
`TEvDescribeSchemeResult` to opaque `bytes` and support that throughout
Populator, Replica, Subscriber actors.

A properly typed TEvDescribeSchemeResult induces additional overhead, because
the message is automatically serialized and deserialized when transferred over
the wire.
This performance cost is usually either negligible or imperceptible.
But in specific situations, particularly when rapidly updating partitioning
information for tables with a huge number of shards, this overhead can lead
to significant issues: schemeboard replicas can get overloaded and become
unresponsive to further requests. This is especially problematic because
the schemeboard subsystem services all databases within a cluster, which
makes it a single point of failure (SPOF).

The core realization is that the schemeboard components do not require
the full content of a TEvDescribeSchemeResult message to operate efficiently.
Only a limited set of fields (path, path-id, version and info about
subdomain/database) is required for processing.
The whole TEvDescribeSchemeResult can therefore be passed through as an opaque payload.
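
The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual C++ actor code: `OpaqueUpdate`, `ReplicaSketch` and the simple "newer version wins" rule are hypothetical names and simplifications introduced here. The point is that the intermediate component reads only the small routing fields and never parses the payload.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class OpaqueUpdate:
    """Hypothetical stand-in for an update travelling through the schemeboard."""
    path: str
    path_id: Tuple[int, int]  # assumed shape: (owner id, local path id)
    version: int
    payload: bytes            # serialized describe-result, never parsed here


class ReplicaSketch:
    """Keeps the newest payload per path and hands it on unchanged."""

    def __init__(self) -> None:
        self._by_path: Dict[str, OpaqueUpdate] = {}

    def handle_update(self, update: OpaqueUpdate) -> bool:
        # Only the version field is inspected; the payload stays a blob.
        current = self._by_path.get(update.path)
        if current is not None and current.version >= update.version:
            return False  # stale or duplicate update, ignored
        self._by_path[update.path] = update
        return True

    def notify(self, path: str) -> Optional[bytes]:
        # Subscribers receive exactly the bytes the publisher sent.
        update = self._by_path.get(path)
        return None if update is None else update.payload
```

In this sketch the cost of handling an update does not depend on how many partitions the described table has, since the payload is stored and forwarded without being decoded.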

Changing the field type from TEvDescribeSchemeResult to bytes without changing
the field number is a safe move. Protobuf encodes both a nested message and
a bytes field as a length-delimited value, so the actual value of the field
remains unchanged at the wire protocol level.
Thus, older implementations will interpret the payload as
a TEvDescribeSchemeResult message and proceed with deserialization as usual,
while newer implementations will recognize the data as a binary blob and will
deserialize it explicitly only when necessary.
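
The wire-level argument can be checked with a small hand-rolled encoder. The sketch below uses only the Python standard library and implements the protobuf varint and length-delimited encodings directly. The field numbers (1 for an inner path string, 2 for the describe-result field) are hypothetical and chosen for illustration; they are not taken from the real .proto files.

```python
from typing import Tuple

LENGTH_DELIMITED = 2  # wire type shared by bytes, strings and nested messages


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a varint starting at pos; return (value, next position)."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_length_delimited(field_number: int, body: bytes) -> bytes:
    """Write tag, length and body; the declared field type plays no role."""
    tag = (field_number << 3) | LENGTH_DELIMITED
    return encode_varint(tag) + encode_varint(len(body)) + body


def read_length_delimited(buf: bytes) -> Tuple[int, bytes]:
    """Read one length-delimited field; return (field number, raw body)."""
    tag, pos = decode_varint(buf, 0)
    if tag & 0x7 != LENGTH_DELIMITED:
        raise ValueError("not a length-delimited field")
    length, pos = decode_varint(buf, pos)
    return tag >> 3, buf[pos:pos + length]


# Inner message with a single string field (hypothetical field 1).
inner = encode_length_delimited(1, b"/Root/table")
# Outer field (hypothetical field 2) carrying the serialized inner message.
outer = encode_length_delimited(2, inner)

# "New" reader: treats field 2 as bytes and keeps the blob untouched.
field_number, blob = read_length_delimited(outer)
# "Old" reader: treats the same bytes as a nested message and parses it.
inner_field, path = read_length_delimited(blob)
```

Here `blob` is byte-for-byte equal to `inner`, and parsing it yields the original path. The encoder never needs to know whether field 2 was declared as a message or as bytes, which is why the two declarations can coexist on the wire.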

KIKIMR-14948
@ijon ijon requested a review from a team as a code owner March 1, 2024 15:15

github-actions bot commented Mar 1, 2024

2024-03-01 15:16:28 UTC Pre-commit check for f9a3938 has started.
2024-03-01 15:16:31 UTC Build linux-x86_64-release-asan is running...
🟢 2024-03-01 15:40:57 UTC Build successful.
2024-03-01 15:41:10 UTC Tests are running...
🔴 2024-03-01 17:19:25 UTC Some tests failed, follow the links below.

Test history

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
|-------|--------|--------|--------|---------|--------|
| 16045 | 15938  | 0      | 23     | 56      | 28     |


github-actions bot commented Mar 1, 2024

2024-03-01 15:16:29 UTC Pre-commit check for f9a3938 has started.
2024-03-01 15:16:32 UTC Build linux-x86_64-relwithdebinfo is running...
🟢 2024-03-01 15:43:56 UTC Build successful.
2024-03-01 15:44:11 UTC Tests are running...
🔴 2024-03-01 17:08:32 UTC Some tests failed, follow the links below.

Test history

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
|-------|--------|--------|--------|---------|--------|
| 60362 | 50954  | 0      | 1      | 9340    | 67     |

@ijon ijon merged commit a607285 into ydb-platform:stable-24-1 Mar 5, 2024
2 of 4 checks passed
@ijon ijon deleted the merge/24-1/schemeboard-opaque-describeresult branch March 5, 2024 15:34
@mregrock mregrock mentioned this pull request May 15, 2024
This was referenced Jun 7, 2024
@CyberROFL CyberROFL mentioned this pull request Jun 19, 2024
2 participants