[DocDB] Make statistic 'log_sync_latency' only be accounted when disk synchronization happens #11039

Closed
fritshoogland-yugabyte opened this issue Jan 9, 2022 · 1 comment
Assignees
Labels
area/docdb YugabyteDB core features kind/bug This issue is a bug priority/medium Medium priority issue

Comments

@fritshoogland-yugabyte

fritshoogland-yugabyte commented Jan 9, 2022

Jira Link: DB-717

Description

Currently, every write of a WAL entry executes yb::log::Log::Sync.
To optimize WAL write latency, only selected invocations of yb::log::Log::Sync actually perform fsync() (or fdatasync() in the future). However, the log_sync_latency statistic is updated on every invocation, including "empty" invocations that did not execute fsync(). This means the statistic's count reflects the number of calls to the function rather than the number of actual disk syncs, and its timing is an average over calls that skip the sync and calls that perform it. That average has no useful meaning.
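
To illustrate why the mixed average is misleading, here is a minimal sketch. `SyncCall` and `MeanLatencyUs` are hypothetical names made up for this example and are not part of the YugabyteDB code base; the latencies used below are invented numbers, not measurements.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One hypothetical invocation of a Sync-like function (illustrative only).
struct SyncCall {
  std::int64_t elapsed_us;  // time spent inside the call, in microseconds
  bool did_disk_sync;       // true only if fsync()/fdatasync() was executed
};

// Mean latency in microseconds, either over all calls or over only the
// calls that really synchronized to disk. Returns 0.0 if nothing matches.
double MeanLatencyUs(const std::vector<SyncCall>& calls, bool only_disk_syncs) {
  assert(!calls.empty());
  std::int64_t total = 0;
  std::int64_t count = 0;
  for (const SyncCall& c : calls) {
    if (only_disk_syncs && !c.did_disk_sync) continue;  // skip "empty" calls
    total += c.elapsed_us;
    ++count;
  }
  return count == 0 ? 0.0
                    : static_cast<double>(total) / static_cast<double>(count);
}
```

With nine "empty" calls of 1 µs each and one real sync of 1000 µs, the mean over all calls is 100.9 µs, while the mean over real disk syncs is 1000 µs. The first number describes neither kind of call.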

@fritshoogland-yugabyte fritshoogland-yugabyte added the area/docdb YugabyteDB core features label Jan 9, 2022
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue labels Jun 8, 2022
@rthallamko3
Contributor

@basavaraj29 to fix the metric, given that he is working in that area.

basavaraj29 added a commit that referenced this issue Jun 17, 2022
…al disk synchronization happens

Summary: Not all calls to Log::Sync perform the actual disk synchronization (fsync/fdatasync). An actual sync happens either when the amount of unsynced data reaches a certain threshold (controlled by gflag `bytes_durable_wal_write_mb`) or when the time interval since the last synced entry exceeds the threshold set by gflag `interval_durable_wal_write_ms`. In the existing implementation, the metric `log_sync_latency` is updated each time Log::Sync is called, so it is incremented both when an actual fsync happens and when it doesn't. Since this doesn't give us the actual latency of performing the disk synchronization, this change updates the metric only when the actual fsync operation is performed.
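
The behavior described in the summary could be sketched roughly as follows. This is not the code from the commit: `WalSyncer`, `LatencyMetric`, the constructor parameters, and the `fake_sync_cost_us` argument are all hypothetical, and the two thresholds merely stand in for the values that the gflags `bytes_durable_wal_write_mb` and `interval_durable_wal_write_ms` would supply. The caller passes in the current time so the sketch stays deterministic.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-in for a latency histogram (not the YugabyteDB metric class).
struct LatencyMetric {
  std::vector<std::int64_t> samples_us;
  void Increment(std::int64_t micros) { samples_us.push_back(micros); }
};

// Hypothetical sketch of a sync policy driven by a size and a time threshold.
class WalSyncer {
 public:
  WalSyncer(std::size_t bytes_threshold, std::chrono::milliseconds interval,
            Clock::time_point start)
      : bytes_threshold_(bytes_threshold), interval_(interval), last_sync_(start) {
    assert(bytes_threshold_ > 0);
  }

  // Called after an entry is appended. Returns true if a disk sync was done.
  // fake_sync_cost_us stands in for the measured duration of fsync()/fdatasync().
  bool Sync(std::size_t appended_bytes, Clock::time_point now,
            std::int64_t fake_sync_cost_us, LatencyMetric* metric) {
    unsynced_bytes_ += appended_bytes;
    const bool size_reached = unsynced_bytes_ >= bytes_threshold_;
    const bool interval_exceeded = (now - last_sync_) > interval_;
    if (!size_reached && !interval_exceeded) {
      return false;  // no disk sync happened, so the metric is left untouched
    }
    // A real implementation would call fsync()/fdatasync() here and time it.
    metric->Increment(fake_sync_cost_us);
    unsynced_bytes_ = 0;
    last_sync_ = now;
    return true;
  }

 private:
  const std::size_t bytes_threshold_;
  const std::chrono::milliseconds interval_;
  Clock::time_point last_sync_;
  std::size_t unsynced_bytes_ = 0;
};
```

The point of the sketch is the early return: a call that skips the disk sync leaves the metric alone, so every recorded sample corresponds to one real synchronization.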

Test Plan: Jenkins

Reviewers: fhoogland, sergei, amitanand, rthallam

Reviewed By: amitanand, rthallam

Subscribers: ybase, bogdan

Differential Revision: https://phabricator.dev.yugabyte.com/D17579