Skip to content

Commit

Permalink
safekeeper: don't flush control file on WAL ingest path (#9698)
Browse files Browse the repository at this point in the history
## Problem

The control file is flushed on the WAL ingest path when the commit LSN
advances by one segment, to bound the amount of recovery work in case of
a crash. This involves 3 additional fsyncs, which can have a significant
impact on WAL ingest throughput. This is to some extent mitigated by
`AppendResponse` not being emitted on segment bound flushes, since this
will prevent commit LSN advancement, which will be addressed separately.

## Summary of changes

Don't flush the control file on the WAL ingest path at all. Instead,
leave that responsibility to the timeline manager, but ask it to flush
eagerly if the control file lags the in-memory commit LSN by more than
one segment. This should not cause more than `REFRESH_INTERVAL` (300 ms)
additional latency before flushing the control file, which is
negligible.
  • Loading branch information
erikgrinaker authored Nov 12, 2024
1 parent cc8029c commit 6b19867
Show file tree
Hide file tree
Showing 3 changed files with 13 additions and 11 deletions.
5 changes: 5 additions & 0 deletions libs/utils/src/lsn.rs
Original file line number Diff line number Diff line change
Expand Up @@ -138,6 +138,11 @@ impl Lsn {
self.0.checked_sub(other).map(Lsn)
}

/// Subtract a number, saturating at numeric bounds instead of overflowing.
pub fn saturating_sub<T: Into<u64>>(self, other: T) -> Lsn {
Lsn(self.0.saturating_sub(other.into()))
}

/// Subtract a number, returning the difference as i128 to avoid overflow.
pub fn widening_sub<T: Into<u64>>(self, other: T) -> i128 {
let other: u64 = other.into();
Expand Down
12 changes: 2 additions & 10 deletions safekeeper/src/safekeeper.rs
Original file line number Diff line number Diff line change
Expand Up @@ -979,7 +979,8 @@ where
self.wal_store.flush_wal().await?;
}

// Update commit_lsn.
// Update commit_lsn. It will be flushed to the control file regularly by the timeline
// manager, off of the WAL ingest hot path.
if msg.h.commit_lsn != Lsn(0) {
self.update_commit_lsn(msg.h.commit_lsn).await?;
}
Expand All @@ -992,15 +993,6 @@ where
self.state.inmem.peer_horizon_lsn =
max(self.state.inmem.peer_horizon_lsn, msg.h.truncate_lsn);

// Update truncate and commit LSN in control file.
// To avoid negative impact on performance of extra fsync, do it only
// when commit_lsn delta exceeds WAL segment size.
if self.state.commit_lsn + (self.state.server.wal_seg_size as u64)
< self.state.inmem.commit_lsn
{
self.state.flush().await?;
}

trace!(
"processed AppendRequest of len {}, begin_lsn={}, end_lsn={:?}, commit_lsn={:?}, truncate_lsn={:?}, flushed={:?}",
msg.wal_data.len(),
Expand Down
7 changes: 6 additions & 1 deletion safekeeper/src/timeline_manager.rs
Original file line number Diff line number Diff line change
Expand Up @@ -515,7 +515,12 @@ impl Manager {
return;
}

if state.cfile_last_persist_at.elapsed() > self.conf.control_file_save_interval {
if state.cfile_last_persist_at.elapsed() > self.conf.control_file_save_interval
// If the control file's commit_lsn lags more than one segment behind the current
// commit_lsn, flush immediately to limit recovery time in case of a crash. We don't do
// this on the WAL ingest hot path since it incurs fsync latency.
|| state.commit_lsn.saturating_sub(state.cfile_commit_lsn).0 >= self.wal_seg_size as u64
{
let mut write_guard = self.tli.write_shared_state().await;
// it should be done in the background because it blocks manager task, but flush() should
// be fast enough not to be a problem now
Expand Down

1 comment on commit 6b19867

@github-actions
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5473 tests run: 5240 passed, 3 failed, 230 skipped (full report)


Failures on Postgres 16

  • test_sharded_ingest[github-actions-selfhosted-1]: release-x86-64
  • test_storage_controller_many_tenants[github-actions-selfhosted]: release-x86-64
  • test_compaction_l0_memory[github-actions-selfhosted]: release-x86-64
# Run all failed tests locally:
scripts/pytest -vv -n $(nproc) -k "test_sharded_ingest[release-pg16-github-actions-selfhosted-1] or test_storage_controller_many_tenants[release-pg16-github-actions-selfhosted] or test_compaction_l0_memory[release-pg16-github-actions-selfhosted]"
Flaky tests (4)

Postgres 17

Postgres 15

Code coverage* (full report)

  • functions: 31.8% (7883 of 24826 functions)
  • lines: 49.5% (62418 of 126222 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
6b19867 at 2024-11-12T17:12:25.106Z :recycle:

Please sign in to comment.