Skip to content

Commit

Permalink
safekeeper: add initialize_segment variant of `safekeeper_wal_stora…
Browse files Browse the repository at this point in the history
…ge_operation_seconds` (#9691)

## Problem

We don't have a metric capturing the latency of segment initialization.
This can be significant due to fsyncs.

## Summary of changes

Add an `initialize_segment` variant of
`safekeeper_wal_storage_operation_seconds`.
  • Loading branch information
erikgrinaker authored Nov 11, 2024
1 parent 54a1676 commit f63de5f
Show file tree
Hide file tree
Showing 2 changed files with 4 additions and 3 deletions.
2 changes: 1 addition & 1 deletion safekeeper/src/metrics.rs
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,7 @@ pub static WRITE_WAL_SECONDS: Lazy<Histogram> = Lazy::new(|| {
pub static FLUSH_WAL_SECONDS: Lazy<Histogram> = Lazy::new(|| {
register_histogram!(
"safekeeper_flush_wal_seconds",
"Seconds spent syncing WAL to a disk",
"Seconds spent syncing WAL to a disk (excluding segment initialization)",
DISK_FSYNC_SECONDS_BUCKETS.to_vec()
)
.expect("Failed to register safekeeper_flush_wal_seconds histogram")
Expand Down
5 changes: 3 additions & 2 deletions safekeeper/src/wal_storage.rs
Original file line number Diff line number Diff line change
Expand Up @@ -257,6 +257,9 @@ impl PhysicalStorage {
// Try to open existing partial file
Ok((file, true))
} else {
let _timer = WAL_STORAGE_OPERATION_SECONDS
.with_label_values(&["initialize_segment"])
.start_timer();
// Create and fill new partial file
//
// We're using fdatasync during WAL writing, so file size must not
Expand All @@ -274,8 +277,6 @@ impl PhysicalStorage {
});
file.set_len(self.wal_seg_size as u64).await?;

// Note: this doesn't get into observe_flush_seconds metric. But
// segment init should be separate metric, if any.
if let Err(e) = durable_rename(&tmp_path, &wal_file_partial_path, !self.no_sync).await {
// Probably rename succeeded, but fsync of it failed. Remove
// the file then to avoid using it.
Expand Down

1 comment on commit f63de5f

@github-actions
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5355 tests run: 5132 passed, 1 failed, 222 skipped (full report)


Failures on Postgres 17

  • test_remote_storage_backup_and_restore[False-local_fs]: debug-x86-64
# Run all failed tests locally:
scripts/pytest -vv -n $(nproc) -k "test_remote_storage_backup_and_restore[debug-pg17-False-local_fs]"
Flaky tests (2)

Postgres 17

Postgres 16

Test coverage report is not available

The comment gets automatically updated with the latest test results
f63de5f at 2024-11-11T18:06:07.368Z :recycle:

Please sign in to comment.