pageserver: add no_sync for use in regression tests (2/2) #9678

Merged
jcsp merged 2 commits into main from jcsp/pageserver-no-sync-pt2
Nov 13, 2024

Conversation

jcsp
Collaborator

@jcsp jcsp commented Nov 7, 2024

Problem

Follow-up to #9677. This PR enables no_sync in tests, and can be merged once the next release has happened.

Summary of changes

  • Always run pageserver with no_sync = true in tests.

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • Do we need to implement analytics? If so, did you add the relevant metrics to the dashboard?
  • If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Checklist before merging

  • Do not forget to reformat commit message to not include the above checklist

@jcsp jcsp added c/storage/pageserver Component: storage: pageserver a/test Area: related to testing a/tech_debt Area: related to tech debt labels Nov 7, 2024

github-actions bot commented Nov 7, 2024

5473 tests run: 5241 passed, 2 failed, 230 skipped (full report)


Failures on Postgres 16

# Run all failed tests locally:
scripts/pytest -vv -n $(nproc) -k "test_sharded_ingest[release-pg16-github-actions-selfhosted-1] or test_compaction_l0_memory[release-pg16-github-actions-selfhosted]"
Flaky tests (1)

Postgres 17

Code coverage* (full report)

  • functions: 31.8% (7890 of 24834 functions)
  • lines: 49.5% (62439 of 126258 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
b1cdb42 at 2024-11-12T19:35:34.023Z :recycle:

jcsp added a commit that referenced this pull request Nov 8, 2024
## Problem

In test environments, the `syncfs` that the pageserver does on startup
can take a long time, as other tests running concurrently might have
many gigabytes of dirty pages.

## Summary of changes

- Add a `no_sync` option to the pageserver's config.
- Skip the `syncfs` on startup if this option is set.
- A subsequent PR (#9678) will
enable this by default in tests. We need to wait until after the next
release to avoid breaking compat tests, which would fail if we set
no_sync & use an old pageserver binary.
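
The compatibility concern above can be illustrated with a minimal sketch. It assumes, as a guess about the failure mode, that an older pageserver binary rejects config keys it does not recognize; the commit message only says that compat tests would fail. The key names other than `no_sync`, and the `parse_strict` helper, are hypothetical.

```python
# Hypothetical sets of config keys understood by an old and a new binary.
# Only "no_sync" comes from this PR; "listen_pg_addr" is a placeholder.
KNOWN_KEYS_OLD = frozenset({"listen_pg_addr"})
KNOWN_KEYS_NEW = KNOWN_KEYS_OLD | {"no_sync"}


def parse_strict(raw: dict, known_keys: frozenset) -> dict:
    """Accept a parsed config mapping only if every key is recognized."""
    unknown = sorted(set(raw) - known_keys)
    if unknown:
        # Assumed behavior: a strict parser refuses to start on unknown keys.
        raise ValueError(f"unknown config keys: {unknown}")
    return dict(raw)
```

Under this assumption, a test harness that always writes `no_sync` would break any run that uses a binary built before the option existed, which is why enabling it was split into a second PR.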

Q: Why is this a different mechanism than safekeeper, which has a
--no-sync CLI flag?
A: Because the way we manage pageservers in neon_local depends on the
pageserver.toml containing the full configuration, whereas safekeepers
have a config file which is neon-local-specific and can drive a CLI
flag.

Q: Why is the option no_sync rather than sync?
A: For boolean configs with a dangerous value, it's preferable to make
"false" the safe option, so that any downstream future config tooling
that might have a "booleans are false by default" behavior (e.g. golang
structs) is safe by default.
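
A minimal sketch of the "false is the safe default" reasoning, written in Python for illustration rather than in the pageserver's own code. `PageserverConf` and `conf_from_mapping` are hypothetical names; only the `no_sync` field mirrors the option described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageserverConf:
    # Hypothetical stand-in for the real config struct.
    # False means "do sync", so an omitted or zero-valued field stays safe.
    no_sync: bool = False


def conf_from_mapping(raw: dict) -> PageserverConf:
    """Build a config from an already-parsed mapping (e.g. parsed TOML)."""
    return PageserverConf(no_sync=bool(raw.get("no_sync", False)))
```

With this shape, tooling that leaves the field unset produces a config that still syncs; had the option been named `sync`, the same omission would silently disable syncing.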

Q: Why only skip the syncfs, and not all fsyncs?
A: Skipping all fsyncs would require more code changes, and the most
acute problem isn't fsyncs themselves (these just slow down a running
test), it's the syncfs (which makes a pageserver startup slow as a
result of _other_ tests).
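
The startup behavior described in this commit message can be sketched as follows. This is an illustration under stated assumptions, not the pageserver's implementation: `startup_sync` is a hypothetical name, and the sync operation is passed in as a callable so the example does not depend on any particular system call.

```python
from typing import Callable


def startup_sync(no_sync: bool, sync_fn: Callable[[], None]) -> bool:
    """Run the startup filesystem sync unless no_sync is set.

    Returns True if the sync was performed, False if it was skipped.
    Only this one startup sync is gated; individual fsyncs elsewhere
    are not affected by the flag.
    """
    if no_sync:
        return False
    sync_fn()
    return True
```

On a Unix system one could pass `os.sync` as `sync_fn` to try this out; note that `os.sync` flushes all filesystems, which is broader than `syncfs` on a single filesystem.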
@jcsp jcsp force-pushed the jcsp/pageserver-no-sync-pt2 branch from 9e43b43 to 15fa7a9 Compare November 12, 2024 09:30
@jcsp jcsp requested a review from bayandin November 12, 2024 15:38
@jcsp jcsp marked this pull request as ready for review November 12, 2024 15:40
Member

@bayandin bayandin left a comment

Should we also add run-benchmarks label?
As I understand, we expect these changes to help with them as well?

@jcsp jcsp added the run-benchmarks Indicates to the CI that benchmarks should be run for PR marked with this label label Nov 12, 2024
@jcsp jcsp merged commit 7595d3a into main Nov 13, 2024
82 of 84 checks passed
@jcsp jcsp deleted the jcsp/pageserver-no-sync-pt2 branch November 13, 2024 09:17
Labels
a/tech_debt Area: related to tech debt a/test Area: related to testing c/storage/pageserver Component: storage: pageserver run-benchmarks Indicates to the CI that benchmarks should be run for PR marked with this label
2 participants