-
Notifications
You must be signed in to change notification settings - Fork 492
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
There should be no way of creating branch without specifying LSN, otherwise there is a race #2063
Comments
This problem was mentioned before, see #1371 proper solution requires rather big tweaks in console. E g when compute for the branch is running and you want to create a new branch you need to go to the compute, pick the current lsn here, and pass it to pageserver. I think this needs to be part of a console branching support epic cc @ololobus |
That is racy anyway if compute is under load. Not sure how the API and UI should look like to the end user. Maybe we should always ask for the timestamp to branch? After that we will re-map it to LSN using the pageserver API and finally create a branch Thanks for the reminder |
This just happened again here: https://github.com/neondatabase/neon/runs/7413000700 |
If the WAL arrives at the pageserver slowly, it's possible that the branch is created before all the data on the parent branch have arrived. That results in a failure: test_runner/batch_others/test_tenant_relocation.py:259: in test_tenant_relocation timeline_id_second, current_lsn_second = populate_branch(pg_second, create_table=False, expected_sum=1001000) test_runner/batch_others/test_tenant_relocation.py:133: in populate_branch assert cur.fetchone() == (expected_sum, ) E assert (500500,) == (1001000,) E At index 0 diff: 500500 != 1001000 E Full diff: E - (1001000,) E + (500500,) To fix, specify the LSN to branch at, so that the pageserver will wait for it arrive. See #2063
If the WAL arrives at the pageserver slowly, it's possible that the branch is created before all the data on the parent branch have arrived. That results in a failure: test_runner/batch_others/test_tenant_relocation.py:259: in test_tenant_relocation timeline_id_second, current_lsn_second = populate_branch(pg_second, create_table=False, expected_sum=1001000) test_runner/batch_others/test_tenant_relocation.py:133: in populate_branch assert cur.fetchone() == (expected_sum, ) E assert (500500,) == (1001000,) E At index 0 diff: 500500 != 1001000 E Full diff: E - (1001000,) E + (500500,) To fix, specify the LSN to branch at, so that the pageserver will wait for it arrive. See #2063
Current status:
I think only these things need addressing:
|
This commit cleans up our internal APIs and leaves fewer places where one can forgot to explicitly specify the LSN to branch from. * `neon_local branch` still works without `--ancestor-start-lsn` and the user may bump into a race condition when branching right after SQL query is completed. We should query safekeepers for the most recent `commit_lsn` instead, just like in cloud. * `TimelineCreateRequest` HTTP API for Pageserver now takes requires both `ancestor_timeline_id` and `ancestor_start_lsn` to be simultaneously present or absent. It was allowed to omit start LSN before.
This commit cleans up our internal APIs and leaves fewer places where one can forgot to explicitly specify the LSN to branch from. * `neon_local branch` still works without `--ancestor-start-lsn` and the user may bump into a race condition when branching right after SQL query is completed. We should query safekeepers for the most recent `commit_lsn` instead, just like in cloud. * `TimelineCreateRequest` HTTP API for Pageserver now takes requires both `ancestor_timeline_id` and `ancestor_start_lsn` to be simultaneously present or absent. It was allowed to omit start LSN before.
See #2063 (comment) Also improve control_plane::safekeeper::SafekeeperNode's HTTP API. Previously it was possible for a user to do some SQL queries, branch, and see old data in the branch because pageserver did not get the most recent WAL from safekeepers yet. Now we wait until Pageserver catches up with the most recent data on Safekeepers. An alternative is to `SELECT pg_current_wal_insert_lsn()`, but that requires running branch's compute node in addition to safekeepers. Which in turn invites concurrent writes and worse semantics for branching.
All PRs from the checklist above are created. As this is more of a tech debt issue, closing it now, it requires no further tracking. |
Follow-up of the last failures in #1667 (comment), see also #2062.
Looking at our
neon_local timeline branch
command, it works entirely with pageserver:neon/neon_local/src/main.rs
Lines 690 to 710 in 39d86ed
Moreover, the
test_tenant_relocation
test runs it with--ancestor-branch-name
parameter only, no--ancestor-start-lsn
. So the pageserver decides what LSN to create the branch at. Hence, if the pageserver is not fully caught up with safekeepers, the branch starts a little earlier than it should, truncating parts of the last transaction (effectively erasing it from the second timeline, I guess).For example, it may result in
test_tenant_relocation
failure even before there are relocations: it creates two branches, and the second branch is sometimes created slightly before the first is fully on the pageserver, resulting in divergence: https://app.circleci.com/pipelines/github/neondatabase/neon/7828/workflows/bce7b9dc-e390-4f42-8beb-1a2e81bf2b9d/jobs/80367If so, I argue it's not a flaky test; it's a design bug. We should have precise semantics of what "branching off a named branch" is (probably in terms of
commit_lsn
/flush_lsn
/whatever on Safekeepers), and there should be no way to violate it by misusing the API. For example, there should be no way to call the pageserver branch creation API without specifying the exact parent timeline and the exact LSN in it.I suspect there may be other similar bugs lurking around; the pageserver is probably the source of truth in multiple places (see, e.g., #809). I suspect it would be impossible for our end users to debug unless we somehow expose both safekeepers/pageserver's LSNs, and it would be very hard if we do.
The text was updated successfully, but these errors were encountered: