Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DPE-5827] Set all nodes to synchronous replicas #672

Open
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

dragomirp
Copy link
Contributor

@dragomirp dragomirp commented Nov 14, 2024

Set all replicas to sync

Copy link

codecov bot commented Nov 14, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 71.38%. Comparing base (2f8eb48) to head (b675d6f).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #672   +/-   ##
=======================================
  Coverage   71.38%   71.38%           
=======================================
  Files          15       15           
  Lines        3253     3254    +1     
  Branches      480      479    -1     
=======================================
+ Hits         2322     2323    +1     
  Misses        812      812           
  Partials      119      119           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@dragomirp dragomirp marked this pull request as ready for review November 15, 2024 20:40
@@ -846,7 +846,7 @@ def update_synchronous_node_count(self, units: int | None = None) -> None:
with attempt:
r = requests.patch(
f"{self._patroni_url}/config",
json={"synchronous_node_count": units // 2},
json={"synchronous_node_count": units - 1},
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

REST calls should also use the new value.

@dragomirp dragomirp requested review from a team, taurus-forever, marceloneppel and lucasgameiroborges and removed request for a team November 15, 2024 21:09
Copy link
Member

@lucasgameiroborges lucasgameiroborges left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM, but why are we setting all nodes to be sync replicas? From the mentioned ticket it should come from a config value?

@dragomirp
Copy link
Contributor Author

Code LGTM, but why are we setting all nodes to be sync replicas? From the mentioned ticket it should come from a config value?

What was discussed was to switch the default behaviour to all sync and add a config for controlling the amount of syncs. The config needs a spec. I'll comment on the ticket to clarify.

Copy link
Member

@marceloneppel marceloneppel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! Testing it, it's working fine in almost all scenarios.

The scenario where it's not working yet is the upgrade:

juju deploy postgresql --channel 14/stable -n 3

# Wait for the units to settle down, then:
curl UNIT-ZERO-IP:8008/cluster | jq # There is one leader, one sync-standby and one replica, as expected.

juju run postgresql/leader pre-upgrade-check

juju refresh --path ./*.charm postgresql

# Wait for the units to settle down, then:
curl UNIT-ZERO-IP:8008/cluster | jq # There is one leader, one sync-standby and one replica, but we should have one leader and two sync-standby only.

juju add-unit postgresql -n 1

# Wait for the units to settle down, then:
curl UNIT-ZERO-IP:8008/cluster | jq # There is one leader, and three sync-standby, as expected.

Copy link
Contributor

@taurus-forever taurus-forever left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, need some time to play it without rush.
Let's merge it after release the current PostgreSQL VM to stable.

@dragomirp
Copy link
Contributor Author

The scenario where it's not working yet is the upgrade:

5082317 should fix the upgrade. Not adding an assertion to the integration tests because functionality will drift when the changes land on edge and stable.

Copy link
Member

@marceloneppel marceloneppel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The scenario where it's not working yet is the upgrade:

5082317 should fix the upgrade. Not adding an assertion to the integration tests because functionality will drift when the changes land on edge and stable.

Thanks, Dragomir! LGTM!

@taurus-forever
Copy link
Contributor

@dragomirp please do not forget to add config option synchronous_node_count as agreed in spec (with detailed description, the same as in spec).

@taurus-forever
Copy link
Contributor

@dragomirp are we planning to include this PR the config option https://docs.google.com/document/d/1-a79fBYZPW2TUsKCGNyKjVWgLQefb9T7AiyV4QqW-y4/edit?tab=t.0 or it will be the separate PR?

@dragomirp
Copy link
Contributor Author

@dragomirp are we planning to include this PR the config option https://docs.google.com/document/d/1-a79fBYZPW2TUsKCGNyKjVWgLQefb9T7AiyV4QqW-y4/edit?tab=t.0 or it will be the separate PR?

I'll create a separate PR to this branch so all is in one place and reviewed, like I did for k8s.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants