[aws-rds] Minimize downtime during DBCluster updates #10595
Comments
P.S. I thought it might be a good idea to add an instanceUpdateBehavior argument and change the behavior accordingly (ROLLING, BULK). |
Hey @hixi-hyi, thanks for opening the issue. This is a very interesting proposal. Pinging @jogold as well, for visibility.
Where does this |
I think a better definition would be Define to
The developer assigns this attribute when creating a DatabaseCluster. |
@hixi-hyi I think I see where you're going with this. So we would add a property to Did I understand your suggestion correctly? |
@skinny85 Yes, you know exactly what I mean. |
I'm glad @hixi-hyi 🙂. Any chance of opening us a PR implementing this? Should only require adding a property here (or perhaps Here's our Contributing guide: https://github.com/aws/aws-cdk/blob/master/CONTRIBUTING.md. Thanks, |
Support defining the instance update behaviour of RDS instances. This allows switching between bulk updates (all instances at once) and rolling updates (one instance after another). Bulk updates are faster, but they carry a higher risk of longer downtime because all instances might be unreachable simultaneously during the update. Rolling updates take longer but ensure that only one instance is updated at a time, so downtime is limited to the (at most two) changes of the primary instance. We keep the current behaviour, namely a bulk update, as the default. This implementation follows proposal A by hixi-hyi in issue aws#10595. Fixes aws#10595
@skinny85 I added a PR for this issue some weeks ago. How long does it usually take for the CDK maintainers to provide feedback on it? Is there anything I can do to speed up the process? |
@mod-enter apologies for the bad experience! Unfortunately, I'm no longer with the CDK team, so I can't review your Pull Request. Perhaps @TheRealAmazonKendra can help with this one? |
No worries, thanks for helping me anyway :) |
Minimize downtime during DB Cluster updates
Current Status
The CfnDBInstances of a DBCluster are currently loosely coupled.
That is, if there are multiple CfnDBInstances, their updates occur at the same time because CloudFormation sees no dependency between them.
As a result, the cluster is unavailable until the DBInstance updates are complete.
Proposal
Add a dependency between the CfnDBInstances.
As a result, the instances are updated one by one as a rolling update, and the only downtime is the time it takes to switch the primary.
In other words, with two instances, an update costs only two failovers. (A)
If we can create the dependency dynamically, an update costs only one failover. (B)
I believe a primary failover is faster than an instance update, so I think this feature would be useful.
However, the stack update time and the maintenance time for offline updates will increase.
What do you think about this proposal?
I'd like to hear your opinion.
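To make the failover counts in (A) and (B) concrete, here is a small sketch that counts failovers for a given update order. It is a simplified model, not RDS behaviour: the function name `countFailovers`, the instance names, and the rule for picking the failover target are all assumptions made for illustration.

```typescript
// Simplified model (an assumption, not actual RDS behaviour): count how many
// primary failovers a one-by-one update causes for a given update order.
function countFailovers(updateOrder: string[], initialPrimary: string): number {
  let primary = initialPrimary;
  let failovers = 0;
  for (const instance of updateOrder) {
    if (instance !== primary) {
      continue; // updating a replica does not move the primary
    }
    // Assumption: the primary role moves to some other instance in the cluster.
    const target = updateOrder.find((other) => other !== instance);
    if (target === undefined) {
      continue; // single-instance cluster: nothing to fail over to
    }
    primary = target;
    failovers += 1;
  }
  return failovers;
}

// (A) fixed order, primary updated first: the primary moves away and back.
const fixedOrder = countFailovers(["instance1", "instance2"], "instance1");
// (B) replica updated first, primary last: the primary moves only once.
const replicaFirst = countFailovers(["instance2", "instance1"], "instance1");
console.log(fixedOrder, replicaFirst); // prints "2 1"
```

In this model the difference between (A) and (B) comes only from the position of the current primary in the update order.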
Proposed Solution (A)
In aws-cdk/packages/@aws-cdk/aws-rds/lib/cluster.ts (line 734 in d95af00), add:
instance.node.addDependency(previous_instance);
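A rough sketch of how that one line could chain the instances so that each waits for the one before it. This does not use the CDK itself: `FakeInstance` and `chainInstances` are hypothetical stand-ins, with `addDependency` playing the role of `instance.node.addDependency(...)`.

```typescript
// Hypothetical stand-in for a CfnDBInstance construct. In the CDK, the
// equivalent call would be instance.node.addDependency(previous_instance).
class FakeInstance {
  readonly id: string;
  readonly dependsOn: FakeInstance[] = [];

  constructor(id: string) {
    this.id = id;
  }

  addDependency(other: FakeInstance): void {
    this.dependsOn.push(other);
  }
}

// Make every instance depend on the one created just before it, so that an
// engine honouring dependencies processes them one after another.
function chainInstances(instances: FakeInstance[]): void {
  let previous: FakeInstance | undefined;
  for (const instance of instances) {
    if (previous !== undefined) {
      instance.addDependency(previous);
    }
    previous = instance;
  }
}

const clusterInstances = [1, 2, 3].map((n) => new FakeInstance(`Instance${n}`));
chainInstances(clusterInstances);
for (const instance of clusterInstances) {
  console.log(instance.id, "depends on", instance.dependsOn.map((d) => d.id));
}
```

The first instance keeps no dependency, so the chain has a single starting point and a fixed order that follows creation order.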
Proposed Solution (B)
I think we would need to use the aws-sdk to determine whether each instance is currently the primary or a replica, but I haven't thought this through in detail.
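As a starting point for (B), the ordering step could look roughly like the sketch below, once the primary is known. The `ClusterMember` shape is an assumption that loosely mirrors the `DBClusterMembers` entries (with their `IsClusterWriter` flag) returned by the RDS DescribeDBClusters API; the member data is made up, and no aws-sdk call is made here.

```typescript
// Assumed shape, loosely based on the DBClusterMembers entries of the RDS
// DescribeDBClusters response. The field names here are illustrative.
interface ClusterMember {
  dbInstanceIdentifier: string;
  isClusterWriter: boolean;
}

// Order the instances so that replicas are updated first and the current
// primary (writer) last, which keeps the update to a single failover.
function replicasFirst(members: ClusterMember[]): string[] {
  const replicas = members.filter((m) => !m.isClusterWriter);
  const writers = members.filter((m) => m.isClusterWriter);
  return [...replicas, ...writers].map((m) => m.dbInstanceIdentifier);
}

// Made-up cluster state: instance1 is currently the primary.
const updateOrder = replicasFirst([
  { dbInstanceIdentifier: "instance1", isClusterWriter: true },
  { dbInstanceIdentifier: "instance2", isClusterWriter: false },
  { dbInstanceIdentifier: "instance3", isClusterWriter: false },
]);
console.log(updateOrder);
```

How to feed such an order back into the dependencies at deploy time is the open part of (B) and is not addressed by this sketch.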
This is a 🚀 Feature Request