What happens when deploying a 'good' build when the service is already fully scaled up?

In this test we went from application version ABC to XYZ in the stack:

ScalingAsgRollingUpdate (CFN stack playground-CODE-scaling-asg-rolling-update)

The main aim of this test was to establish whether deploying whilst a service is fully scaled up works as desired.

Highlights

The current implementation temporarily scales the service down during a deployment.

When the service is fully scaled up, the number of instances it is scaled down by is maximumCapacity - minimumCapacity (9 - 3 = 6 in this test), so this could be a significant drop for a service with a high maximum capacity.
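To make the size of the drop concrete, here is a minimal sketch of the arithmetic. The function names are hypothetical, and the example values (minimum capacity 3, maximum capacity 9) are assumed from the instance counts seen in this test rather than read from the stack definition.

```python
def capacity_drop(minimum_capacity: int, maximum_capacity: int) -> int:
    """Instances temporarily lost when deploying while fully scaled up.

    Assumes the service is running at maximum_capacity when the
    deployment starts (the scenario described in this document).
    """
    if minimum_capacity < 0 or maximum_capacity < minimum_capacity:
        raise ValueError("expected 0 <= minimum_capacity <= maximum_capacity")
    return maximum_capacity - minimum_capacity


def remaining_fraction(minimum_capacity: int, maximum_capacity: int) -> float:
    """Fraction of the fully scaled fleet still serving traffic during the drop."""
    if maximum_capacity <= 0:
        raise ValueError("maximum_capacity must be positive")
    in_service = maximum_capacity - capacity_drop(minimum_capacity, maximum_capacity)
    return in_service / maximum_capacity


# Values assumed from this test: the service scaled from 3 to 9 instances.
drop = capacity_drop(3, 9)
fraction = remaining_fraction(3, 9)
print(f"Temporarily lose {drop} instances; {fraction:.0%} of the fleet remains")
```

With these assumed values the service loses 6 instances and is left with a third of its fleet. For a service with a minimum of 3 and a much higher maximum, the remaining fraction would be correspondingly smaller.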

Timeline

Build number 98 was deployed to start the test from a clean state (running build ABC).
The service was scaled up by repeatedly invoking our scale-out script.
The service scaled up from 3 to 9 instances.
Build number 100 was deployed, which updates the service to build XYZ.
The CFN stack playground-CODE-scaling-asg-rolling-update started updating:

First:

Rolling update initiated. Terminating 9 obsolete instance(s) in batches of 6, while keeping at least 3 instance(s) in service. Waiting on resource signals with a timeout of PT5M when new instances are added to the autoscaling group.

Then 6 instances are terminated and 6 new ones are launched:

Terminating instance(s) [i-0333b7c2687c1ab46,i-04427ad5d2e5aa426,i-009b2c94810830dc5,i-0357d971d597edbbc,i-087ddecad98eebd05,i-047dbb0efa5bf5123]; replacing with 6 new instance(s).

At this point we are under-provisioned by 6 instances.
6 SUCCESS signals are received. At this point we are provisioned correctly again.
3 more instances are terminated and 3 more are launched:

Terminating instance(s) [i-07b18ed78618ef26a,i-0f94470f722e91778,i-0dc27b65fc7911afe]; replacing with 3 new instance(s).

At this point we are under-provisioned by 3 instances.
3 SUCCESS signals are received and the deployment completes. At this point we are provisioned correctly again.

Unfortunately, this means that the deployment temporarily leaves us with only 3 instances serving traffic (and later 6 instances) when we really need 9 to cope with the load (see healthy hosts panel).
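The batch sizes and in-service counts in the timeline above can be reproduced with a simplified model. This is only a sketch: it is not CloudFormation's actual algorithm, the function and parameter names are hypothetical, and it assumes each batch is fully replaced (all SUCCESS signals received) before the next batch is terminated.

```python
from typing import List, Tuple


def simulate_rolling_update(
    obsolete: int, max_batch_size: int, min_in_service: int
) -> List[Tuple[int, int]]:
    """Return (terminated, in_service_while_waiting) for each batch.

    Simplified model: the group starts with `obsolete` instances, and the
    group size is restored once every replacement in a batch has signalled.
    """
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    if not 0 <= min_in_service < obsolete:
        raise ValueError("min_in_service must be below the group size")

    group_size = obsolete
    remaining = obsolete
    steps: List[Tuple[int, int]] = []
    while remaining > 0:
        # Never terminate more than the batch limit, more than is left to
        # replace, or so many that we drop below the in-service floor.
        batch = min(max_batch_size, remaining, group_size - min_in_service)
        steps.append((batch, group_size - batch))
        remaining -= batch
    return steps


# Values taken from the "Rolling update initiated" message in this test.
for terminated, in_service in simulate_rolling_update(9, 6, 3):
    print(f"terminated {terminated}, in service while waiting: {in_service}")
```

For the values in this test (9 obsolete instances, batches of 6, at least 3 in service) the model yields two batches: 6 terminated with 3 left in service, then 3 terminated with 6 left in service, which matches the observed timeline.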

Full details can be seen via the dashboard.

Potential Mitigations

See the potential mitigations described in the partially-scaled scenario, which also apply here.
