
Do not remove replica field if set to zero #1305

Merged
merged 2 commits into master from pooneh-m-patch-1 on Feb 5, 2020

Conversation

pooneh-m
Contributor

Starting from Agones v1.2, the replica field was made optional. The API is also tagged with omitempty, meaning that for JSON serialization the field is omitted if set to zero. The omitempty tag makes the API backward incompatible with Agones v1.1.

@markmandel
Member

If we make this change, how do we update a Fleet without changing the replica count?

This is particularly important for Fleets that are controlled by Autoscalers.

@agones-bot
Collaborator

Build Failed 😱

Build Id: 2d58b30d-5ff5-4312-84a8-c19e0a388b72

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@pooneh-m
Contributor Author

> If we make this change, how do we update a Fleet without changing the replica count?
>
> This is particularly important for Fleets that are controlled by Autoscalers.

Oh, thanks for reminding me. I forgot the use-case. I'll close the PR then.

@pooneh-m pooneh-m closed this Jan 30, 2020
@markmandel
Member

I'd like to reopen this for discussion, because I was thinking about this more, and I think adding omitempty might have been a mistake:

  • If you want to make a change through the Go API without the replica field, you have PATCH (which is really what it is for)
  • In the case of the autoscaler, if it changes the replicas field before you update, the K8s API will refuse to update the GameServer resource until you are on the same generation anyway
  • Having omitempty makes it super hard to use the Go library to scale a Fleet to 0 without using a Patch command.

So I'm going to reopen this, mainly as well to see if all the e2e tests pass, and to hear if anyone has additional thoughts and opinions.

Sorry it took me so long to come around on this. Sometimes it takes me a while. 😞

@markmandel markmandel reopened this Jan 30, 2020
@markmandel
Member

It might be worth noting that there is an UpdateScale interface as well for Fleets, so you could use that to scale to 0.

@aLekSer do you have thoughts on this?

@aLekSer
Collaborator

aLekSer commented Jan 30, 2020

I will test these steps again with this PR's changes: RollingUpdate shuts down all servers at once
An E2E test was added for this, TestFleetAutoScalerRollingUpdate, at https://github.com/googleforgames/agones/pull/1168/files

@agones-bot
Collaborator

Build Succeeded 👏

Build Id: d93a9a6c-364b-425c-9d69-e52966914a00

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/GoogleCloudPlatform/agones.git pull/1305/head:pr_1305 && git checkout pr_1305
  • helm install ./install/helm/agones --namespace agones-system --name agones --set agones.image.tag=1.4.0-56dc031

@markmandel
Member

> I will test these steps again with this PR's changes: RollingUpdate shuts down all servers at once
> An E2E test was added for this, TestFleetAutoScalerRollingUpdate, at https://github.com/googleforgames/agones/pull/1168/files

Thanks! Looks like e2e tests have passed.

This shouldn't affect the kubectl test you did previously, as it doesn't route through our go code - it talks directly to the Kubernetes API. That being said, let's still double check to make sure.

If the kubectl test passes for you @aLekSer , I recommend we merge this PR.

@aLekSer
Collaborator

aLekSer commented Jan 30, 2020

RollingUpdate should be working. I will follow the docs tomorrow to verify that there are no regressions; there are a couple of options for how to change a Fleet.
For now, I have tested this change with this Fleet YAML:

apiVersion: "agones.dev/v1"
kind: Fleet
metadata:
  name: fleet-example
spec:
  scheduling: Packed
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  template:
    metadata:
      labels:
        foo: ba2r
    spec:
      ports:
      - name: default
        portPolicy: Dynamic
        containerPort: 26000
      health:
        initialDelaySeconds: 30
        periodSeconds: 60
      template:
        spec:
          containers:
          - name: simple-udp
            image: gcr.io/agones-images/udp-server:0.16
---
apiVersion: "autoscaling.agones.dev/v1"
kind: FleetAutoscaler
metadata:
  name: server-autoscaler
spec:
  fleetName: fleet-example
  policy:
    type: Buffer
    buffer:
      bufferSize: 5
      minReplicas: 5
      maxReplicas: 10

If I change the label and do kubectl apply -f ./f.yaml again, RollingUpdate keeps working.

@roberthbailey roberthbailey requested review from aLekSer and removed request for EricFortin January 31, 2020 23:10
@markmandel
Member

@aLekSer gentle bump - have you had a chance to test this out yet?

@aLekSer
Collaborator

aLekSer commented Feb 4, 2020

Hello, partly yes, will update ASAP.

@aLekSer
Collaborator

aLekSer commented Feb 5, 2020

Regarding the docs, I have tested kubectl edit with this PR:

> When Fleet update contains only changes to the replicas parameter, then new GameServers will be created/deleted straight away, which means in that case maxSurge and maxUnavailable parameters for a RollingUpdate will not be used.

And conversely, if the update contains replicas plus another field change, then RollingUpdate still works.

However, when I use a Fleet with a FleetAutoscaler, we break the doc statement:

1. With `kubectl apply`: you should omit replicas parameter in a Fleet Spec before re-applying the Fleet configuration.

https://gist.github.com/aLekSer/f0da2fb8a5a43d170d0dd711bfbd6f33
I created a Fleet with maxSurge: 1 and replicas: 5, and a FleetAutoscaler with a buffer of 5. I allocated one GameServer and then updated the label; scaling from 1 to 5 happened within one second, when it should be step by step: remove one, add one:

5m41s       Normal    ScalingGameServerSet    Fleet             Scaling active GameServerSet flat-example-2rpmh from 5 to 6                                                                              
2m29s       Normal    ScalingGameServerSet    Fleet             Scaling inactive GameServerSet flat-example-2rpmh from 6 to 5                                                                            
2m26s       Normal    ScalingGameServerSet    Fleet             Scaling inactive GameServerSet flat-example-2rpmh from 5 to 4                                                                            
2m20s       Normal    ScalingGameServerSet    Fleet             Scaling inactive GameServerSet flat-example-2rpmh from 4 to 3                                                                            
2m11s       Normal    CreatingGameServerSet   Fleet             Created GameServerSet flat-example-t7bv5                                                                                                 
2m10s       Normal    ScalingGameServerSet    Fleet             Scaling inactive GameServerSet flat-example-2rpmh from 3 to 2                                                                            
2m10s       Normal    ScalingGameServerSet    Fleet             Scaling active GameServerSet flat-example-t7bv5 from 1 to 2                                                                              
2m10s       Normal    ScalingGameServerSet    Fleet             Scaling active GameServerSet flat-example-t7bv5 from 2 to 3                                                                              
2m10s       Normal    ScalingGameServerSet    Fleet             Scaling active GameServerSet flat-example-t7bv5 from 3 to 4                                                                              
2m9s        Normal    ScalingGameServerSet    Fleet             Scaling active GameServerSet flat-example-t7bv5 from 4 to 5     

@aLekSer
Collaborator

aLekSer commented Feb 5, 2020

Tested this on version 1.3.0 - the same issue applies to it, so the docs were inaccurate.
https://gist.github.com/aLekSer/f0da2fb8a5a43d170d0dd711bfbd6f33
Just as a side note: ReplicaSet does use omitempty, however it is using a pointer:

// ReplicaSetSpec is the specification of a ReplicaSet.
type ReplicaSetSpec struct {
	// Replicas is the number of desired replicas.
	// This is a pointer to distinguish between explicit zero and unspecified.
	// Defaults to 1.
	// More info: https://kubernetes.io/docs/concepts/workloads/controllers/replicationcontroller/#what-is-a-replicationcontroller
	// +optional
	Replicas *int32 `json:"replicas,omitempty" protobuf:"varint,1,opt,name=replicas"`

As well as DeploymentSpec
https://github.com/kubernetes/kubernetes/blob/7e8f31b31318104f0288350652eaca0b1c84d4fc/staging/src/k8s.io/api/apps/v1/types.go#L274

Collaborator

@aLekSer aLekSer left a comment


Overall I can approve this PR as we don't have regressions.


Mark @markmandel do we need to change Replicas to a pointer at some point?

@markmandel
Member

markmandel commented Feb 5, 2020

> I created a Fleet with maxSurge: 1 and replicas: 5, and a FleetAutoscaler with a buffer of 5. I allocated one GameServer and then updated the label; scaling from 1 to 5 happened within one second, when it should be step by step: remove one, add one.

Sounds like we should create a bug to track this separately.

> Mark @markmandel do we need to change Replicas to a pointer at some point?

Using a pointer seems like an elegant solution, in that while it does change the API, it won't actually break functionality (whereas the previous fix did break functionality, because you couldn't scale to 0).

That being said, we need to work out what the root cause of the issue is here. kubectl apply doesn't go through our go code, so there's something in our controller(s) that is causing this, and see if that change will solve the problem - otherwise there is probably no point in making the change at this stage.

@markmandel markmandel added the kind/cleanup Refactoring code, fixing up documentation, etc label Feb 5, 2020
@markmandel markmandel added this to the 1.4.0 milestone Feb 5, 2020
Member

@markmandel markmandel left a comment


Let's move forward with this, and track the other issue separately 👍

@google-oss-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aLekSer, markmandel, pooneh-m

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [markmandel,pooneh-m]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-robot

New changes are detected. LGTM label has been removed.

@markmandel markmandel merged commit b9d123d into master Feb 5, 2020
@markmandel markmandel deleted the pooneh-m-patch-1 branch February 5, 2020 17:23
@agones-bot
Collaborator

Build Succeeded 👏

Build Id: 29078eed-6068-4332-82ca-8ad11d7a2fb5

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/GoogleCloudPlatform/agones.git pull/1305/head:pr_1305 && git checkout pr_1305
  • helm install ./install/helm/agones --namespace agones-system --name agones --set agones.image.tag=1.4.0-a1df680

ilkercelikyilmaz pushed a commit to ilkercelikyilmaz/agones that referenced this pull request Oct 23, 2020
Starting from Agones v1.2 the replica field was made optional. The API is also tagged with omitempty, meaning for JSON serialization the field is omitted if set to zero. omitempty tag makes the API backward incompatible with Agones v1.1.

Co-authored-by: Mark Mandel <mark.mandel@gmail.com>
Labels
approved kind/cleanup Refactoring code, fixing up documentation, etc size/XS
5 participants