GameServer remains "STATE:Creating" if not create serviceaccount #1370

Closed
suecideTech opened this issue Feb 27, 2020 · 9 comments · Fixed by #1400
Labels
area/operations Installation, updating, metrics etc area/user-experience Pertaining to developers trying to use Agones, e.g. SDK, installation, etc kind/feature New features for Agones
Milestone

Comments

@suecideTech
Contributor

What happened:
I created a GameServer in a namespace where the serviceaccount for agones-sdk had not been created (a mistake on my part).
As a result, the GameServer remained in STATE: Creating, and I could not work out the cause even from the kubectl describe output.

$ kubectl describe gameserver ...

Status:
  Address:
  Node Name:
  Ports:           <nil>
  Reserved Until:  <nil>
  State:           Creating
Events:
  Type    Reason          Age   From                   Message
  ----    ------          ----  ----                   -------
  Normal  PortAllocation  25s   gameserver-controller  Port allocated

What you expected to happen:
I would like a message such as "error looking up serviceaccount for agones-sdk" to be shown in the events of kubectl describe, and the STATE to become Error.

How to reproduce it (as minimally and precisely as possible):
Create a GameServer without creating a serviceaccount for agones-sdk.

Anything else we need to know?:
I couldn't find the cause without looking at the agones-controller log.

{"error":"error creating Pod for GameServer simple-udp-mrlql: pods \"simple-udp-mrlql\" is forbidden: error looking up service account test/agones-sdk: serviceaccount \"agones-sdk\" not found","gsKey":"test/simple-udp-mrlql","message":"","queue":"agones.dev.GameServerControllerCreation","severity":"error","source":"*gameservers.Controller","subqueue":"creation","time":"2020-02-27T14:25:43.406369349Z"}
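For anyone searching the controller log for the same symptom, here is a minimal sketch of decoding a log line like the one above and checking whether it points at a missing serviceaccount. It uses only the Go standard library. The struct, the function names, and the string heuristic are hypothetical: they match the wording seen in this issue and nothing more, and the log format is not a stable API.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// controllerLogEntry mirrors only some of the fields visible in the log line
// quoted in this issue. It is not a type from Agones.
type controllerLogEntry struct {
	Error    string `json:"error"`
	GSKey    string `json:"gsKey"`
	Queue    string `json:"queue"`
	Severity string `json:"severity"`
}

// parseLogLine decodes a single JSON-formatted controller log line.
func parseLogLine(line string) (controllerLogEntry, error) {
	var entry controllerLogEntry
	err := json.Unmarshal([]byte(line), &entry)
	return entry, err
}

// looksLikeMissingServiceAccount is a hypothetical heuristic that only matches
// the error wording seen in this issue.
func looksLikeMissingServiceAccount(e controllerLogEntry) bool {
	return e.Severity == "error" &&
		strings.Contains(e.Error, "is forbidden") &&
		strings.Contains(e.Error, "serviceaccount") &&
		strings.Contains(e.Error, "not found")
}

func main() {
	// The log line from this issue, kept verbatim in a raw string.
	line := `{"error":"error creating Pod for GameServer simple-udp-mrlql: pods \"simple-udp-mrlql\" is forbidden: error looking up service account test/agones-sdk: serviceaccount \"agones-sdk\" not found","gsKey":"test/simple-udp-mrlql","message":"","queue":"agones.dev.GameServerControllerCreation","severity":"error","source":"*gameservers.Controller","subqueue":"creation","time":"2020-02-27T14:25:43.406369349Z"}`

	entry, err := parseLogLine(line)
	if err != nil {
		fmt.Println("could not parse log line:", err)
		return
	}
	fmt.Println(entry.GSKey, looksLikeMissingServiceAccount(entry))
}
```

This is only a debugging aid for reading the log. It does not change how the GameServer itself reports the problem.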

Environment:

  • Agones version: 1.3.0
  • Kubernetes version (use kubectl version): v1.13.12-gke.30 (v1.13.5)
  • Cloud provider or hardware configuration: GKE
  • Install method (yaml/helm): yaml
  • Troubleshooting guide log(s): See above
  • Others: N/A

I would be grateful if you could fix this.

@suecideTech suecideTech added the kind/bug These are bugs. label Feb 27, 2020
@markmandel
Member

This is documented here:
https://agones.dev/site/docs/installation/install-agones/helm/#namespaces

So this is working as intended.

@markmandel markmandel added question I have a question! and removed kind/bug These are bugs. labels Mar 4, 2020
@suecideTech
Contributor Author

suecideTech commented Mar 5, 2020

@markmandel Thank you for the information!
Incidentally, I think it would be better to fix it as below.

$ git diff
diff --git a/pkg/gameservers/controller.go b/pkg/gameservers/controller.go
index 0a185116..ee1a7cd4 100644
--- a/pkg/gameservers/controller.go
+++ b/pkg/gameservers/controller.go
@@ -571,6 +571,7 @@ func (c *Controller) createGameServerPod(gs *agonesv1.GameServer) (*agonesv1.Gam
                        gs, err = c.moveToErrorState(gs, err.Error())
                        return gs, err
                }
+               gs, err = c.moveToErrorState(gs, err.Error())
                return gs, errors.Wrapf(err, "error creating Pod for GameServer %s", gs.Name)
        }
        c.recorder.Event(gs, corev1.EventTypeNormal, string(gs.Status.State),

@markmandel
Member

Unfortunately we can't blanket move to Error state here, because there are very valid temporary reasons why this could fail (e.g. the master is temporarily down) that will eventually rectify themselves.

If we can find a way to capture that this is a failure due to RBAC issues though - I agree this would be a nice addition to report it in this way 👍
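To illustrate the concern about temporary failures, here is a rough standard-library sketch. It assumes the controller retries a GameServer whose sync returned an error. `flakyCreator`, `errUnavailable`, and `syncWithRetry` are all hypothetical stand-ins, not Agones code, and a real controller would requeue with backoff rather than loop.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnavailable stands in for a temporary failure, such as the master being
// down for a short time.
var errUnavailable = errors.New("apiserver temporarily unavailable")

// flakyCreator is a hypothetical stand-in for Pod creation: it fails a fixed
// number of times with a temporary error and then succeeds.
type flakyCreator struct {
	failuresLeft int
}

func (f *flakyCreator) createPod() error {
	if f.failuresLeft > 0 {
		f.failuresLeft--
		return errUnavailable
	}
	return nil
}

// syncWithRetry calls create until it succeeds or maxAttempts is reached, and
// reports how many attempts were made.
func syncWithRetry(create func() error, maxAttempts int) (int, error) {
	if maxAttempts < 1 {
		return 0, errors.New("maxAttempts must be at least 1")
	}
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = create(); err == nil {
			return attempt, nil
		}
	}
	return maxAttempts, err
}

func main() {
	creator := &flakyCreator{failuresLeft: 2}
	attempts, err := syncWithRetry(creator.createPod, 5)
	fmt.Println(attempts, err)
}
```

If the first temporary failure had moved the GameServer to Error, the later successful attempt would never have happened. That is why only failures known to be permanent should take that path.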

@markmandel markmandel reopened this Mar 5, 2020
@suecideTech
Contributor Author

@markmandel
I understand now: the case where the master is temporarily down has to be taken into account.

I confirmed that "Forbidden" is returned when the failure is due to RBAC issues.
https://godoc.org/k8s.io/apimachinery/pkg/api/errors#IsForbidden

The error handling code would be as below.

diff --git a/pkg/gameservers/controller.go b/pkg/gameservers/controller.go
index a88d389b..455cc6a6 100644
--- a/pkg/gameservers/controller.go
+++ b/pkg/gameservers/controller.go
@@ -570,6 +570,10 @@ func (c *Controller) createGameServerPod(gs *agonesv1.GameServer) (*agonesv1.GameServer, error) {
                if k8serrors.IsInvalid(err) {
                        c.loggerForGameServer(gs).WithField("pod", pod).Errorf("Pod created is invalid")
                        gs, err = c.moveToErrorState(gs, err.Error())
                        return gs, err
+               } else if k8serrors.IsForbidden(err) {
+                       c.loggerForGameServer(gs).WithField("pod", pod).Errorf("Pod created is forbidden")
+                       gs, err = c.moveToErrorState(gs, err.Error())
+                       return gs, err
                }
                return gs, errors.Wrapf(err, "error creating Pod for GameServer %s", gs.Name)
        }
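The idea behind the diff above can be modelled without any Kubernetes dependency: move to Error only for failure reasons that will not fix themselves, and keep retrying everything else. The sketch below is a simplified model under stated assumptions. `statusError`, `reasonFor`, and `shouldMoveToError` are hypothetical names, and the real `k8serrors.IsInvalid` and `k8serrors.IsForbidden` helpers in k8s.io/apimachinery have their own implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// statusError is a simplified, hypothetical stand-in for an API error that
// carries a machine-readable reason next to its message.
type statusError struct {
	reason  string
	message string
}

func (e *statusError) Error() string { return e.message }

// Reason values used by this sketch only.
const (
	reasonInvalid            = "Invalid"
	reasonForbidden          = "Forbidden"
	reasonServiceUnavailable = "ServiceUnavailable"
)

// reasonFor extracts the reason from err, looking through wrapped errors.
// It returns "" when err is nil or carries no statusError.
func reasonFor(err error) string {
	var se *statusError
	if errors.As(err, &se) {
		return se.reason
	}
	return ""
}

// shouldMoveToError reports whether a failure is treated as permanent.
// Invalid and Forbidden are permanent in this sketch; anything else is left
// to be retried.
func shouldMoveToError(err error) bool {
	switch reasonFor(err) {
	case reasonInvalid, reasonForbidden:
		return true
	default:
		return false
	}
}

func main() {
	forbidden := &statusError{reason: reasonForbidden, message: `pods "simple-udp" is forbidden`}
	wrapped := fmt.Errorf("error creating Pod for GameServer: %w", forbidden)
	unavailable := &statusError{reason: reasonServiceUnavailable, message: "service unavailable"}

	fmt.Println(shouldMoveToError(wrapped), shouldMoveToError(unavailable))
}
```

Checking a typed reason rather than matching on the error message keeps the decision independent of the exact wording, which is also what the `IsForbidden` check in the diff relies on.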

When I run "describe gameserver" with this change, the RBAC problem shows up in the Events.

Events:
  Type     Reason          Age   From                    Message
  ----     ------          ----  ----                    -------
  Normal   PortAllocation  13s   gameserver-controller   Port allocated
  Warning  Error           13s   gameserver-controller   pods "simple-udp-tndhc" is forbidden: error looking up service account default/agones-sdk: serviceaccount "agones-sdk" not found
  Warning  Unhealthy       13s   missing-pod-controller  Pod is missing

For reference, this is the output when an Invalid error is handled.

Events:
  Type     Reason          Age   From                    Message
  ----     ------          ----  ----                    -------
  Normal   PortAllocation  14s   gameserver-controller   Port allocated
  Warning  Error           14s   gameserver-controller   Pod "simple-udp-nbqs7" is invalid: spec.containers[1].name: Required value
  Warning  Unhealthy       14s   missing-pod-controller  Pod is missing

@markmandel
Member

Nice! That looks great!

If you have time, would love it if you submitted a PR with accompanying Unit and e2e tests!

@suecideTech
Contributor Author

@markmandel
Thank you for checking.
I will try it.
I have never contributed to OSS or written a PR before, so it will take some time, but I will try.

@markmandel
Member

No worries - also feel free to hop on Slack and join the #development channel - happy to talk through some of the details and give you a hand!

@suecideTech
Contributor Author

@markmandel
I submitted a PR! Thank you for your support!

Should I close this issue?
(I am wondering about the "Label" and "Milestone" of this issue.)

@markmandel markmandel added area/operations Installation, updating, metrics etc area/user-experience Pertaining to developers trying to use Agones, e.g. SDK, installation, etc kind/feature New features for Agones and removed question I have a question! labels Mar 11, 2020
@markmandel markmandel added this to the 1.5.0 milestone Mar 11, 2020
@markmandel
Member

Updated labels, and closing. Thank you! 👍
