GCE health check does not pick up changes to pod readinessProbe #582

Closed
ConradIrwin opened this issue Apr 11, 2017 · 17 comments
@ConradIrwin

Importing this issue from kubernetes/kubernetes#43773 as requested.

Kubernetes version (use kubectl version): 1.4.9

Environment:

  • Cloud provider or hardware configuration: GKE
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

What happened:

I created an Ingress handled by the GCE controller:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: admin-proxy
  labels:
    name: admin-proxy
  annotations:
    kubernetes.io/ingress.allow-http: "false"
spec:
  tls:
    - secretName: ingress-admin-proxy
  backend:
    serviceName: admin-proxy
    servicePort: 80

This set up the health check on the Google Cloud backend service to call "/", as documented.

Unfortunately my service doesn't return 200 on "/", so I added a readinessProbe to the pod as suggested by the documentation.
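
For context, the probe I added looked roughly like the sketch below. This is illustrative only: the /healthz path, image, and port are placeholders for whatever the admin-proxy pod actually serves.

```yaml
# Sketch of the deployment's pod template with a readinessProbe.
# The GCE ingress controller is supposed to copy the probe's path
# into the backend service's health check.
spec:
  containers:
    - name: admin-proxy
      image: example/admin-proxy:latest   # placeholder image
      ports:
        - containerPort: 80
      readinessProbe:
        httpGet:
          path: /healthz                  # example path; returns 200 when ready
          port: 80
        periodSeconds: 10
        timeoutSeconds: 5
        failureThreshold: 3
```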

What you expected to happen:

I expected the health check to be automatically updated.

Instead I had to delete the ingress and re-create it in order for the health-check to update.

How to reproduce it (as minimally and precisely as possible):

  1. Create a deployment with no readiness probe.
  2. Create an ingress pointing to the pods created by that deployment.
  3. Add a readiness probe to the deployment.

Anything else we need to know:

@ConradIrwin
Author

Comment by @nicksardo:

There's actually a TODO for deciding what to do in this case: https://github.com/kubernetes/ingress/blob/6ed6475e87588494dd29ba5d33bac167ae8acfdd/controllers/gce/healthchecks/healthchecks.go#L71 The controller doesn't want to overwrite handmade changes. We could probably update the settings if they still have the default values.

@porridge
Member

porridge commented Apr 11, 2017 via email

@nicksardo
Contributor

I agree that it's reasonable and would be an easy solution. However, I worry about unsuspecting users who upgrade their master and find their load balancer throwing 502s. Would creating an event on health check update be sufficient? Thoughts from other users?

@ConradIrwin
Author

Is there any way that the ingress controller can track whether the value was set automatically?

@nicksardo
Contributor

The controller does not have persistent storage. One option would be to convert the health check settings to JSON and store the state in the HealthCheck's optional description field for tracking diffs, but this seems very ugly.
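
As a rough illustration of that idea (the field names here are invented for the sketch, not an actual format the controller uses), the controller could serialize whatever it last wrote into the health check's description field and diff against it on the next sync:

```json
{
  "managed-by": "gce-ingress-controller",
  "source": "readinessProbe",
  "requestPath": "/healthz",
  "checkIntervalSec": 10,
  "timeoutSec": 5,
  "unhealthyThreshold": 3
}
```

If the live health check matches this stored state, any difference from the current readinessProbe must be a controller-side change and is safe to apply; if it doesn't match, a human edited it by hand.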

@nicksardo
Contributor

cc @ibawt

@porridge
Member

porridge commented Apr 11, 2017 via email

@nicksardo
Contributor

If the controller sees a health check whose settings differ from the default/readinessProbe, it could update the check and then create a K8s event for each parent Ingress. If a readiness probe exists, the event could say, "Health check for port 3XXXX updated to readinessProbe settings". If no probe exists, "Health check for port 3XXXX updated to default settings (see ingress docs for defining settings via readinessProbe)".
There's already an event being created at HC creation, but it doesn't seem very useful to me: https://github.com/kubernetes/ingress/blob/987540f8f61f86d07314cd776fe32e150724b48c/controllers/gce/controller/utils.go#L603

@ConradIrwin
Author

ConradIrwin commented Apr 11, 2017 via email

@kilianc

kilianc commented May 14, 2017

Is there any workaround to make it pick up the changes, even manually?

@kilianc

kilianc commented May 14, 2017

To answer my own question, and for everybody landing here with the same problem: rotate the name of the service in your rule or backend, and the controller will pick up the path correctly. I assume this could be done with zero downtime by creating a new rule and then deleting the old one.
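
A sketch of that workaround, assuming the original setup from the issue description (the -v2 name is just an example): duplicate the Service under a new name with the same selector, then point the Ingress backend at it so the controller builds a fresh backend service and health check.

```yaml
# New Service with a rotated name; same pod selector as before.
apiVersion: v1
kind: Service
metadata:
  name: admin-proxy-v2        # rotated name forces a new backend + health check
spec:
  type: NodePort
  selector:
    name: admin-proxy
  ports:
    - port: 80
      targetPort: 80
---
# Point the existing Ingress at the renamed Service.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: admin-proxy
spec:
  backend:
    serviceName: admin-proxy-v2
    servicePort: 80
```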

@bviolier

bviolier commented Jun 8, 2017

We actually encountered a related issue yesterday with readinessProbes and health checks on GCE. A while ago we had some difficulty setting up the health check correctly through Kubernetes, and since changes aren't picked up, we decided to update the health check manually on the GCE side; everything worked.
Yesterday we upgraded our master to 1.6.4, and suddenly all manually set health-check settings were reset to the defaults, presumably because Kubernetes (re)set them.

But according to this issue, the controller should never update health checks. I cannot re-trigger an update now, so it doesn't seem that this functionality has been added; yet the master upgrade definitely did trigger it. Is that expected behavior?

@bviolier

bviolier commented Jun 8, 2017

Another gotcha when using the provided example, https://github.com/kubernetes/ingress/blob/6ed6475e87588494dd29ba5d33bac167ae8acfdd/controllers/gce/examples/health_checks/health_check_app.yaml: if you remove ports.containerPort, the GCE ingress health check falls back to the defaults and can't process the readinessProbe. This might be expected behavior, but at least for us it wasn't.

All of our reset health checks were missing a ports.containerPort.
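
For anyone hitting the same thing, the controller appears to need the probe's port declared as a containerPort in order to match them up. A sketch of a container spec it can process (names, image, path, and port are examples):

```yaml
containers:
  - name: app
    image: example/app:latest    # placeholder
    ports:
      - containerPort: 8080      # must be declared, or defaults are used
    readinessProbe:
      httpGet:
        path: /healthz           # example path
        port: 8080               # matches the declared containerPort
```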

@nicksardo
Contributor

@bviolier

Regarding your last comment, containerPort is a required attribute for defining a port of a container. How would you expect to serve network traffic without defining it?

Regarding your loss of manual health check settings: the latest version of the controller creates a newer type of health check (compute.HealthCheck instead of the legacy compute.HTTPHealthCheck). This change was made to make HTTP and HTTPS health checks more manageable by the controller and to make future features of the load balancer easier to consume.

What settings are you trying to override with the readiness probe? Just the path?

@bviolier

bviolier commented Jun 8, 2017

@nicksardo Thanks for the information!

I was not aware that containerPort was a required attribute. But the pods/services also worked fine without it. The health check was also working (although only after manually editing the path), so traffic could reach the pods through the port specified in the readinessProbe, even without the containerPort.

Makes sense that it created a new type of health check; thanks for the explanation. I do think more people who manually edited their health checks will run into this issue, but then again, manually editing health checks is also not the best thing to do ;-)

The most important change we make via the readinessProbe is indeed the path, but occasionally we also change the timeout, interval, and failureThreshold.

@nicksardo
Contributor

@bviolier
In case you aren't aware: if your service is using externalTraffic=Global (the default), the other settings aren't very useful. Traffic from the load balancer hits any of the nodes, and kube-proxy may then route it to a pod on any other node. This makes the load balancer's health checking close to useless, as you won't get consistent data for a given node. readinessProbes/livenessProbes are the way to go for pod health.
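
For completeness, in clusters of that era the setting being referred to was a beta annotation on the Service (it later became spec.externalTrafficPolicy); this is a sketch, so check the docs for your exact version:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: admin-proxy
  annotations:
    # "OnlyLocal" keeps traffic on the node that received it, so the
    # load balancer's per-node health checks become meaningful; the
    # default "Global" lets kube-proxy forward to pods on other nodes.
    service.beta.kubernetes.io/external-traffic: OnlyLocal
spec:
  type: NodePort
  selector:
    name: admin-proxy
  ports:
    - port: 80
      targetPort: 80
```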

@bowei
Member

bowei commented Oct 11, 2017

This issue was moved to kubernetes/ingress-gce#39
