Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove health checks controller and use endpoints controller for health checks #472

Merged
merged 9 commits into from
Apr 8, 2021

Conversation

kschoche
Copy link
Contributor

@kschoche kschoche commented Apr 1, 2021

Changes proposed in this PR:

  • Remove existing healthchecks controller code, tests, and plumbing in connect-inject command.
  • Create+Update the TTL health check alongside service registration in the endpoint controller.
  • Update ACL rules for the webhook

How I've tested this PR:
Unit tests for rule changes have been added.
Unit tests have been added which assert that health checks are added on creation of the services.
Unit tests have been added which assert that health checks update their state when the service transitions from healthy/unhealthy in the endpoints controller. (This could be expanded I think)
Run consul-helm acceptance tests which confirm health checks function still.

Manually deployed the consul-k8s image to a cluster (with acls enabled). deployed an app with readiness probes and validated that modifying the k8s readiness probes to pass/fail results in the consul service updating it's health state as "critical" or "passing" and that it shows up in the UI.

How I expect reviewers to test this PR:
Code review, unit tests, manually deploy.
Reminder to reviewers that you'll need to deploy a k8s service in front of your app such as :

 cat svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: counting
spec:
  selector:
    app: bob
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 9001
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: counting
---
apiVersion: v1
kind: Pod
metadata:
  name: counting
  labels:
    app: bob
    consul: "connect-inject"
  annotations:
    "consul.hashicorp.com/connect-inject": "true"
spec:
  containers:
  - name: counting
    image: hashicorp/counting-service:0.0.2
    ports:
    - containerPort: 9001
      name: http
    readinessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      periodSeconds: 5
  serviceAccountName: counting

Checklist:

  • Tests added
  • CHANGELOG entry added (HashiCorp engineers only, community PRs should not add a changelog entry)

@kschoche kschoche added type/enhancement New feature or request theme/health-checks About Consul health checking theme/tproxy Items related to transparent proxy labels Apr 1, 2021
@kschoche kschoche requested a review from a team April 1, 2021 21:23
@kschoche kschoche self-assigned this Apr 1, 2021
@kschoche kschoche requested review from lkysow and thisisnotashwin and removed request for a team April 1, 2021 21:23
Copy link
Contributor

@thisisnotashwin thisisnotashwin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks really good!! i think this is the right approach. let me know when you would like a thorough review!! 😄

CONSUL_VERSION: 1.9.0-rc1 # Consul's OSS version to use in tests
CONSUL_ENT_VERSION: 1.9.0+ent-rc1 # Consul's enterprise version to use in tests
CONSUL_VERSION: 1.9.4 # Consul's OSS version to use in tests
CONSUL_ENT_VERSION: 1.9.4+ent # Consul's enterprise version to use in tests
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This just got pushed into master and will go away when we rebase feature-tproxy next.

@@ -208,13 +226,26 @@ func (r *EndpointsController) createServiceRegistrations(pod corev1.Pod, service
tags = append(tags, strings.Split(raw, ",")...)
}

status, _, err := getReadyStatusAndReason(&pod)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we dont use the "reason" field here because setting the Message field in the check on registration causes the Notes/Output field to get jumbled the next time we issue an update which occurs a couple ms later. It's fine and only has the effect of there being no Notes field until the UpdateTTL() is issues a couple ms later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds good. What exactly gets jumbled here though just for my own understanding?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Notes field gets set and you cannot override it with later update.
So you end up with:

 Notes:  "k8s probe failed"
 Output: "k8s probe healthy"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a really good note and I think it would be good as a comment in the code since I'm sure readers would wonder!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I'll add a comment about it!

@kschoche kschoche marked this pull request as ready for review April 5, 2021 17:19
# This is the 1st commit message:

fix

# This is the commit message #2:

test

# This is the commit message #3:

fix ci

# This is the commit message #4:

fix ci

# This is the commit message #5:

remove namespaces because oss agent response doesnt provide it
Copy link
Contributor

@thisisnotashwin thisisnotashwin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicely done!!

connect-inject/endpoints_controller.go Outdated Show resolved Hide resolved
@@ -208,13 +226,26 @@ func (r *EndpointsController) createServiceRegistrations(pod corev1.Pod, service
tags = append(tags, strings.Split(raw, ",")...)
}

status, _, err := getReadyStatusAndReason(&pod)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds good. What exactly gets jumbled here though just for my own understanding?

Copy link
Member

@lkysow lkysow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking great. I tested it out and it worked as expected!

Couple comments on things. Also I noticed that we're not testing what happens when the pod is in pending phase:

image

Not sure if that makes sense to test in the endpoints controller use-case but we did have that test before in the health check controller called "inject init container has completed but containers not yet running",.

connect-inject/endpoints_controller.go Outdated Show resolved Hide resolved
connect-inject/endpoints_controller.go Outdated Show resolved Hide resolved
connect-inject/endpoints_controller.go Outdated Show resolved Hide resolved
@@ -208,13 +226,26 @@ func (r *EndpointsController) createServiceRegistrations(pod corev1.Pod, service
tags = append(tags, strings.Split(raw, ",")...)
}

status, _, err := getReadyStatusAndReason(&pod)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a really good note and I think it would be good as a comment in the code since I'm sure readers would wonder!

subcommand/inject-connect/command.go Show resolved Hide resolved
connect-inject/endpoints_controller_test.go Outdated Show resolved Hide resolved
connect-inject/endpoints_controller_test.go Outdated Show resolved Hide resolved
connect-inject/endpoints_controller_test.go Outdated Show resolved Hide resolved
subcommand/server-acl-init/command.go Outdated Show resolved Hide resolved
subcommand/server-acl-init/command_test.go Show resolved Hide resolved
@kschoche
Copy link
Contributor Author

kschoche commented Apr 7, 2021

This is looking great. I tested it out and it worked as expected!

Couple comments on things. Also I noticed that we're not testing what happens when the pod is in pending phase:

image

Not sure if that makes sense to test in the endpoints controller use-case but we did have that test before in the health check controller called "inject init container has completed but containers not yet running",.

Excellent comments and questions @lkysow, thanks for the review!
I've addressed I think all of the comments except the question of whether or not to test the PodPending codepath. Now that I think about it I need to do a bit of research to see whether or not it's even possible to be in this case because I do not know if a PodPending pod would ever be in an endpoint object. Let me do a little research and come back to that before we proceed!

Copy link
Member

@lkysow lkysow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🦊

connect-inject/endpoints_controller_test.go Outdated Show resolved Hide resolved
},
}
if inject {
pod.Labels[labelInject] = injected
pod.Labels[annotationStatus] = injected
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize now after seeing this that annotationStatus could probably have a better name since it's used for both annotation and label now. Not a blocker though but something to address at some point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I was going through all of the places to change this it occurred to me its pretty wonky.
The label const doesn't belong in annotations.go either :(

I'll try to clean it up before GA if we can come to an agreement on where it belongs?

subcommand/inject-connect/command_test.go Show resolved Hide resolved
subcommand/server-acl-init/command_test.go Show resolved Hide resolved
@kschoche kschoche merged commit 2f144ec into feature-tproxy Apr 8, 2021
@kschoche kschoche deleted the tproxy-health-checks-controller branch April 8, 2021 13:32
ishustava pushed a commit that referenced this pull request Apr 14, 2021
…th checks (#472)

remove health checks controller and use endpoints controller for health checks.
thisisnotashwin pushed a commit that referenced this pull request Apr 16, 2021
…th checks (#472)

remove health checks controller and use endpoints controller for health checks.
thisisnotashwin pushed a commit that referenced this pull request Apr 16, 2021
…th checks (#472)

remove health checks controller and use endpoints controller for health checks.
thisisnotashwin pushed a commit that referenced this pull request Apr 22, 2021
…th checks (#472)

remove health checks controller and use endpoints controller for health checks.
thisisnotashwin pushed a commit that referenced this pull request Apr 28, 2021
…th checks (#472)

remove health checks controller and use endpoints controller for health checks.
thisisnotashwin pushed a commit that referenced this pull request May 4, 2021
…th checks (#472)

remove health checks controller and use endpoints controller for health checks.
thisisnotashwin pushed a commit that referenced this pull request May 4, 2021
…th checks (#472)

remove health checks controller and use endpoints controller for health checks.
thisisnotashwin pushed a commit that referenced this pull request May 6, 2021
…th checks (#472)

remove health checks controller and use endpoints controller for health checks.
thisisnotashwin added a commit that referenced this pull request May 6, 2021
thisisnotashwin added a commit that referenced this pull request May 6, 2021
ndhanushkodi pushed a commit to ndhanushkodi/consul-k8s that referenced this pull request Jul 9, 2021
Update values docs to match website
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
theme/health-checks About Consul health checking theme/tproxy Items related to transparent proxy type/enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants