Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve logging of predicates #10818

Closed
2 tasks done
sbueringer opened this issue Jul 1, 2024 · 21 comments
Closed
2 tasks done

Improve logging of predicates #10818

sbueringer opened this issue Jul 1, 2024 · 21 comments
Assignees
Labels
area/logging Issues or PRs related to logging help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/feature Categorizes issue or PR as related to a new feature. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@sbueringer
Copy link
Member

sbueringer commented Jul 1, 2024

The logger we use in predicates today only contain an absolute minimum amount of k/v pairs, e.g.
image

Some notes:

  • it's impossible to figure out in which controller this predicate is used
  • kind (the 6th value) is just an empty string

I think we have to improve our predicate logs, otherwise they are not really useful.

(I noticed this when trying to debug predicates of a controller, which is very very hard if there is no way to determine which log belongs to which controller)

In general we should compare the logs of our predicates with the logs of a regular controller and figure out which additional k/v pairs we need + we should make sure kind is logged correctly

Tasks:

  • Add controller k/v pair to all predicates: 🌱 Log controller name in predicates #10959
  • Add only one k/v pair with <kind>: <namespace>/<name> (currently we often have an empty kind string + namespace + name k/v pairs)
    • This requires changes to the function signature of our predicate funcs, e.g. one option is to pass in a scheme so we can get the correct kind from the object. An alternative would be to pass in the kind directly. Both need changes to the func signatures.
@sbueringer sbueringer added the area/logging Issues or PRs related to logging label Jul 1, 2024
@k8s-ci-robot k8s-ci-robot added needs-priority Indicates an issue lacks a `priority/foo` label and requires one. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 1, 2024
@sbueringer
Copy link
Member Author

cc @fabriziopandini @chrischdi (just fyi, I think it's not very important, but it would be good for debuggability)

@fabriziopandini fabriziopandini added the kind/feature Categorizes issue or PR as related to a new feature. label Jul 3, 2024
@k8s-ci-robot k8s-ci-robot removed the needs-kind Indicates a PR lacks a `kind/foo` label and requires one. label Jul 3, 2024
@fabriziopandini fabriziopandini added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Jul 3, 2024
@k8s-ci-robot k8s-ci-robot removed the needs-priority Indicates an issue lacks a `priority/foo` label and requires one. label Jul 3, 2024
@fabriziopandini
Copy link
Member

Let's try to solve for one controller first, then we can try to get help in replicating the solution for the remaining controllers
/triage accepted
/help

@k8s-ci-robot
Copy link
Contributor

@fabriziopandini:
This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

  • Why are we solving this issue?
  • To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
  • Does this issue have zero to low barrier of entry?
  • How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

Let's try to solve for one controller first, then we can try to get help in replicating the solution for the remaining controllers
/triage accepted
/help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 3, 2024
@Karthik-K-N
Copy link
Contributor

I was debugging something related to pausing resource for CAPIBM and thinking of adding additional values in predicate logging for debugging locally and in the meantime came across this issue,
I would be happy to work on this if no one is currently working on it.

@sbueringer
Copy link
Member Author

Feel free! I would recommend implementing it for one controller first, so we can get consensus on how to implement it (before implementing it for everything)

@Karthik-K-N
Copy link
Contributor

/assign

@Karthik-K-N
Copy link
Contributor

Karthik-K-N commented Jul 29, 2024

I just tried to debug on kind value being empty string, Seems like on update events on few occasion the value is empty

{"ts":1722261991766.5825,"caller":"predicates/generic_predicates.go:194","msg":"Resource is not paused, will attempt to map resource","controller":"cluster_controller","kind":"Cluster","predicate":"ResourceNotPaused","eventType":"update","namespace":"default","":"development-16409-md-0-pmcc8-9wjbg-kkx8f","v":6}

{"ts":1722261389067.7864,"caller":"predicates/generic_predicates.go:194","msg":"Resource is not paused, will attempt to map resource","controller":"cluster_controller","kind":"Cluster","predicate":"ResourceNotPaused","eventType":"update","namespace":"default","cluster":"development-16409","v":6}


Even this is not consistent, Seen empty most of the time.

@sbueringer
Copy link
Member Author

sbueringer commented Sep 11, 2024

@Karthik-K-N I just talked to Christian and Fabrizio and we have an idea on how to solve the issue with kind. The problem in general is that kind is usually never set for typed objects, only for unstructured. We have to use apiutil.GVKForObject to properly retrieve the kind

Basically we would modify all predicates to take an additional scheme as parameter, e.g.

ResourceHasFilterLabel(logger logr.Logger, labelValue string)

would become

ResourceHasFilterLabel(scheme *runtime.Scheme, logger logr.Logger, labelValue string)

We would call it like this in our setup funcs

predicates.ResourceHasFilterLabel(mgr.GetScheme(), predicateLog, r.WatchFilterValue)

Inside of the predicate we would pass the scheme to where we have access to the object and then do the following:

func processIfLabelMatch(scheme *runtime.Scheme, log logr.Logger, obj client.Object, labelValue string) bool {
	// Return early if no labelValue was set.
	if labelValue == "" {
		return true
	}

	if gvk, err := apiutil.GVKForObject(obj, scheme); err == nil {
		log = log.WithValues(gvk.Kind, klog.KObj(obj))
	}
	if labels.HasWatchLabel(obj, labelValue) {
		log.V(6).Info("Resource matches label, will attempt to map resource")
		return true
	}
	log.V(4).Info("Resource does not match label, will not attempt to map resource")
	return false
}

In general I would like all of our predicate logs to have at least the following k/v pairs:

  • controller: <controller-name>
  • predicate: <predicate-name>
  • eventType: create|update|delete|generic
  • <kind>: <namespace>/<name> (klog.KObj takes care of that)

I would like to at least drop the "namespace" k/v pair as it is redundant.

If you're interested in implementing this. I would suggest let's start with one predicate first (e.g. ResourceNotPaused + also affects at least ResourceNotPausedAndHasFilterLabel or maybe another one not sure. Ideally let's pick a predicate that doesn't affect a lot of other predicates :))

@Karthik-K-N
Copy link
Contributor

Sure @sbueringer , Thanks for the detailed explanation, I will submit a PR with change for one predicate, My only earlier concern was predicate signature change.

@sbueringer
Copy link
Member Author

sbueringer commented Sep 11, 2024

My only #10959 (comment) concern was predicate signature change.

Yup, same. Talked to Fabrizio and Christian. We would be fine with it. It's an easy change to adopt to, as folks can always just pass in mgr.GetScheme in the places where Predicates are used

(sorry I think I missed your comment on the PR somehow :))

@Karthik-K-N
Copy link
Contributor

Sure, Thanks for confirmation. I will go ahead with it. Will submit PR soon.

@Karthik-K-N
Copy link
Contributor

@sbueringer so should I make the similar changes for rest of the predicates or should we wait?

@sbueringer
Copy link
Member Author

Feel free to go ahead.

@sbueringer
Copy link
Member Author

@Karthik-K-N Please also follow-up on: #11188 (comment)

@Karthik-K-N
Copy link
Contributor

@Karthik-K-N Please also follow-up on: #11188 (comment)

Sure will do that as well. Thank you.

@sbueringer
Copy link
Member Author

@Karthik-K-N Please also follow-up on: #11188 (comment)

This should be the last remaining task of this issue I think

@Karthik-K-N
Copy link
Contributor

Hi, Yes, I am about to start on it, I will submit a PR in couple of days. Apologies for the delay.

@sbueringer
Copy link
Member Author

No worries and no rush :) Just wanted to let folks know that this is the last part missing and this issue is otherwise done :)

Thx for working on it!

@Karthik-K-N
Copy link
Contributor

@Karthik-K-N Please also follow-up on: #11188 (comment)

This should be the last remaining task of this issue I think

This is completed via PR #11283

@sbueringer Can this issue be closed or is anything pending?

@sbueringer
Copy link
Member Author

All good. Thank you very much!!

/close

@k8s-ci-robot
Copy link
Contributor

@sbueringer: Closing this issue.

In response to this:

All good. Thank you very much!!

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/logging Issues or PRs related to logging help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/feature Categorizes issue or PR as related to a new feature. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

No branches or pull requests

4 participants