NETOBSERV-1358: splitting controllers #503
Conversation
@jotak: This pull request references NETOBSERV-1358, which is a valid Jira issue. Warning: the referenced Jira issue has an invalid target version for the target branch this PR targets: the story was expected to target the "4.15.0" version, but no target version was set.
pkg/manager/status/status_manager.go (outdated diff excerpt)

    type Manager struct {
        sync.Mutex

        statuses map[ComponentName]ComponentStatus
Can you use sync.Map here?
Sounds good... sometimes I don't like using sync.Map, as the API is less user-friendly and it doesn't always offer better performance; but I just tried here and it plays nicely.
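For illustration, a minimal sketch of what the sync.Map variant could look like; the type names mirror the snippet above, but the fields of ComponentStatus and the helper methods are assumptions for this sketch, not the PR's actual code:

    import "sync"

    type ComponentName string

    // ComponentStatus is assumed to carry a state plus reason/message details.
    type ComponentStatus struct {
        Status  string
        Reason  string
        Message string
    }

    // Manager stores per-component statuses in a sync.Map, so no explicit
    // mutex is needed when several controllers update it concurrently.
    type Manager struct {
        statuses sync.Map // ComponentName -> ComponentStatus
    }

    func (m *Manager) setStatus(name ComponentName, s ComponentStatus) {
        m.statuses.Store(name, s)
    }

    func (m *Manager) getStatus(name ComponentName) (ComponentStatus, bool) {
        v, ok := m.statuses.Load(name)
        if !ok {
            return ComponentStatus{}, false
        }
        return v.(ComponentStatus), true
    }

The trade-off mentioned above still shows here: sync.Map values come back as any, hence the type assertion on load.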
Just a few comments.
If some comments are related to code you moved but did not change, please ignore them.
(diff excerpt)

    // Check namespace changed
    if ns != previousNamespace {
        if err := r.handleNamespaceChanged(ctx, previousNamespace, ns, desired, &flpReconciler, &cpReconciler); err != nil {
            return ctrl.Result{}, r.failure(ctx, conditions.CannotCreateNamespace(err), desired)
    ...
    if previousNamespace != "" && r.mgr.HasConsolePlugin() {
Not related to this PR.
Handling namespace changes creates some complexity that is not necessary IMO.
Apparently it is possible to make a field immutable with the kubebuilder pattern. I think it could be a nice compromise.
What do you think?
I agree that the namespace thing is a pain. We can make it immutable, yes. We then lose the ability to reconfigure it on the fly, but I guess it's quite a rare operation anyway, and the workaround would just be to manually delete then re-install the FlowCollector in the desired namespace. That would interrupt flow collection, but it's also interrupted when you modify the namespace, so... yeah, I think I'm good with that proposal :-)
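For reference, a minimal sketch of the kubebuilder pattern mentioned above, using a CEL validation marker to make a field immutable; the field, comment and message below are illustrative, not taken from the actual FlowCollector API (and CEL validation rules require a reasonably recent Kubernetes API server):

    type FlowCollectorSpec struct {
        // Namespace where the NetObserv components are deployed.
        // +kubebuilder:validation:XValidation:rule="self == oldSelf",message="namespace is immutable"
        // +optional
        Namespace string `json:"namespace,omitempty"`
    }

With such a rule the API server rejects any update that changes the field, so the namespace would only be settable at creation time, matching the "delete and re-install" workaround discussed above.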
(diff excerpt)

    serviceAccount: cmn.Managed.NewServiceAccount(name),
    configMap: cmn.Managed.NewConfigMap(configMapName(ConfKafkaIngester)),
    roleBinding: cmn.Managed.NewCRB(RoleBindingName(ConfKafkaIngester)),
    }
The new Managed API makes this part more readable; this is nice.
yeah I don't know why I didn't do that earlier :)
LGTM!
/hold for premerge testing
/ok-to-test
New images:
They will expire after two weeks. To deploy this build:

    # Direct deployment, from operator repo
    IMAGE=quay.io/netobserv/network-observability-operator:8718484 make deploy

    # Or using operator-sdk
    operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-8718484

Or as a Catalog Source:

    apiVersion: operators.coreos.com/v1alpha1
    kind: CatalogSource
    metadata:
      name: netobserv-dev
      namespace: openshift-marketplace
    spec:
      sourceType: grpc
      image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-8718484
      displayName: NetObserv development catalog
      publisher: Me
      updateStrategy:
        registryPoll:
          interval: 1m
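For instance, assuming the Catalog Source manifest above is saved locally as catalog.yaml (the filename is illustrative), it can be applied to the cluster with:

    oc apply -f catalog.yaml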
/label qe-approved
/approve
[APPROVALNOTIFIER] This PR is APPROVED.
This pull request has been approved by: jotak. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
- Start with a new "Monitoring" controller that deals with ServiceMonitors / Dashboards etc.
- Create a Status Manager that gathers all statuses from each controller
- Each status is eventually converted into a Condition (k8s status API), plus there is a global "Ready" status that is a merge of each component status

Also:
- Centralize the kubebuilder RBAC annotations in manager.go
- Update the displayed column for CLI status with the new conditions
- Keep just 1 status per component, not 2 (merged errors & progress into a single status)

- Use the status manager for FLP and its sub-components (ingest/transfo/monolith)
- Like status, namespace management needs to be per-controller, so I'm moving away from having a dedicated field in Status and using per-component annotations instead
- Simplify a bit the reconcilers' "managed objects" handling
- Less verbose narrowcache and log context, better use of named loggers
Description
edit: a later commit also extracted the FLP reconciler into a new controller
Previously
We had only 1 controller, the FlowCollector controller, which managed all deployed components (FLP, console plugin, agents, and a few other resources such as the monitoring dashboards).
The code was structured with each component managed by a "Reconciler". In terms of code structure, "reconcilers" are similar to "controllers", but at runtime it differs a lot because all "reconcilers" are managed synchronously, called from the same Reconcile loop, sharing reconcile events / cache & watches / etc.
Each reconciler was amending a global Status via hooks called SetChanged / SetInProgress. On any error, the whole reconcile loop would return, setting an error in the status conditions. The FlowCollector status was updated at the end of that single reconcile loop to reflect any new status.
Now, with this PR
We have 2 controllers, and we plan to create more in follow-up work. There's still the main / legacy FlowCollector controller, from which I extracted functions related to the monitoring stack (dashboards, roles, annotating namespace, etc.). These functions are now managed by the new controller called Monitoring.
So each controller has its own configuration related to cache / watches, its own reconcile loop and request queue. Their reconciliation code runs asynchronously from the others.
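For illustration, here is a minimal sketch of how such a dedicated controller typically registers its own watches with controller-runtime; the MonitoringReconciler type, the owned types, and the flowslatest alias for the FlowCollector API package are assumptions for this sketch, not a copy of the PR's code:

    import (
        "context"

        corev1 "k8s.io/api/core/v1"
        ctrl "sigs.k8s.io/controller-runtime"
        "sigs.k8s.io/controller-runtime/pkg/client"
        // flowslatest stands for the FlowCollector API package (import path omitted in this sketch)
    )

    // MonitoringReconciler is a hypothetical stand-in for the new Monitoring controller.
    type MonitoringReconciler struct {
        client.Client
    }

    // Reconcile handles only the monitoring-related resources (dashboards, roles, ...).
    func (r *MonitoringReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
        // ... monitoring-specific reconciliation would go here
        return ctrl.Result{}, nil
    }

    // SetupWithManager gives this controller its own watches and reconcile queue,
    // independent from the main FlowCollector controller.
    func (r *MonitoringReconciler) SetupWithManager(mgr ctrl.Manager) error {
        return ctrl.NewControllerManagedBy(mgr).
            For(&flowslatest.FlowCollector{}).
            Owns(&corev1.ConfigMap{}). // e.g. dashboards stored as config maps
            Complete(r)
    }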
Status management still needs to be coordinated, since each controller may need to change the global status. This is achieved with a new Status Manager that holds a map keeping track of each controller's status, in a thread-safe way.
Each controller can call the status manager's functions to update its component's status, such as r.status.SetFailure("MonitoringError", err.Error()) or r.status.SetReady().
Typically, in a reconcile loop, a controller would:
- defer r.status.Commit(ctx, r.Client) to commit (synchronize) the status at the end of Reconcile
- call CheckDaemonSetProgress and CheckDeploymentProgress (they are called in the shared functions ReconcileDeployment / ReconcileDaemonSet)
- call r.status.SetFailure or r.status.Error (<= syntactic sugar) on error
When the Commit function is called, the Status Manager merges all statuses into a global one and writes it to the FlowCollector CR.
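To make this concrete, here is a hedged sketch of that pattern, reusing the Manager / ComponentStatus types from the sync.Map sketch earlier in this thread; the componentReconciler type, the reconcileComponent helper, and the merge logic are assumptions, not the PR's implementation:

    import (
        "context"

        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        ctrl "sigs.k8s.io/controller-runtime"
    )

    // Sketch of the per-controller usage pattern described above.
    func (r *componentReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
        // Synchronize this component's status to the FlowCollector CR when done.
        defer r.status.Commit(ctx, r.Client)

        if err := r.reconcileComponent(ctx); err != nil {
            // Record the failure; Commit will surface it as a condition.
            r.status.SetFailure("ReconcileError", err.Error())
            return ctrl.Result{}, err
        }
        r.status.SetReady()
        return ctrl.Result{}, nil
    }

    // Sketch of how the manager could merge all component statuses into a single
    // global "Ready" condition before writing it to the FlowCollector CR.
    func (m *Manager) mergeReady() metav1.Condition {
        ready := metav1.Condition{Type: "Ready", Status: metav1.ConditionTrue, Reason: "Ready"}
        m.statuses.Range(func(_, v any) bool {
            s := v.(ComponentStatus)
            if s.Status != "Ready" {
                ready.Status = metav1.ConditionFalse
                ready.Reason = s.Reason
                ready.Message = s.Message
                return false // stop at the first non-ready component
            }
            return true
        })
        return ready
    }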
Dependencies
n/a
Checklist
If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.