NE-367: Add logLevel and operatorLogLevel APIs for DNS #931

miheer · 2021-10-14T10:22:51Z

Add logLevel and operatorLogLevel APIs for DNS https://issues.redhat.com/browse/NE-367

miheer · 2021-10-14T10:59:55Z

Miciah

I have some comments regarding markup issues, copy edits, and points of clarification, but overall this looks great.

Miciah · 2021-10-14T22:47:25Z

enhancements/dns/dns-operator-operand-logging-level.md

+Supporting a trivial way to raise the verbosity of the DNS Operator and it's Operands (CoreDNS) would make debugging
+cluster dns operator and CoreDNS issues easier for cluster administrators and OpenShift developers.


Suggested change

-Supporting a trivial way to raise the verbosity of the DNS Operator and it's Operands (CoreDNS) would make debugging
-cluster dns operator and CoreDNS issues easier for cluster administrators and OpenShift developers.
+Supporting a trivial way to raise the verbosity of the DNS Operator and its Operands (CoreDNS) would make debugging
+the Operator and CoreDNS issues easier for cluster administrators and OpenShift developers.

Miciah · 2021-10-14T22:49:46Z

enhancements/dns/dns-operator-operand-logging-level.md

+Supporting a trivial way to raise the verbosity of the DNS Operator and it's Operands (CoreDNS) would make debugging
+cluster dns operator and CoreDNS issues easier for cluster administrators and OpenShift developers.
+
+To have the CoreDNS enable the following classes such as error, denial and all.


This isn't a complete sentence.

Suggested change

-To have the CoreDNS enable the following classes such as error, denial and all.
+For logging purposes, CoreDNS defines several classes of responses, such as error, denial and all.

Miciah · 2021-10-14T22:51:33Z

enhancements/dns/dns-operator-operand-logging-level.md

+To have the CoreDNS enable the following classes such as error, denial and all.
+denial: either NXDOMAIN or nodata responses (Name exists, type does not). A nodata response sets the return code to NOERROR.
+error: SERVFAIL, NOTIMP, REFUSED, etc. Anything that indicates the remote server is not willing to resolve the request.
+all: the default - nothing is specified. Using of this class means that all messages will be logged whatever we mix together with "all".


What do you mean by "nothing is specified"? I'm not sure what you mean by "whatever we mix together with 'all'"; do you mean that "all" takes precedence when CoreDNS is configured to log a list of classes that includes "all"? Is that important to mention here? Remember, this is the motivation section; implementation details belong elsewhere.

Suggested change

-all: the default - nothing is specified. Using of this class means that all messages will be logged whatever we mix together with "all".
+all: all responses, including successful responses, errors, and denials.

Miciah · 2021-10-14T22:55:38Z

enhancements/dns/dns-operator-operand-logging-level.md

+desire more in-depth logging statements when working on the operator's controllers.
+
+Additionally, a logging level API for CoreDNS logs would assist cluster administrators who wish to have more control
+over the output of their DNSController's CoreDNS logs.


Suggested change

-over the output of their DNSController's CoreDNS logs.
+over CoreDNS logs.

Miciah · 2021-10-14T22:56:13Z

enhancements/dns/dns-operator-operand-logging-level.md

+
+### Non-Goals
+
+* Change the default logging verbosity of the DNS Operator or the CoreDNS in production OCP clusters.


Suggested change

-* Change the default logging verbosity of the DNS Operator or the CoreDNS in production OCP clusters.
+* Change the default logging verbosity of the DNS Operator or CoreDNS in production OCP clusters.

Miciah · 2021-10-14T23:33:09Z

enhancements/dns/dns-operator-operand-logging-level.md

+
+### Risks and Mitigations
+
+Raising the logging verbosity for any component typically results in larger log files that grow quickly.


If possible, each risk should have a mitigation. For example, the godoc for the new API field could include a warning that setting logLevel: Trace will produce extremely verbose logs.

@Miciah Do you mean that I should add a comment in the API PR and regenerate the godoc?

No, I mean the point of having a "Risks and Mitigations" section is that you'll describe the mitigation for each risk.

Miciah · 2021-10-14T23:33:51Z

enhancements/dns/dns-operator-operand-logging-level.md

+### Upgrade / Downgrade Strategy
+
+On downgrade, any logging options are ignored by the DNS Operator and CoreDNS.
+A harmless logging level configmap in the `openshift-dns` namespace may be left behind.


Why would there be an extra configmap?

I think I need to remove this. The downgraded operator will update the configmap and delete the log stanzas.

Miciah · 2021-10-14T23:37:40Z

enhancements/dns/dns-operator-operand-logging-level.md

+Also, a logging level API for the DNS Operator would assist OpenShift developers working on the DNS Operator who may
+desire more in-depth logging statements when working on the operator's controllers.
+
+Additionally, a logging level API for CoreDNS logs would assist cluster administrators who wish to have more control
+over the output of their DNSController's CoreDNS logs.


This section would flow better if you swapped these two paragraphs (adjusting transitions as necessary).

Miciah · 2021-10-14T23:38:09Z

enhancements/dns/dns-operator-operand-logging-level.md

+
+### Goals
+
+* Add a user-facing API for controlling the run-time verbosity of the [OpenShift DNS Operator and CoreDNS](https://github.com/openshift/cluster-dns-operator)


Suggested change

-* Add a user-facing API for controlling the run-time verbosity of the [OpenShift DNS Operator and CoreDNS](https://github.com/openshift/cluster-dns-operator)
+* Add a user-facing API for controlling the run-time verbosity of the [OpenShift DNS Operator and CoreDNS](https://github.com/openshift/cluster-dns-operator).

Miciah · 2021-10-14T23:50:26Z

enhancements/dns/dns-operator-operand-logging-level.md

+
+Valid values for logLevel are: "Normal", "Debug", "Trace", "TraceAll" as per [Operator Api](https://github.com/openshift/api/blob/master/operator/v1/types.go#L62)
+
+We will be enabling the CLASSES(https://github.com/coredns/coredns/tree/master/plugin/log#syntax) of coredns w.r.t to the logLevel we have defined in openshift api.


Suggested change

-We will be enabling the CLASSES(https://github.com/coredns/coredns/tree/master/plugin/log#syntax) of coredns w.r.t to the logLevel we have defined in openshift api.
+We will enable logging of CoreDNS's [classes of responses](https://github.com/coredns/coredns/tree/master/plugin/log#syntax) that correspond to the log level specified in the API.
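
To make the correspondence between log levels and CoreDNS response classes concrete, here is a minimal Go sketch. It assumes the mapping described in the godoc quoted later in this thread (Normal logs errors, Debug adds denials, Trace logs all responses). The helper names `coreDNSLogClasses` and `logStanza` are hypothetical and are not taken from the operator's code. The rendered block follows the `log` plugin's documented `class` syntax.

```go
package main

import (
	"fmt"
	"strings"
)

// DNSLogLevel mirrors the string type proposed in the enhancement.
type DNSLogLevel string

const (
	DNSLogLevelNormal DNSLogLevel = "Normal"
	DNSLogLevelDebug  DNSLogLevel = "Debug"
	DNSLogLevelTrace  DNSLogLevel = "Trace"
)

// coreDNSLogClasses is a hypothetical helper that returns the CoreDNS
// log-plugin classes to enable for a log level. Empty or unknown values
// fall back to the classes for Normal (an assumption of this sketch).
func coreDNSLogClasses(level DNSLogLevel) []string {
	switch level {
	case DNSLogLevelDebug:
		return []string{"denial", "error"}
	case DNSLogLevelTrace:
		return []string{"all"}
	default:
		return []string{"error"}
	}
}

// logStanza renders a Corefile "log" block for the given level.
func logStanza(level DNSLogLevel) string {
	classes := strings.Join(coreDNSLogClasses(level), " ")
	return fmt.Sprintf("log . {\n    class %s\n}", classes)
}

func main() {
	fmt.Println(logStanza(DNSLogLevelDebug))
}
```

Keeping the mapping in one small function would let the Corefile template stay unaware of the API's level names, though the actual operator may structure this differently.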

sttts · 2021-10-15T07:44:27Z

enhancements/dns/dns-operator-operand-logging-level.md

+	// See LogLevel for more information about each available logging level.
+	//
+	// +optional
+	OperatorLogLevel LogLevel `json:"operatorLogLevel"`


Add defaulting to the standard log level.

sttts · 2021-10-15T07:44:40Z

enhancements/dns/dns-operator-operand-logging-level.md

+	// logLevel describes the logging verbosity of the DNSController for CoreDNS.
+	//
+	// +optional
+	LogLevel LogLevel `json:"logLevel"`


Add defaulting to the standard log level.

Sorry @sttts, I forgot to update this here; I had already added it in the API PR, openshift/api#1031.
@sttts, it would be great if you could review openshift/api#1031.

@sttts PTAL, I just added the changes you requested. Thanks in advance!

Commented on the API PR.
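
For readers following the defaulting request, here is a minimal Go sketch of how a consumer of the field might treat an unset value as the standard level. The API-side default itself would be expressed with a `+kubebuilder:default=Normal` marker, as in the snippet quoted later in this thread. The helper `effectiveLogLevel` is hypothetical and only illustrates a defensive fallback; it is not the operator's actual code.

```go
package main

import "fmt"

// DNSLogLevel mirrors the string type proposed in the enhancement.
type DNSLogLevel string

const (
	DNSLogLevelNormal DNSLogLevel = "Normal"
	DNSLogLevelDebug  DNSLogLevel = "Debug"
	DNSLogLevelTrace  DNSLogLevel = "Trace"
)

// effectiveLogLevel is a hypothetical helper: an empty value (for
// example, on an object that was never defaulted by the API server)
// is treated as Normal; any other value is returned unchanged.
func effectiveLogLevel(level DNSLogLevel) DNSLogLevel {
	if level == "" {
		return DNSLogLevelNormal
	}
	return level
}

func main() {
	fmt.Println(effectiveLogLevel(""))
	fmt.Println(effectiveLogLevel(DNSLogLevelTrace))
}
```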

openshift-bot · 2021-11-12T16:09:05Z

Inactive enhancement proposals go stale after 28d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle stale.
Stale proposals rot after an additional 7d of inactivity and eventually close.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle stale

openshift-bot · 2021-11-19T16:15:19Z

Stale enhancement proposals rot after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Rotten proposals close after an additional 7d of inactivity.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

miheer · 2021-11-26T02:27:44Z

@Miciah @alebedev87 @arjunrn PTAL

openshift-bot · 2021-12-03T10:29:12Z

Rotten enhancement proposals close after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Reopen the proposal by commenting /reopen.
Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Exclude this proposal from closing again by commenting /lifecycle frozen.

/close

openshift-ci · 2021-12-03T10:29:41Z

@openshift-bot: Closed this PR.

In response to this:

Rotten enhancement proposals close after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Reopen the proposal by commenting /reopen.
Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Exclude this proposal from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

miheer · 2021-12-11T03:11:00Z

/remove-lifecycle rotten

miheer · 2021-12-11T03:13:36Z

/reopen

openshift-ci · 2021-12-11T03:15:08Z

@miheer: Reopened this PR.

In response to this:

/reopen

Miciah · 2021-12-11T03:14:35Z

enhancements/dns/dns-operator-operand-logging-level.md

+For logging purposes, CoreDNS defines several classes of responses, such as error, denial and all.
+denial: either NXDOMAIN or nodata responses (Name exists, type does not). A nodata response sets the return code to NOERROR.
+error: SERVFAIL, NOTIMP, REFUSED, etc. Anything that indicates the remote server is not willing to resolve the request.
+all: all responses, including successful responses, errors, and denials.


Please fix the formatting.

Suggested change

-For logging purposes, CoreDNS defines several classes of responses, such as error, denial and all.
-denial: either NXDOMAIN or nodata responses (Name exists, type does not). A nodata response sets the return code to NOERROR.
-error: SERVFAIL, NOTIMP, REFUSED, etc. Anything that indicates the remote server is not willing to resolve the request.
-all: all responses, including successful responses, errors, and denials.
+For logging purposes, CoreDNS defines several classes of responses:
+* denial: NXDOMAIN and nodata responses (name exists, record type does not). A nodata response sets the return code to NOERROR.
+* error: SERVFAIL, NOTIMP, REFUSED, etc. Anything that indicates the remote server is not willing to resolve the request.
+* all: all responses, including successful responses, errors, and denials.

Miciah · 2021-12-11T03:18:17Z

enhancements/dns/dns-operator-operand-logging-level.md

+## Proposal
+
+### DNS Operator Log Level API
+We will be creating an API for **field** `operatorLogLevel` in DNSSpec in accordance with `DNSLogLevel` type.


Why the emphasis? You ended up defining a new type, right?

Suggested change

-We will be creating an API for **field** `operatorLogLevel` in DNSSpec in accordance with `DNSLogLevel` type.
+We will be defining a new API field `operatorLogLevel` in `DNSSpec` with newly defined type `DNSLogLevel`.
+This type is similar to the existing `LogLevel` type except that the values of `DNSLogLevel` are a subset of the values of `LogLevel`.

Miciah · 2021-12-11T03:18:35Z

enhancements/dns/dns-operator-operand-logging-level.md

+
+### DNS Operator Log Level API
+We will be creating an API for **field** `operatorLogLevel` in DNSSpec in accordance with `DNSLogLevel` type.
+Valid values will be : "Normal", "Debug", "Trace".


Suggested change

-Valid values will be : "Normal", "Debug", "Trace".
+Valid values will be the following: "Normal", "Debug", "Trace".

Miciah · 2021-12-11T03:21:30Z

enhancements/dns/dns-operator-operand-logging-level.md

+The CoreDNS reloads its configuration without requiring a restart, so the operator can adjust CoreDNS's log level just by updating the Corefile configmap without need to restart the pod.
+
+
+We will be adding an API field `operatorlogLevel` in DNSSpec for the type DNSLogLevel


Suggested change

-We will be adding an API field `operatorlogLevel` in DNSSpec for the type DNSLogLevel
+We will be adding an API field `operatorlogLevel` in `DNSSpec` with the type `DNSLogLevel`:

Miciah · 2021-12-11T03:22:25Z

enhancements/dns/dns-operator-operand-logging-level.md

+```
+This new field would allow a cluster administrator to specify the desired logging level specifically for the DNS Operator.
+
+Additionally, a new `LogLevel` of type `DNSLogLevel` will be added for CoredDNS logging :


Suggested change

-Additionally, a new `LogLevel` of type `DNSLogLevel` will be added for CoredDNS logging :
+Additionally, a new API field `LogLevel` of type `DNSLogLevel` will be added to specify the log level for CoreDNS:

Miciah · 2021-12-11T03:27:03Z

enhancements/dns/dns-operator-operand-logging-level.md

+  Adding this prometheus alert is nice, but it would be more useful we can see which request are getting SERVFAIL response.
+  So we would to enable the log plugin for CoreDNS to log queries.


Suggested change

-Adding this prometheus alert is nice, but it would be more useful we can see which request are getting SERVFAIL response.
-So we would to enable the log plugin for CoreDNS to log queries.
+Adding this Prometheus alert is useful, but it would be more useful if we could see which requests were getting SERVFAIL responses.
+So we would like to configure the log plugin for CoreDNS to log queries.

Miciah · 2021-12-11T03:27:53Z

enhancements/dns/dns-operator-operand-logging-level.md

+* Some users recently added the new alert 'CoreDNS is returning SERVFAIL for X% of requests alert' to the recent updates of OCP.
+  Adding this prometheus alert is nice, but it would be more useful we can see which request are getting SERVFAIL response.
+  So we would to enable the log plugin for CoreDNS to log queries.
+
+* Some user want to avoid use of tcpdump to see the queries and want log plugin to be enabled to log queries in coredns.


I believe my previous comment here still applies.

Miciah · 2021-12-11T03:29:04Z

enhancements/dns/dns-operator-operand-logging-level.md

+
+### Risks and Mitigations
+
+Raising the logging verbosity for any component typically results in larger log files that grow quickly.


No, I mean the point of having a "Risks and Mitigations" section is that you'll describe the mitigation for each risk.

Miciah · 2021-12-11T03:31:26Z

enhancements/dns/dns-operator-operand-logging-level.md

+* Don't provide any DNS logging level APIs for the operator and coredns (current behavior)
+* Raise current verbosity of the DNS Operator and coredns (not desirable)


A third alternative is tcpdump.

Normally we'd want a little more discussion as to why the alternative was not ~~considered~~ used.

Miciah · 2021-12-11T03:31:28Z

enhancements/dns/dns-operator-operand-logging-level.md

+### API Extensions
+
+### Operational Aspects of API Extensions
+
+#### Failure Modes
+
+#### Support Procedures


These should be filled out.

openshift-bot · 2022-01-08T05:08:00Z

Inactive enhancement proposals go stale after 28d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle stale.
Stale proposals rot after an additional 7d of inactivity and eventually close.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle stale

openshift-bot · 2022-01-15T11:04:59Z

Stale enhancement proposals rot after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Rotten proposals close after an additional 7d of inactivity.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

openshift-bot · 2022-01-22T11:07:04Z

Rotten enhancement proposals close after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Reopen the proposal by commenting /reopen.
Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Exclude this proposal from closing again by commenting /lifecycle frozen.

/close

openshift-ci · 2022-01-22T11:07:33Z

@openshift-bot: Closed this PR.

In response to this:

Rotten enhancement proposals close after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Reopen the proposal by commenting /reopen.
Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Exclude this proposal from closing again by commenting /lifecycle frozen.

/close

miheer · 2022-01-27T13:44:53Z

/remove-lifecycle rotten

miheer · 2022-01-27T13:45:15Z

/reopen

openshift-ci · 2022-01-27T13:45:47Z

@miheer: Reopened this PR.

In response to this:

/reopen

…com/browse/NE-367

Miciah

Thanks! We can address remaining issues in a follow-up.
/lgtm

Miciah · 2022-02-22T00:27:44Z

enhancements/dns/dns-operator-operand-logging-level.md

+* all: all responses, including successful responses, errors, and denials.
+A logging level API for CoreDNS logs would assist cluster administrators who wish to have more control
+over CoreDNS logs.


Needs a linebreak to prevent lines 53-54 from being formatted as part of the bullet on line 52.

Suggested change

-* all: all responses, including successful responses, errors, and denials.
-A logging level API for CoreDNS logs would assist cluster administrators who wish to have more control
-over CoreDNS logs.
+* all: all responses, including successful responses, errors, and denials.
+
+A logging level API for CoreDNS logs would assist cluster administrators who wish to have more control
+over CoreDNS logs.

Miciah · 2022-02-22T00:30:18Z

enhancements/dns/dns-operator-operand-logging-level.md

+* Some users want to add new prometheus alert 'CoreDNS is returning SERVFAIL for X% of requests alert' to the recent updates of OCP.
+  Adding this Prometheus alert is useful, but it would be more useful if we could see which requests were getting SERVFAIL responses.
+  So we would like to configure the log plugin for CoreDNS to log queries.


I'm still confused as to what the use-case is here, or how this relates to the existing alert: https://github.com/openshift/cluster-dns-operator/blob/531e38a2f640bfbefb7126d6e701afad8db6f911/manifests/0000_90_dns-operator_03_prometheusrules.yaml#L32-L43

Miciah · 2022-02-22T00:33:16Z

enhancements/dns/dns-operator-operand-logging-level.md

+```
+After the logging level on a Logger is set, log entries with that severity or anything above it will be logged.
+For example, `log.SetLevel(log.InfoLevel)` will log anything that is info or above (warn, error, fatal, panic).  This is the default log level.  
+So, we will be reading `operatorLogLevel` in a separate controller to watch dnses and setting log level.


You didn't end up using a separate controller, so this text should be updated when you get the chance.
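
As an illustration of the quoted text about setting the logger's level from `operatorLogLevel`, here is a minimal Go sketch. The quoted `log.SetLevel(log.InfoLevel)` call appears to be logrus. To stay self-contained, this sketch does not import logrus. Instead, the hypothetical helper `logrusLevelName` returns a level name that `logrus.ParseLevel` accepts, so the result could then be passed to `SetLevel`. Only Normal mapping to info follows from the quoted text (info is the default). The Debug and Trace mappings are assumptions of this sketch.

```go
package main

import "fmt"

// DNSLogLevel mirrors the string type proposed in the enhancement.
type DNSLogLevel string

const (
	DNSLogLevelNormal DNSLogLevel = "Normal"
	DNSLogLevelDebug  DNSLogLevel = "Debug"
	DNSLogLevelTrace  DNSLogLevel = "Trace"
)

// logrusLevelName is a hypothetical mapping from operatorLogLevel to a
// logrus level name. Empty or unknown values map to "info", matching
// the default level mentioned in the quoted text.
func logrusLevelName(level DNSLogLevel) string {
	switch level {
	case DNSLogLevelDebug:
		return "debug"
	case DNSLogLevelTrace:
		return "trace"
	default:
		return "info"
	}
}

func main() {
	levels := []DNSLogLevel{DNSLogLevelNormal, DNSLogLevelDebug, DNSLogLevelTrace}
	for _, level := range levels {
		fmt.Printf("%s -> %s\n", level, logrusLevelName(level))
	}
}
```

Because this is a pure function, it can be called from whichever controller ends up reconciling the DNS object, so it does not depend on a separate controller existing.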

Miciah · 2022-02-22T00:37:54Z

enhancements/dns/dns-operator-operand-logging-level.md

+```go
+// operatorLogLevel controls the logging level of the DNS Operator.
+// Valid values are: "Normal", "Debug", "Trace".
+// Defaults to "Normal".
+// setting operatorLogLevel: Trace will produce extremely verbose logs.
+// +optional
+// +kubebuilder:default=Normal
+OperatorLogLevel DNSLogLevel `json:"operatorLogLevel,omitempty"`
+```
+This new field would allow a cluster administrator to specify the desired logging level specifically for the DNS Operator.
+
+Additionally, a new API field `LogLevel` of type `DNSLogLevel` will be added to specify the log level for CoreDNS:
+```go
+// logLevel describes the desired logging verbosity for CoreDNS.
+// Any one of the following values may be specified:
+// * Normal logs errors from upstream resolvers.
+// * Debug logs errors, NXDOMAIN responses, and NODATA responses.
+// * Trace logs errors and all responses.
+//  Setting logLevel: Trace will produce extremely verbose logs.
+// Valid values are: "Normal", "Debug", "Trace".
+// Defaults to "Normal".
+// +optional
+// +kubebuilder:default=Normal
+LogLevel DNSLogLevel `json:"logLevel,omitempty"`
+```
+
+Both of these new API fields use the aforementioned `DNSLogLevel` type, which is defined as follows:
+```go
+
+// +kubebuilder:validation:Enum:=Normal;Debug;Trace
+type DNSLogLevel string
+
+var (
+// Normal is the default.  Normal, working log information, everything is fine, but helpful notices for auditing or common operations.  In kube, this is probably glog=2.
+DNSLogLevelNormal DNSLogLevel = "Normal"
+
+// Debug is used when something went wrong.  Even common operations may be logged, and less helpful but more quantity of notices.  In kube, this is probably glog=4.
+DNSLogLevelDebug DNSLogLevel = "Debug"
+
+// Trace is used when something went really badly and even more verbose logs are needed.  Logging every function call as part of a common operation, to tracing execution of a query.  In kube, this is probably glog=6.
+DNSLogLevelTrace DNSLogLevel = "Trace"
+)
+
+```


The definitions we ended up with in openshift/api#1031 look a bit different; the enhancement text should be updated eventually.

Miciah · 2022-02-22T00:42:38Z

/approve

openshift-ci · 2022-02-22T00:43:08Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Miciah

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [Miciah]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci · 2022-02-22T00:48:53Z

@miheer: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

openshift-ci bot requested review from russellb and sttts October 14, 2021 10:23

miheer force-pushed the dns-logging branch 8 times, most recently from fbfa319 to f6e231e Compare October 14, 2021 10:55

Miciah reviewed Oct 14, 2021

Miciah mentioned this pull request Oct 15, 2021

Add logLevel and operatorLogLevel APIs for DNS https://issues.redhat.com/browse/NE-367 openshift/api#1031

Merged

miheer force-pushed the dns-logging branch from f6e231e to 194f83d Compare October 15, 2021 04:26

sttts reviewed Oct 15, 2021

miheer force-pushed the dns-logging branch 3 times, most recently from 095c433 to 4d713d6 Compare October 15, 2021 13:35

openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 12, 2021

openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Nov 19, 2021

miheer force-pushed the dns-logging branch from 4d713d6 to a6e5629 Compare November 26, 2021 02:25

miheer force-pushed the dns-logging branch from a6e5629 to d7872e1 Compare November 26, 2021 06:00

openshift-ci bot closed this Dec 3, 2021

openshift-ci bot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Dec 11, 2021

openshift-ci bot reopened this Dec 11, 2021

Miciah reviewed Dec 11, 2021

openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 8, 2022

openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 15, 2022

openshift-ci bot closed this Jan 22, 2022

openshift-ci bot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jan 27, 2022

openshift-ci bot reopened this Jan 27, 2022

miheer force-pushed the dns-logging branch from d7872e1 to 7aca3af Compare January 28, 2022 03:11

Add logLevel and operatorLogLevel APIs for DNS https://issues.redhat.…

3b4560a

…com/browse/NE-367

miheer force-pushed the dns-logging branch from 7aca3af to 3b4560a Compare January 28, 2022 03:18

Miciah reviewed Feb 22, 2022

openshift-ci bot assigned Miciah Feb 22, 2022

openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Feb 22, 2022

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 22, 2022

openshift-merge-robot merged commit fd51690 into openshift:master Feb 22, 2022
