Enhancement proposal for monitoring Windows Nodes
Enhancement proposal for enabling monitoring on
Windows nodes created by Windows Machine Config Operator(WMCO).
VaishnaviHire committed Mar 10, 2021
1 parent 9eb5f69 commit 86f3541
Showing 1 changed file with 239 additions and 0 deletions.
239 changes: 239 additions & 0 deletions enhancements/windows-containers/monitoring-windows-nodes.md
@@ -0,0 +1,239 @@
---
title: monitoring-windows-nodes
authors:
- "@VaishnaviHire"
- "@PratikMahajan"
reviewers:
- "@openshift/openshift-team-windows-containers"
- "@simonpasquier"
- "@spadgett"
approvers:
- "@aravindhp"
- "@simonpasquier"
creation-date: 2021-02-08
last-updated: 2021-03-04
status: implementable
---

# Monitoring Windows Nodes

## Release Signoff Checklist

- [x] Enhancement is `implementable`
- [x] Design details are appropriately documented from clear requirements
- [x] Test plan is defined
- [ ] Operational readiness criteria is defined
- [x] Graduation criteria for dev preview, tech preview, GA
- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/)

## Summary

The intent of this enhancement is to enable performance monitoring on Windows
nodes created by the Windows Machine Config Operator (WMCO) in an OpenShift
cluster.

## Motivation

Monitoring is critical for identifying issues with nodes and with the
containers running on them. The main motivation behind this enhancement is to
enable monitoring on the Windows nodes.

### Goals

As part of this enhancement, we plan to do the following:
* Run [windows_exporter](https://github.com/prometheus-community/windows_exporter)
as a service on Windows nodes
* Upgrade the windows_exporter on the Windows Nodes
* Leverage the cluster monitoring operator, which sets up Prometheus,
Alertmanager and other components

### Non-Goals

As part of this enhancement, we do not plan to do the following:
* Integrate windows_exporter with the cluster monitoring operator
* Ship Grafana dashboards for Windows nodes

## Proposal

The main idea here is to run windows_exporter as a Windows service and let
the Prometheus instance that was provisioned as part of the OpenShift install
collect data from windows_exporter. The metrics exposed by windows_exporter
will be used to display console graphs for Windows nodes.

## Justification

Unlike [Node exporter](https://github.com/prometheus/node_exporter) on Linux
nodes, windows_exporter cannot run as a container on the Windows nodes, since
Windows container images contain a Windows kernel and Red Hat has a policy not
to ship third-party kernels for support reasons. Please refer to the [WMCO
enhancement](https://github.com/openshift/enhancements/blob/master/enhancements/windows-containers/windows-machine-config-operator.md#justification)
for more details.

### Risks and Mitigations

* Running `windows_exporter` as a Windows service poses a risk of having
inadequate resources to run the service if the Windows node is overwhelmed
with workload containers. This can be mitigated by leveraging [priority
classes](https://docs.microsoft.com/en-us/windows/win32/procthread/scheduling-priorities) for
Windows processes. This is similar to what is being done for other [Windows
services](https://issues.redhat.com/browse/WINC-534).

* One risk with the current approach is that it relies on renaming Windows
metrics to display pod graphs. The pod metrics for Linux come from cAdvisor,
but cAdvisor does not provide the same metrics for Windows nodes. As a result,
pod graphs cannot be displayed by creating custom recording rules that reuse
the Linux console queries. To mitigate this, pod graphs need to use the
metrics exposed by windows_exporter, as described in [Future
Plans](#future-plans). This also requires changing the console queries to
support OS-specific metrics.

## Design Details

Because windows_exporter cannot run as a [container](#justification) on the
Windows node, WMCO captures its data by creating a
`windows-machine-config-operator-metrics` Service without selectors and
manually defining an Endpoints object for that Service. The Endpoints object
has an entry for the `<internal-ip>:9182/metrics` endpoint exposed by
windows_exporter on every Windows node. Once the Service and Endpoints object
are created, WMCO ensures that a Service Monitor for the
`windows-machine-config-operator-metrics` Service exists so that the
Prometheus operator can discover the targets created above and scrape Windows
metrics. The following design details reflect the current approach and the
future plans for monitoring support on Windows.
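
As a rough illustration of the selector-less Service and its manually defined Endpoints object, the following Python sketch builds both manifests as plain dictionaries. The Service name, namespace and port 9182 come from this proposal; the port name `metrics`, the `name` label and the helper function names are assumptions made for the example and are not taken from the WMCO source.

```python
import json

NAMESPACE = "openshift-windows-machine-config-operator"
SERVICE_NAME = "windows-machine-config-operator-metrics"
METRICS_PORT = 9182  # default windows_exporter port mentioned in this proposal


def build_metrics_service():
    """Build a Service without a selector, so Kubernetes does not manage
    its Endpoints automatically."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": SERVICE_NAME,
            "namespace": NAMESPACE,
            "labels": {"name": SERVICE_NAME},  # assumed label
        },
        "spec": {
            # No "selector" key: the Endpoints object is defined by hand.
            "ports": [
                {
                    "name": "metrics",  # assumed port name
                    "port": METRICS_PORT,
                    "targetPort": METRICS_PORT,
                    "protocol": "TCP",
                }
            ],
        },
    }


def build_metrics_endpoints(node_ips):
    """Build an Endpoints object listing every Windows node's internal IP.

    The Endpoints name must match the Service name for the two to be linked.
    """
    subsets = []
    if node_ips:
        subsets.append(
            {
                "addresses": [{"ip": ip} for ip in sorted(node_ips)],
                "ports": [
                    {"name": "metrics", "port": METRICS_PORT, "protocol": "TCP"}
                ],
            }
        )
    return {
        "apiVersion": "v1",
        "kind": "Endpoints",
        "metadata": {"name": SERVICE_NAME, "namespace": NAMESPACE},
        "subsets": subsets,
    }


if __name__ == "__main__":
    # Hypothetical internal IPs of two Windows nodes.
    print(json.dumps(build_metrics_endpoints(["10.0.0.12", "10.0.0.11"]), indent=2))
```

Keeping the Endpoints object in step with the node list (adding and removing addresses as Windows nodes come and go) is the part that the operator has to own in this design.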

### Current State

To enable basic monitoring support for Windows nodes, WMCO has done the
following:

* Build and add the windows_exporter binary to the WMCO payload.
* Install windows_exporter on the Windows nodes and ensure
that it runs as a Windows service.
* Add the `openshift.io/cluster-monitoring=true` label to the
`openshift-windows-machine-config-operator` namespace so that the cluster
monitoring stack will pick up the Service Monitor created by WMCO.
* Add privileges to WMCO to create Services, Endpoints and Service Monitors in
the `openshift-windows-machine-config-operator` namespace.
* Create a Service and an Endpoints object in the
`openshift-windows-machine-config-operator` namespace that point to the
windows_exporter endpoint. WMCO uses default values to define the metrics
endpoint, `<internal-ip>:9182/metrics`, exposed by windows_exporter for every
Windows node. The Endpoints object created in the namespace consists of
subsets of endpoints from all the Windows nodes.
* Create a Service Monitor in the `openshift-windows-machine-config-operator`
namespace for the Service created above.
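
The namespace label and Service Monitor steps above can be sketched as follows. This is a minimal Python illustration that builds the manifests as dictionaries; the `ServiceMonitor` fields used (`endpoints`, `selector`, `namespaceSelector`) are part of the Prometheus operator API, while the Service label, port name, scrape interval and function names are assumptions for the example.

```python
import copy

NAMESPACE = "openshift-windows-machine-config-operator"
SERVICE_NAME = "windows-machine-config-operator-metrics"
MONITORING_LABEL = "openshift.io/cluster-monitoring"


def with_cluster_monitoring_label(namespace_manifest):
    """Return a copy of a Namespace manifest carrying the label that lets
    the cluster monitoring stack pick up Service Monitors in it."""
    labeled = copy.deepcopy(namespace_manifest)
    labels = labeled.setdefault("metadata", {}).setdefault("labels", {})
    labels[MONITORING_LABEL] = "true"
    return labeled


def build_service_monitor(service_labels, port_name="metrics", interval="30s"):
    """Build a ServiceMonitor that selects the metrics Service by label.

    service_labels, port_name and interval are assumed values, not taken
    from the WMCO source.
    """
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {"name": SERVICE_NAME, "namespace": NAMESPACE},
        "spec": {
            "endpoints": [
                {"port": port_name, "path": "/metrics", "interval": interval}
            ],
            "namespaceSelector": {"matchNames": [NAMESPACE]},
            "selector": {"matchLabels": dict(service_labels)},
        },
    }


if __name__ == "__main__":
    ns = {"apiVersion": "v1", "kind": "Namespace", "metadata": {"name": NAMESPACE}}
    print(with_cluster_monitoring_label(ns)["metadata"]["labels"])
    print(build_service_monitor({"name": SERVICE_NAME})["spec"]["selector"])
```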

To display node graphs, WMCO has done the following:

* Add custom Prometheus rules in the `openshift-windows-machine-config-operator`
namespace. The custom recording rules are created using Windows metrics
exposed by windows_exporter and have the same names as the Linux
recording rules, so that the console can reuse the same queries it uses for
Linux.
* Note that WMCO is unable to display pod graphs for the Windows nodes
with the current implementation. See [Risks and Mitigations](#risks-and-mitigations)
for details.
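
To make the idea of Windows recording rules that reuse Linux names concrete, here is a small Python sketch that builds a `PrometheusRule` manifest. The two memory pairs are taken from the node metrics table in this proposal (the rows with no label difference); the rule name, group name and the one-to-one expressions are assumptions, and the rules WMCO actually ships may differ.

```python
NAMESPACE = "openshift-windows-machine-config-operator"

# Linux name -> Windows expression. Only the pairs that this proposal lists
# as having no label difference are used here.
LINUX_NAME_TO_WINDOWS_EXPR = {
    "node_memory_MemTotal_bytes": "windows_cs_physical_memory_bytes",
    "node_memory_MemAvailable_bytes": "windows_memory_available_bytes",
}


def build_prometheus_rule(name, record_to_expr, group_name="windows.rules"):
    """Build a PrometheusRule whose recording rules expose Windows data
    under the names used for Linux. name and group_name are hypothetical."""
    rules = [
        {"record": record, "expr": expr}
        for record, expr in sorted(record_to_expr.items())
    ]
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "PrometheusRule",
        "metadata": {"name": name, "namespace": NAMESPACE},
        "spec": {"groups": [{"name": group_name, "rules": rules}]},
    }


if __name__ == "__main__":
    manifest = build_prometheus_rule("windows-rules", LINUX_NAME_TO_WINDOWS_EXPR)
    for rule in manifest["spec"]["groups"][0]["rules"]:
        print(rule["record"], "<-", rule["expr"])
```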

### Future Plans

#### Displaying Console Graphs

* Going forward, we plan to display monitoring graphs by creating a
[common interface](https://issues.redhat.com/browse/WINC-530) for Windows
and Linux recording rules. The Monitoring team will define recording rules for
the metrics that have different `metric labels` on Linux and Windows. The
differences in `metric labels` for the metrics used for node graphs and pod
graphs are listed in the tables below.
The Windows team will align the Windows recording rules with these new
recording rules. The recording rules for Windows will be managed by
WMCO. With this set of common recording rules, a single query will return
results for both Linux and Windows nodes. The console queries
currently use some raw metrics such as `node_filesystem_size_bytes` and
`node_filesystem_free_bytes`. They would need to be updated to use
the new recording rules in place of the raw metrics. This will ensure
a consistent monitoring user experience across Linux and Windows.
* In the cases where `metric labels` are equivalent, we plan to relabel the
Windows metrics to align with the Linux metrics.

**Node Metrics:**

| Node Exporter                  | Windows Exporter                 | Label Difference                                                        |
|--------------------------------|----------------------------------|-------------------------------------------------------------------------|
| node_memory_MemTotal_bytes     | windows_cs_physical_memory_bytes | -                                                                       |
| node_memory_MemAvailable_bytes | windows_memory_available_bytes   | -                                                                       |
| node_filesystem_size_bytes     | windows_logical_disk_size_bytes  | Missing Label: (device, mountpoint, fstype) Additional Label: (volume) |
| node_filesystem_free_bytes     | windows_logical_disk_free_bytes  | Missing Label: (device, mountpoint, fstype) Additional Label: (volume) |
| node_cpu_seconds_total         | windows_cpu_time_total           | Missing Label: (cpu) Additional Label: (core)                           |

**Pod Metrics:**

| Kubelet metrics | Windows Kubelet | Windows Exporter | Label Difference |
|----------------------------------------|----------------------|----------------------------------------------------------|-------------------------------------------------------------------------------------------------------------|
| kubelet_running_pods | kubelet_running_pods | windows_container_available | - |
| container_memory_working_set_bytes | - | windows_container_memory_usage_private_working_set_bytes | Missing Label: (image) Additional Label: (container_id) which is equivalent of (id) for Linux |
| container_cpu_usage_seconds_total | - | windows_container_cpu_usage_seconds_total | Missing Label: (image, metrics_path) Additional Label: (container_id) which is equivalent of (id) for Linux |
| container_fs_usage_bytes | - | - | |
| container_network_receive_bytes_total | - | windows_container_network_receive_bytes_total | Missing Label: (image, metrics_path) Additional Label: (container_id) which is equivalent of (id) for Linux |
| container_network_transmit_bytes_total | - | windows_container_network_transmit_bytes_total | Missing Label: (image, metrics_path) Additional Label: (container_id) which is equivalent of (id) for Linux |
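
For labels that the table describes as equivalent, such as `container_id` on Windows and `id` on Linux, the intended effect of relabeling can be illustrated with a few lines of Python. This only models the outcome on a sample's label set; it does not show the mechanism (for example Prometheus relabeling configuration) that would be used in practice, and the function name and sample values are made up for the example.

```python
# Windows label -> Linux label, for labels the pod metrics table calls equivalent.
EQUIVALENT_LABELS = {"container_id": "id"}


def align_labels(labels, renames=EQUIVALENT_LABELS):
    """Return a copy of a label set with Windows label names replaced by
    their Linux equivalents. Labels without a mapping are kept unchanged."""
    return {renames.get(key, key): value for key, value in labels.items()}


if __name__ == "__main__":
    # Hypothetical label set of a windows_container_* sample.
    sample = {"container_id": "docker://abc123", "instance": "10.0.0.11:9182"}
    print(align_labels(sample))
```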

#### Moving towards EndpointSlices

* Since the metrics Endpoints object is managed by WMCO, we plan to replace
the Endpoints object with [EndpointSlices](https://kubernetes.io/docs/concepts/services-networking/endpoint-slices/#motivation)
to improve performance. This can be done once the `prometheus-operator` has
[support](https://github.com/prometheus-operator/prometheus-operator/issues/3862)
for EndpointSlice objects.
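
A possible shape for the EndpointSlice-based layout is sketched below in Python. It splits the Windows node IPs into slices of at most 100 endpoints, which is the default slice size Kubernetes uses for the slices it manages itself. The slice naming scheme, port name and function name are assumptions, and `discovery.k8s.io/v1` is the API version in current Kubernetes releases (older clusters expose `v1beta1`).

```python
NAMESPACE = "openshift-windows-machine-config-operator"
SERVICE_NAME = "windows-machine-config-operator-metrics"
METRICS_PORT = 9182


def build_endpoint_slices(node_ips, max_per_slice=100):
    """Split node IPs across EndpointSlice manifests tied to the metrics
    Service through the kubernetes.io/service-name label."""
    ips = sorted(node_ips)
    slices = []
    for start in range(0, len(ips), max_per_slice):
        chunk = ips[start:start + max_per_slice]
        slices.append(
            {
                "apiVersion": "discovery.k8s.io/v1",
                "kind": "EndpointSlice",
                "metadata": {
                    # Hypothetical naming scheme: <service>-<slice index>.
                    "name": f"{SERVICE_NAME}-{start // max_per_slice}",
                    "namespace": NAMESPACE,
                    "labels": {"kubernetes.io/service-name": SERVICE_NAME},
                },
                "addressType": "IPv4",
                "endpoints": [{"addresses": [ip]} for ip in chunk],
                "ports": [
                    {"name": "metrics", "port": METRICS_PORT, "protocol": "TCP"}
                ],
            }
        )
    return slices


if __name__ == "__main__":
    # 250 hypothetical node IPs produce three slices.
    ips = [f"10.0.{i // 250}.{i % 250 + 1}" for i in range(250)]
    print([len(s["endpoints"]) for s in build_endpoint_slices(ips)])
```

With slices, adding or removing one Windows node only rewrites the slice that holds its address instead of one large Endpoints object.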

#### Securing windows_exporter endpoint

* Since windows_exporter is not running as a [pod](#justification), the
endpoint is not secure. If it were running inside a pod, we could use the CA
signer to provide a TLS cert/key to the service for authentication. Instead,
we plan to leverage windows_exporter's support for `https`
configuration. WMCO will be responsible for adding a [web config](https://github.com/prometheus/exporter-toolkit/blob/master/docs/web-configuration.md)
for TLS. This will ensure that the metrics endpoint is able to
authenticate requests.
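
As a sketch of what the web config step could produce, the Python function below renders a minimal exporter-toolkit web configuration file. The keys `tls_server_config`, `cert_file`, `key_file`, `client_auth_type` and `client_ca_file` are documented by exporter-toolkit; the file paths, the function name and the choice to require client certificates are assumptions. How the file is delivered to the node and passed to windows_exporter is not shown.

```python
import json


def render_web_config(cert_file, key_file, client_ca_file=None):
    """Render an exporter-toolkit web config as YAML text.

    json.dumps is used to quote paths: a JSON string is also a valid
    double-quoted YAML scalar, which keeps Windows backslashes escaped.
    """
    lines = [
        "tls_server_config:",
        f"  cert_file: {json.dumps(cert_file)}",
        f"  key_file: {json.dumps(key_file)}",
    ]
    if client_ca_file:
        # Assumed policy: only accept scrapes that present a certificate
        # signed by the given CA.
        lines.append('  client_auth_type: "RequireAndVerifyClientCert"')
        lines.append(f"  client_ca_file: {json.dumps(client_ca_file)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical certificate locations on a Windows node.
    print(render_web_config("C:\\k\\tls\\tls.crt", "C:\\k\\tls\\tls.key"))
```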

#### Telemetry Rules

* We plan to ensure that the [telemetry rules](https://docs.openshift.com/container-platform/4.7/support/remote_health_monitoring/showing-data-collected-by-remote-health-monitoring.html#showing-data-collected-from-the-cluster_showing-data-collected-by-remote-health-monitoring)
also use metrics from Windows. This can be done by renaming the Windows
metrics to align with the metrics used in the telemetry rules. For example, the
`memory_usage_bytes:sum` rule uses `node_memory_MemTotal_bytes`, which is
defined in the Windows rules. We also need to test whether the existing
telemetry rules need to be updated, as the console queries do, when they
contain Linux-specific queries, for example rules with a
`job=node-exporter` attribute.
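
One way to find rules that would exclude Windows data is to scan their expressions for a Linux-only `job` matcher. The Python sketch below does this with a regular expression. It is a simplification: it only handles exact and regex matchers whose value is literally one of the listed job names, and the sample expressions in the usage example are hypothetical rather than the real telemetry rule definitions.

```python
import re

# Matches job="..." and job=~"..." label matchers, but not job!="...".
JOB_MATCHER = re.compile(r'\bjob\s*(?:=~|=)\s*"([^"]*)"')


def linux_only_rules(rules, linux_jobs=("node-exporter",)):
    """Return the names of rules whose expression pins a Linux-only job.

    rules is a list of {"record": name, "expr": promql} dictionaries.
    """
    flagged = []
    for rule in rules:
        jobs = JOB_MATCHER.findall(rule["expr"])
        if any(job in linux_jobs for job in jobs):
            flagged.append(rule["record"])
    return flagged


if __name__ == "__main__":
    # Hypothetical expressions, for illustration only.
    sample_rules = [
        {"record": "example:pinned", "expr": 'sum(node_memory_MemTotal_bytes{job="node-exporter"})'},
        {"record": "example:generic", "expr": "sum(node_memory_MemTotal_bytes)"},
    ]
    print(linux_only_rules(sample_rules))
```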

### Test Plan

The current tests verify that:
* The operator namespace, `openshift-windows-machine-config-operator`, has
the `openshift.io/cluster-monitoring=true` label.
* The Service, Endpoints and Service Monitor objects are created as expected.
* Prometheus is able to collect data from windows_exporter.
* Custom Prometheus rules return Windows data.
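
A check along the lines of "the exporter is serving the metrics we rely on" could be sketched in Python as follows. It parses text in the Prometheus exposition format and reports which required metric names are absent. The function names and the sample payload are made up for the example, and the real WMCO tests are written against a live cluster rather than a string.

```python
import re


def metric_names(exposition_text):
    """Collect metric names from Prometheus text exposition output,
    skipping blank lines and # HELP / # TYPE comments."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The metric name ends at the first '{' (labels) or whitespace (value).
        names.add(re.split(r"[{\s]", line, maxsplit=1)[0])
    return names


def missing_metrics(exposition_text, required):
    """Return the required metric names that the payload does not contain."""
    return sorted(set(required) - metric_names(exposition_text))


if __name__ == "__main__":
    # Hypothetical scrape payload with one plain and one labelled sample.
    payload = (
        "# HELP windows_cs_physical_memory_bytes Total physical memory\n"
        "# TYPE windows_cs_physical_memory_bytes gauge\n"
        "windows_cs_physical_memory_bytes 1.7179869184e+10\n"
        'windows_logical_disk_free_bytes{volume="C:"} 5.0e+10\n'
    )
    print(missing_metrics(payload, ["windows_cs_physical_memory_bytes",
                                    "windows_memory_available_bytes"]))
```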

The test plan for the [future implementation](#future-plans)
will use the existing tests to verify creation of the windows_exporter service
and the metrics Service, Endpoints and Service Monitor objects. WMCO will also
be responsible for testing the Prometheus rules created for Windows. We also
plan to add tests in the console repo that exercise the common recording rules
and ensure that they return results for Windows.

### Graduation Criteria

This enhancement will start as GA.

### Upgrade / Downgrade Strategy

* WMCO is responsible for upgrading the [windows_exporter](https://github.com/prometheus-community/windows_exporter/tags)
binary to the latest release. Downgrades are [not supported](https://github.com/operator-framework/operator-lifecycle-manager/issues/1177)
by OLM.

## Implementation History

v1: Initial Proposal

### Drawbacks

Running windows_exporter as a Windows service instead of as a DaemonSet
pod makes it harder for Prometheus to monitor Windows nodes.
windows_exporter cannot run as a pod on Windows nodes for the support
reasons described in the [WMCO enhancement](https://github.com/openshift/enhancements/blob/master/enhancements/windows-containers/windows-machine-config-operator.md#justification).
