- Install Prometheus Operator in your cluster:
$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm repo update
$ helm install prometheus prometheus-community/kube-prometheus-stack
- Enable metrics by configuring controller.enableMetrics and node.enableMetrics.
- Deploy EBS CSI Driver:
$ helm upgrade --install aws-ebs-csi-driver --namespace kube-system ./charts/aws-ebs-csi-driver --values ./charts/aws-ebs-csi-driver/values.yaml
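As an alternative to editing values.yaml, the two metrics values named above can be passed on the command line. The following is a minimal sketch, assuming the chart accepts them as `--set` overrides. The `HELM` variable and the `deploy_ebs_csi_with_metrics` function name are hypothetical and exist only so the command can be previewed without touching the cluster.

```shell
# Sketch only: HELM and deploy_ebs_csi_with_metrics are illustrative names,
# not part of the chart or the driver.
# Set HELM="echo helm" to print the command instead of running it.
HELM="${HELM:-helm}"

deploy_ebs_csi_with_metrics() {
  # Same command as above, with both metrics endpoints switched on.
  $HELM upgrade --install aws-ebs-csi-driver \
    --namespace kube-system \
    ./charts/aws-ebs-csi-driver \
    --values ./charts/aws-ebs-csi-driver/values.yaml \
    --set controller.enableMetrics=true \
    --set node.enableMetrics=true
}
```

Running `HELM="echo helm"` followed by `deploy_ebs_csi_with_metrics` prints the full command for review. Leaving `HELM` unset runs it against the current cluster context.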
Installing the Prometheus Operator and enabling metrics will deploy a Service object that exposes the EBS CSI Driver's controller metrics port through a ClusterIP. Additionally, a ServiceMonitor object is deployed, which updates the Prometheus scrape configuration and allows scraping metrics from the defined endpoint. For more information, see the manifest metrics.yaml.
The EBS CSI Driver will emit AWS API metrics to the following TCP endpoint: 0.0.0.0:3301/metrics if controller.enableMetrics: true has been configured in the Helm chart.
The metrics will appear in the following format:
aws_ebs_csi_api_request_duration_seconds_bucket{request="AttachVolume",le="0.005"} 0
aws_ebs_csi_api_request_duration_seconds_bucket{request="AttachVolume",le="0.01"} 0
aws_ebs_csi_api_request_duration_seconds_bucket{request="AttachVolume",le="0.025"} 0
aws_ebs_csi_api_request_duration_seconds_bucket{request="AttachVolume",le="0.05"} 0
aws_ebs_csi_api_request_duration_seconds_bucket{request="AttachVolume",le="0.1"} 0
aws_ebs_csi_api_request_duration_seconds_bucket{request="AttachVolume",le="0.25"} 0
aws_ebs_csi_api_request_duration_seconds_bucket{request="AttachVolume",le="0.5"} 0
aws_ebs_csi_api_request_duration_seconds_bucket{request="AttachVolume",le="1"} 1
aws_ebs_csi_api_request_duration_seconds_bucket{request="AttachVolume",le="2.5"} 1
aws_ebs_csi_api_request_duration_seconds_bucket{request="AttachVolume",le="5"} 1
aws_ebs_csi_api_request_duration_seconds_bucket{request="AttachVolume",le="10"} 1
aws_ebs_csi_api_request_duration_seconds_bucket{request="AttachVolume",le="+Inf"} 1
aws_ebs_csi_api_request_duration_seconds_sum{request="AttachVolume"} 0.547694574
aws_ebs_csi_api_request_duration_seconds_count{request="AttachVolume"} 1
...
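Each histogram ends with a `_sum` and a `_count` series, so the mean request latency is their ratio. The following sketch reads metrics text in the format shown above from standard input and prints one mean per `request` label. The function name `avg_api_latency` is hypothetical.

```shell
# Sketch only: avg_api_latency is an illustrative helper, not part of the driver.
# Reads Prometheus text-format metrics on stdin and prints "<request> <mean seconds>".
avg_api_latency() {
  awk '
    /^aws_ebs_csi_api_request_duration_seconds_(sum|count)[{]/ {
      # Pull the value out of the request="..." label.
      if (!match($0, /request="[^"]*"/)) next
      req = substr($0, RSTART + 9, RLENGTH - 10)
      if ($0 ~ /_seconds_sum[{]/) sum[req] = $NF
      else                        count[req] = $NF
    }
    END {
      for (req in count)
        if (count[req] > 0)
          printf "%s %.6f\n", req, sum[req] / count[req]
    }
  '
}
```

For the sample above, a sum of 0.547694574 seconds over 1 request gives a mean of about 0.55 seconds. The function can be fed from a saved scrape or piped from `curl`.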
By default, the driver deploys 2 replicas of the controller pod. However, each CSI sidecar (such as the attacher and resizer) uses a leader election mechanism to designate one leader pod per sidecar.
To manually scrape metrics for specific operations, you must identify and target the leader pod for the relevant sidecar. As an example, to manually scrape metrics for AttachVolume operations (handled by the external attacher), follow these steps:
$ export ebs_csi_attacher_leader=$(kubectl get lease external-attacher-leader-ebs-csi-aws-com -n kube-system -o=jsonpath='{.spec.holderIdentity}')
$ kubectl port-forward $ebs_csi_attacher_leader 3301:3301 -n kube-system &
$ curl 127.0.0.1:3301/metrics
The EBS CSI Driver will emit Container Storage Interface managed devices metrics to the following TCP endpoint: 0.0.0.0:3302/metrics if node.enableMetrics: true has been configured in the Helm chart.
The metrics will appear in the following format:
nvme_collector_duration_seconds_bucket{instance_id="{instance-id}",le="0.001"} 0
nvme_collector_duration_seconds_bucket{instance_id="{instance-id}",le="0.0025"} 0
nvme_collector_duration_seconds_bucket{instance_id="{instance-id}",le="0.005"} 1
nvme_collector_duration_seconds_bucket{instance_id="{instance-id}",le="0.01"} 1
nvme_collector_duration_seconds_bucket{instance_id="{instance-id}",le="0.025"} 1
nvme_collector_duration_seconds_bucket{instance_id="{instance-id}",le="0.05"} 1
nvme_collector_duration_seconds_bucket{instance_id="{instance-id}",le="0.1"} 1
nvme_collector_duration_seconds_bucket{instance_id="{instance-id}",le="0.25"} 1
nvme_collector_duration_seconds_bucket{instance_id="{instance-id}",le="0.5"} 1
nvme_collector_duration_seconds_bucket{instance_id="{instance-id}",le="1"} 1
nvme_collector_duration_seconds_bucket{instance_id="{instance-id}",le="2.5"} 1
nvme_collector_duration_seconds_bucket{instance_id="{instance-id}",le="5"} 1
nvme_collector_duration_seconds_bucket{instance_id="{instance-id}",le="10"} 1
nvme_collector_duration_seconds_bucket{instance_id="{instance-id}",le="+Inf"} 1
...
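Histogram buckets are cumulative: each `le` bucket counts every observation at or below that bound. The following sketch uses this to find the smallest bucket bound that covers a given fraction of observations. It assumes the input holds a single `instance_id` series with buckets in ascending order, as in the sample above. The function name `bucket_for_quantile` is hypothetical.

```shell
# Sketch only: bucket_for_quantile is an illustrative helper.
# $1 is a fraction in (0, 1]; stdin is Prometheus text-format metrics.
# Prints the upper bound (le) of the first bucket covering that fraction.
bucket_for_quantile() {
  awk -v q="$1" '
    /^nvme_collector_duration_seconds_bucket[{]/ {
      if (!match($0, /le="[^"]*"/)) next
      le = substr($0, RSTART + 4, RLENGTH - 5)
      n++; bound[n] = le; cum[n] = $NF
      # The +Inf bucket holds the total number of observations.
      if (le == "+Inf") total = $NF
    }
    END {
      if (total == 0) exit 1
      for (i = 1; i <= n; i++)
        if (cum[i] >= q * total) { print bound[i]; exit 0 }
    }
  '
}
```

With the sample above and a fraction of 0.5, the first bucket whose count reaches the target is `le="0.005"`, meaning the single recorded collection took between 2.5 and 5 milliseconds.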
To manually scrape node metrics, with $ebs_csi_node_pod_name set to the name of an EBS CSI Driver node pod:
$ kubectl port-forward $ebs_csi_node_pod_name 3302:3302 -n kube-system &
$ curl 127.0.0.1:3302/metrics
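The commands above rely on `$ebs_csi_node_pod_name` already holding a pod name. The following sketch shows one way to look it up, assuming the node pods carry the label `app=ebs-csi-node`. That label is an assumption, so verify it with `kubectl get pods -n kube-system --show-labels`. The `KUBECTL` variable and the `ebs_csi_node_pod` function name are hypothetical.

```shell
# Sketch only: KUBECTL and ebs_csi_node_pod are illustrative names.
# Set KUBECTL="echo kubectl" to print the command instead of running it.
KUBECTL="${KUBECTL:-kubectl}"

# Prints the name of the first pod matching the assumed node pod label.
# Each node runs its own pod, so pick a specific one if the node matters.
ebs_csi_node_pod() {
  $KUBECTL get pods -n kube-system \
    -l app=ebs-csi-node \
    -o jsonpath='{.items[0].metadata.name}'
}
```

The result can then be stored with `ebs_csi_node_pod_name=$(ebs_csi_node_pod)` before running the port-forward.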
The EBS CSI Driver emits Kubelet mounted volume metrics for volumes created with the driver.
The following metrics are currently supported:
Metric name | Metric type | Description | Labels |
---|---|---|---|
kubelet_volume_stats_capacity_bytes | Gauge | The capacity in bytes of the volume | namespace=<persistentvolumeclaim-namespace> persistentvolumeclaim=<persistentvolumeclaim-name> |
kubelet_volume_stats_available_bytes | Gauge | The number of available bytes in the volume | namespace=<persistentvolumeclaim-namespace> persistentvolumeclaim=<persistentvolumeclaim-name> |
kubelet_volume_stats_used_bytes | Gauge | The number of used bytes in the volume | namespace=<persistentvolumeclaim-namespace> persistentvolumeclaim=<persistentvolumeclaim-name> |
kubelet_volume_stats_inodes | Gauge | The maximum number of inodes in the volume | namespace=<persistentvolumeclaim-namespace> persistentvolumeclaim=<persistentvolumeclaim-name> |
kubelet_volume_stats_inodes_free | Gauge | The number of free inodes in the volume | namespace=<persistentvolumeclaim-namespace> persistentvolumeclaim=<persistentvolumeclaim-name> |
kubelet_volume_stats_inodes_used | Gauge | The number of used inodes in the volume | namespace=<persistentvolumeclaim-namespace> persistentvolumeclaim=<persistentvolumeclaim-name> |
For more information about the supported metrics, see VolumeUsage within the CSI spec documentation for the NodeGetVolumeStats RPC call.
For more information about metrics in Kubernetes, see the Metrics For Kubernetes System Components documentation.
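The used and capacity gauges from the table can be combined into a fill percentage per claim. The following sketch reads kubelet metrics text from standard input and prints `<namespace>/<claim> <percent used>`. The function name `pvc_usage_percent` is hypothetical, and the sketch assumes each series carries only the two labels listed in the table.

```shell
# Sketch only: pvc_usage_percent is an illustrative helper.
# Reads Prometheus text-format metrics on stdin.
pvc_usage_percent() {
  awk '
    # Return the value of label `name` from a metric line, or "" if absent.
    function label(line, name,    re) {
      re = "[{,]" name "=\"[^\"]*\""
      if (!match(line, re)) return ""
      # Skip the delimiter, the name and the opening quote; drop the closing quote.
      return substr(line, RSTART + length(name) + 3, RLENGTH - length(name) - 4)
    }
    /^kubelet_volume_stats_(used|capacity)_bytes[{]/ {
      key = label($0, "namespace") "/" label($0, "persistentvolumeclaim")
      if ($0 ~ /^kubelet_volume_stats_used_bytes/) used[key] = $NF
      else                                         cap[key] = $NF
    }
    END {
      for (key in cap)
        if (cap[key] > 0 && (key in used))
          printf "%s %.1f\n", key, 100 * used[key] / cap[key]
    }
  '
}
```

A claim with 1 GiB used out of 4 GiB of capacity would be reported as 25.0.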
The csi_operations_seconds metric reports a latency histogram of kubelet-initiated CSI gRPC calls by gRPC status code.
To manually scrape Kubelet metrics:
$ kubectl proxy
$ kubectl get --raw /api/v1/nodes/<insert_node_name>/proxy/metrics
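The raw kubelet output covers every component on the node, so it helps to narrow it to the CSI latency histogram for this driver. The following sketch assumes the series carry a `driver_name` label and that the driver registers as `ebs.csi.aws.com`. Check both against your own scrape output. The function name `csi_ops_for_driver` is hypothetical.

```shell
# Sketch only: csi_ops_for_driver is an illustrative helper.
# Reads kubelet metrics on stdin and keeps csi_operations_seconds series
# for one driver. $1 is the driver name (defaults to ebs.csi.aws.com).
csi_ops_for_driver() {
  driver="${1:-ebs.csi.aws.com}"
  grep '^csi_operations_seconds' | grep -F "driver_name=\"${driver}\""
}
```

For example, piping the output of the `kubectl get --raw` command above into `csi_ops_for_driver` leaves only the bucket, sum, and count series for the EBS CSI Driver.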