istio · istio-testing · Feb 17, 2021 · Feb 10, 2021 · Feb 10, 2021 · Feb 10, 2021
@@ -556,6 +556,7 @@ prepending
 prepends
 prober
 programmatically
+PromQL
 proto
 protobuf
 protoc

@@ -0,0 +1,18 @@
+---
+title: How can I manage short-lived metrics?
+weight: 20
+---
+
+Short-lived metrics can hamper the performance of Prometheus because they are often a large source of label cardinality. Cardinality is a measure of the number of unique values for a label. To manage the impact of short-lived metrics on Prometheus, first identify the high-cardinality metrics and labels. Prometheus provides cardinality information at its `/status` page. Additional information can be retrieved [via PromQL](https://www.robustperception.io/which-are-my-biggest-metrics).
+There are several ways to reduce the cardinality of Istio metrics:
+
+* Disable host header fallback.
+  The `destination_service` label is one potential source of high cardinality.
+  The value of `destination_service` defaults to the host header if the Istio proxy cannot determine the destination service from other request metadata.
+  If clients use a variety of host headers, this can result in a large number of values for `destination_service`.
+  In this case, follow the [metric customization](https://istio.io/latest/docs/tasks/observability/metrics/customize-metrics/) guide to disable host header fallback mesh-wide.
+  To disable host header fallback for a particular workload or namespace, you need to copy the stats `EnvoyFilter` configuration, update it to have host header fallback disabled, and apply it with a more specific selector.
+  [This issue](https://github.com/istio/istio/issues/25963#issuecomment-666037411) has more detail on how to achieve this.
+* Drop unnecessary labels from collection. If the label with high cardinality is not needed, you can drop it from metric collection via [metric customization](/docs/tasks/observability/metrics/customize-metrics/) using `tags_to_remove`.
+* Normalize label values, either through federation or classification.
+  If the information provided by the label is desired, you can use [Prometheus federation](/docs/ops/best-practices/observability/#using-prometheus-for-production-scale-monitoring) or [request classification](/docs/tasks/observability/metrics/classify-metrics/) to normalize the label.
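
Reviewer sketch (not part of the patch): the first two options in the list above could look roughly like the `IstioOperator` overlay below, which disables host header fallback mesh-wide and drops one label from one metric. The field layout (`configOverride`, `disable_host_header_fallback`, `tags_to_remove`) is an assumption based on the metric customization guide. The metric `requests_total` and the label `request_protocol` are placeholders chosen only for illustration. Check the guide for your Istio version before applying anything like this.

```yaml
# Hedged sketch: assumes the in-proxy (telemetry v2) Prometheus stats
# configuration is exposed through IstioOperator values as shown here.
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  values:
    telemetry:
      v2:
        prometheus:
          configOverride:
            inboundSidecar:
              # Do not fall back to the host header for destination_service.
              disable_host_header_fallback: true
              metrics:
                # Placeholder metric and label; substitute the
                # high-cardinality label you identified.
                - name: requests_total
                  tags_to_remove:
                    - request_protocol
            outboundSidecar:
              disable_host_header_fallback: true
            gateway:
              disable_host_header_fallback: true
```

Scoping the same settings to a single workload or namespace would instead require a copied stats `EnvoyFilter` with a narrower selector, as the first bullet describes.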
@@ -41,8 +41,9 @@ v2 which are listed below:
 * **No metric expiration for short-lived metrics**
   Mixer-based telemetry supported metric expiration whereby metrics which were
   not generated for a configurable amount of time were de-registered for
-  collection by Prometheus. This is useful in scenarios where short-lived jobs
-  surface telemetry only for a short amount of time, and de-registering
+  collection by Prometheus. This is useful in scenarios where short-lived metrics
+  only surface for a short amount of time, and de-registering
   the metrics prevents reporting of metrics which would no longer change in the
   future, thereby reducing network traffic and storage in Prometheus.
   This expiration mechanism is not available in in-proxy telemetry.
+  For a workaround, see [How can I manage short-lived metrics?](/faq/metrics-and-logs/#metric-expiry).