diff --git a/processor/metricstransformprocessor/README.md b/processor/metricstransformprocessor/README.md
index ea6cd5e87b40..c24a6a582b75 100644
--- a/processor/metricstransformprocessor/README.md
+++ b/processor/metricstransformprocessor/README.md
@@ -1,94 +1,134 @@
 # Metrics Transform Processor
+
 Supported pipeline types: metrics
-- This ONLY supports renames/aggregations **within individual metrics**. It does not do any aggregation across batches, so it is not suitable for aggregating metrics from multiple sources (e.g. multiple nodes or clients). At this point, it is only for aggregating metrics from a single source that groups its metrics for a particular time period into a single batch (e.g. host metrics from the VM the collector is running on).
-- Rename Collisions will result in a no operation on the metrics data
-  - e.g. If want to rename a metric or label to `new_name` while there is already a metric or label called `new_name`, this operation will not take any effect. There will also be an error logged
 ## Description
-The metrics transform processor can be used to rename metrics, labels, or label values. It can also be used to perform aggregations on metrics across labels or label values.
-
-## Capabilities
-- Rename metrics (e.g. rename `cpu/usage` to `cpu/usage_time`)
-- Rename labels (e.g. rename `cpu` to `core`)
-- Rename label values (e.g. rename `done` to `complete`)
-- Aggregate across label sets (e.g. only want the label `usage`, but don’t care about the labels `core`, and `cpu`)
-  - Aggregation_type: sum, mean, max
-- Aggregate across label values (e.g. want `memory{slab}`, but don’t care about `memory{slab_reclaimable}` & `memory{slab_unreclaimable}`)
-  - Aggregation_type: sum, mean, max
-- Add label to an existing metric
-- When adding or updating a label value, specify `{{version}}` to include the application version number
+
+The metrics transform processor can be used to rename metrics, and add, rename or delete label keys and values. It can also be used to perform aggregations on metrics across labels or label values. The complete list of supported operations that can be applied to one or more metrics is provided in the below table.
+
+:information_source: This processor only supports renames/aggregations **within a batch of metrics**. It does not do any aggregation across batches, so it is not suitable for aggregating metrics from multiple sources (e.g. multiple nodes or clients).
+
+| Operation | Example |
+| --- | --- |
+| Rename metrics | Rename `system.cpu.usage` to `system.cpu.usage_time` |
+| Add labels | For metric `system.cpu.usage`, add new label `identifier` with value `1` to all data points |
+| Rename label keys | For metric `system.cpu.usage`, rename label `state` to `cpu_state` |
+| Rename label values | For metric `system.cpu.usage`, label `state`, rename label value `idle` to `-` |
+| Delete label values | For metric `system.cpu.usage`, delete all data points where label `state` has value `idle` |
+| Toggle the data type of scalar metrics between `int` and `double` | For metric `system.cpu.usage`, change from `int` data points to `double` data points |
+| Aggregate across label sets by sum, mean, min, or max | For metric `system.cpu.usage`, retain the label `state`, but aggregate away the label `cpu` |
+| Aggregate across label values by sum, mean, min, or max | For metric `system.cpu.usage`, label `state`, calculate `used = sum{user + system}` |
+
+In addition to the above, when adding or updating a label value, specify `{{version}}` to include the application version number
 ## Configuration
+
+Configuration is specified through a list of transformations and operations. Transformations and operations will be applied to all metrics in order so that later transformations or operations may reference the result of previous transformations or operations.
+
 ```yaml
 # transforms is a list of transformations with each element transforming a metric selected by metric name
 transforms:
-  # name is used to match with the metric to operate on. This implementation doesn’t utilize the filtermetric’s MatchProperties struct because it doesn’t match well with what I need at this phase. All is needed for this processor at this stage is a single name string that can be used to match with selected metrics. The list of metric names and the match type in the filtermetric’s MatchProperties struct are unnecessary. Also, based on the issue about improving filtering configuration, it seems like this struct is subject to be slightly modified.
-  - metric_name:
-
-  # action specifies if the operations are performed on the current copy of the metric or on a newly created metric that will be inserted
+  # include specifies the metric name used to determine which metric(s) to operate on
+  - include:
+    # match_type specifies whether the include name should be used as a strict match or regexp match, default = strict
+    match_type: {strict, regexp}
+    # action specifies if the operations are performed on the metric in place, or on an inserted clone
     action: {update, insert}
-
-  # new_name is used to rename metrics (e.g. rename cpu/usage to cpu/usage_time) if action is insert, new_name is required
+    # new_name specifies the updated name of the metric; if action is insert, new_name is required
     new_name:
-
-  # operations contain a list of operations that will be performed on the selected metrics. Each operation block is a key-value pair, where the key can be any arbitrary string set by the users for readability, and the value is a struct with fields required for operations. The action field is important for the processor to identify exactly which operation to perform
+    # operations contain a list of operations that will be performed on the selected metrics
     operations:
-
-    # update_label action can be used to update the name of a label or the values of this label (e.g. rename label `cpu` to `core`)
-      - action: update_label
-        label:
-        new_label:
-        value_actions:
-          - value:
-            new_value:
-
-    # aggregate_labels action aggregates metrics across labels (e.g. only want the label `usage`, but don’t care about the labels `core`, and `cpu`)
-      - action: aggregate_labels
-        # label_set contains a list of labels that will remain after the aggregation. The excluded labels will be aggregated by the way specified by aggregation_type.
-        label_set: [labels...]
-        aggregation_type: {sum, mean, max}
-
-    # aggregate_label_values action aggregates labels across label values (e.g. want memory{slab}, but don’t care about memory{slab_reclaimable} & memory{slab_unreclaimable})
-      - action: aggregate_label_values
-        label: