Remove the histogram metrics by default #253

Merged on Jun 16, 2022 (2 commits)
4 changes: 2 additions & 2 deletions collector/consumer/exporter/otelexporter/consume.go
@@ -88,9 +88,9 @@ func (e *OtelExporter) exportMetric(result *adapter.AdaptedResult) {
} else if ok && metric.DataType() == model.IntMetricType {
measurements = append(measurements, e.instrumentFactory.getInstrument(metric.Name, metricKind).Measurement(metric.GetInt().Value))
} else if metric.DataType() == model.HistogramMetricType {
- e.telemetry.Logger.Error("Failed to exporter Metric: can not use otlp-exporter to export histogram Data", zap.String("MetricName", metric.Name))
+ e.telemetry.Logger.Warn("Failed to exporter Metric: can not use otlp-exporter to export histogram Data", zap.String("MetricName", metric.Name))
} else {
- e.telemetry.Logger.Warn("Undefined metricKind for this Metric", zap.String("MetricName", metric.Name), zap.String("MetricType", reflect.TypeOf(metric).String()))
+ e.telemetry.Logger.Debug("Undefined metricKind for this Metric", zap.String("MetricName", metric.Name), zap.String("MetricType", reflect.TypeOf(metric).String()))
Review comment (Collaborator):

Use the expression below instead of calling `Logger.Debug()` directly, see #142:

if ce := e.telemetry.Logger.Check(zapcore.DebugLevel, "Undefined metricKind for this Metric"); ce != nil {
	ce.Write(zap.String("MetricName", metric.Name), zap.String("MetricType", reflect.TypeOf(metric).String()))
}

}
}
if len(measurements) > 0 {
2 changes: 0 additions & 2 deletions collector/docker/kindling-collector-config.yml
@@ -122,12 +122,10 @@ exporters:
metric_aggregation_map:
kindling_entity_request_total: counter
kindling_entity_request_duration_nanoseconds_total: counter
- kindling_entity_request_average_duration_nanoseconds: histogram
kindling_entity_request_send_bytes_total: counter
kindling_entity_request_receive_bytes_total: counter
kindling_topology_request_total: counter
kindling_topology_request_duration_nanoseconds_total: counter
- kindling_topology_request_average_duration_nanoseconds: histogram
kindling_topology_request_request_bytes_total: counter
kindling_topology_request_response_bytes_total: counter
kindling_trace_request_duration_nanoseconds: gauge
2 changes: 0 additions & 2 deletions deploy/agent/kindling-collector-config.yml
@@ -122,12 +122,10 @@ exporters:
metric_aggregation_map:
kindling_entity_request_total: counter
kindling_entity_request_duration_nanoseconds_total: counter
- kindling_entity_request_average_duration_nanoseconds: histogram
kindling_entity_request_send_bytes_total: counter
kindling_entity_request_receive_bytes_total: counter
kindling_topology_request_total: counter
kindling_topology_request_duration_nanoseconds_total: counter
- kindling_topology_request_average_duration_nanoseconds: histogram
kindling_topology_request_request_bytes_total: counter
kindling_topology_request_response_bytes_total: counter
kindling_trace_request_duration_nanoseconds: gauge
29 changes: 23 additions & 6 deletions docs/prometheus_metrics.md
@@ -8,9 +8,9 @@ Service metrics are generated from the server-side events, which are used to sho
| `kindling_entity_request_duration_nanoseconds_total` | Counter | Total duration of requests |
| `kindling_entity_request_send_bytes_total` | Counter | Total size of payload sent |
| `kindling_entity_request_receive_bytes_total` | Counter | Total size of payload received |
- | `kindling_entity_request_average_duration_nanoseconds_count` | Histogram | Count of average duration of requests |
- | `kindling_entity_request_average_duration_nanoseconds_sum` | Histogram | Sum of average duration of requests |
- | `kindling_entity_request_average_duration_nanoseconds_bucket` | Histogram | Histogram buckets of average duration of requests |
+ | `kindling_entity_request_average_duration_nanoseconds_count` | Histogram | Count of average duration of requests <br> **Disabled by default. See Note 3 for how to enable it.** |
+ | `kindling_entity_request_average_duration_nanoseconds_sum` | Histogram | Sum of average duration of requests <br> **Disabled by default. See Note 3 for how to enable it.** |
+ | `kindling_entity_request_average_duration_nanoseconds_bucket` | Histogram | Histogram buckets of average duration of requests <br> **Disabled by default. See Note 3 for how to enable it.** |
### Labels List
| **Label Name** | **Example** | **Notes** |
| --- | --- | --- |
@@ -70,6 +70,15 @@ Service metrics are generated from the server-side events, which are used to sho

- For other cases, the `request_content` and `response_content` are both empty.

**Note 3**: The histogram metric `kindling_entity_request_average_duration_nanoseconds_*` is disabled by default as it could be high-cardinality. If this metric is needed, please add a new line to the `exporters.otelexporter.metric_aggregation_map` section of the configuration file.
```yaml
exporters:
  otelexporter:
    metric_aggregation_map:
      # add the following line
      kindling_entity_request_average_duration_nanoseconds: histogram
```

## Topology Metrics

Topology metrics are typically generated from the client-side events, which are used to show the service dependencies map, so the metrics are called "topology". Some timeseries may be generated from the server-side events, which contain a non-empty label `dst_container_id`. These timeseries are generated only when the source IP is not the pod's IP inside the Kubernetes cluster, which are useful when there is no agent installed on the client-side.
@@ -81,9 +90,9 @@ Topology metrics are typically generated from the client-side events, which are
| `kindling_topology_request_duration_nanoseconds_total` | Counter | Total duration of requests |
| `kindling_topology_request_request_bytes_total` | Counter | Total size of payload sent |
| `kindling_topology_request_response_bytes_total` | Counter | Total size of payload received |
- | `kindling_topology_request_average_duration_nanoseconds_count` | Histogram | Count of average duration of requests |
- | `kindling_topology_request_average_duration_nanoseconds_sum` | Histogram | Sum of average duration of requests |
- | `kindling_topology_request_average_duration_nanoseconds_bucket` | Histogram | Histogram buckets of average duration of requests |
+ | `kindling_topology_request_average_duration_nanoseconds_count` | Histogram | Count of average duration of requests <br> **Disabled by default. See Note 3 for how to enable it.** |
+ | `kindling_topology_request_average_duration_nanoseconds_sum` | Histogram | Sum of average duration of requests <br> **Disabled by default. See Note 3 for how to enable it.** |
+ | `kindling_topology_request_average_duration_nanoseconds_bucket` | Histogram | Histogram buckets of average duration of requests <br> **Disabled by default. See Note 3 for how to enable it.** |

### Labels List
| **Label Name** | **Example** | **Notes** |
@@ -125,6 +134,14 @@ These two terms are composed of two parts.
- **DUBBO**: 'Error Code' of Dubbo request.
- **others**: empty temporarily

**Note 3**: The histogram metric `kindling_topology_request_average_duration_nanoseconds_*` is disabled by default as it could be high-cardinality. If this metric is needed, please add a new line to the `exporters.otelexporter.metric_aggregation_map` section of the configuration file.
```yaml
exporters:
  otelexporter:
    metric_aggregation_map:
      # add the following line
      kindling_topology_request_average_duration_nanoseconds: histogram
```
## Trace As Metric
We made some rules for considering whether a request is abnormal. For the abnormal request, the detail request information is considered as useful for debugging or profiling. We name this kind of data "trace". It is not a good practice to store such data in Prometheus as some labels are high-cardinality, so we picked up some labels from the original ones to generate a new kind of metric, which is called "Trace As Metric". The following table shows what labels this metric contains.
