High cardinality http_server_* metrics from otelcol.receiver.zipkin
#4764
Labels
enhancement
frozen-due-to-age
What's wrong?
We've recently migrated to agent flow mode, and as part of that migration switched to a setup with a single dedicated agent for ingesting traces. We ingest traces via both OTLP gRPC and Zipkin (and sometimes Jaeger). We noticed that the job scraping metrics from this dedicated traces agent was becoming heavy, and found that this was due to some very high-cardinality metrics exported by the Zipkin receiver.
Example:
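Schematically (this is a reconstruction rather than a verbatim copy of our output, and the exact metric name and label set may differ), the exposition ends up with one `http_server_duration` histogram series per unique client IP and ephemeral source port:

```
http_server_duration_bucket{http_client_ip="<client-ip>",http_method="POST",net_sock_peer_port="<ephemeral-port-1>",le="..."} ...
http_server_duration_bucket{http_client_ip="<client-ip>",http_method="POST",net_sock_peer_port="<ephemeral-port-2>",le="..."} ...
http_server_duration_bucket{http_client_ip="<other-client-ip>",http_method="POST",net_sock_peer_port="<ephemeral-port-3>",le="..."} ...
```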
Notice the `http_client_ip` and `net_sock_peer_port` labels, which quickly explode in the number of distinct values. This seems to be due to this upstream issue: open-telemetry/opentelemetry-go-contrib#3765.

Even though we've configured our scrape job to drop all `http_server_*` metrics, just parsing the `/metrics` endpoint gradually becomes unmanageable as the exposition keeps growing. Just now I tested with curl and found it to be 232 MB on our traces agent. I'm opening this issue in the hope that a workaround can be implemented in grafana-agent until the issue has been fixed upstream.
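For context, the drop we have in place is roughly equivalent to the following flow-mode components (component names, addresses and the remote_write target below are illustrative, not our real configuration):

```river
prometheus.scrape "traces_agent" {
  // Scrape the dedicated traces agent's own /metrics endpoint
  // (placeholder address).
  targets    = [{"__address__" = "traces-agent.example.svc:12345"}]
  forward_to = [prometheus.relabel.drop_http_server.receiver]
}

prometheus.relabel "drop_http_server" {
  forward_to = [prometheus.remote_write.default.receiver]

  // Drop the high-cardinality otelhttp server metrics before remote_write.
  rule {
    source_labels = ["__name__"]
    regex         = "http_server_.*"
    action        = "drop"
  }
}

prometheus.remote_write "default" {
  endpoint {
    // Placeholder remote_write target.
    url = "https://prometheus.example/api/v1/write"
  }
}
```

This keeps the series out of remote_write, but the scraper still has to fetch and parse the full exposition every interval, which is where the cost is.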
Steps to reproduce
Run the agent in flow mode with an `otelcol.receiver.zipkin` component and start ingesting Zipkin traces, then watch the `http_server_*` metrics exposed by the agent explode in cardinality.

System information
Linux x86; GKE 1.24
Software version
Grafana Agent v0.35.0
Configuration
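Not our exact configuration, but an illustrative, trimmed-down sketch of the relevant trace pipeline (the listen endpoint and exporter target are placeholders):

```river
otelcol.receiver.otlp "default" {
  grpc { }

  output {
    traces = [otelcol.processor.batch.default.input]
  }
}

otelcol.receiver.zipkin "default" {
  // Default Zipkin listen address; this receiver's HTTP server
  // instrumentation is what produces the http_server_* metrics.
  endpoint = "0.0.0.0:9411"

  output {
    traces = [otelcol.processor.batch.default.input]
  }
}

otelcol.processor.batch "default" {
  output {
    traces = [otelcol.exporter.otlp.tempo.input]
  }
}

otelcol.exporter.otlp "tempo" {
  client {
    // Placeholder backend address.
    endpoint = "tempo.example.svc:4317"
  }
}
```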
Logs
No response