
[connector/count] Break down counts by custom dimensions #19369

Closed
hannahchan opened this issue Mar 7, 2023 · 11 comments
@hannahchan

Component(s)

connector/count

Is your feature request related to a problem? Please describe.

The current implementation of the count connector doesn't allow us to break down a count metric by our own custom dimensions. Breaking down the count metrics would allow us to understand more about our telemetry.

I want to be able to specify a dimension such as a resource attribute named environment and then have the count connector break out, for example, trace.span.count by the values of the environment resource attribute. Below is an example of the time series I would expect to be generated when I do this.

trace.span.count{service.name="myService" environment="prod"}
trace.span.count{service.name="myService" environment="staging"}
trace.span.count{service.name="myService" environment="dev"}

Some custom dimensions we want to specify include:

  • Environment
  • Receiver Type / Protocol / Format
  • Instrumentation Library

We would also like to see byte counts for logs, metrics and traces broken down by these dimensions.

Describe the solution you'd like

I would like to be able to specify custom dimensions for the count connector in my collector's config and then have the count connector break down the counts according to those dimensions. I would also like to be able to enable byte count metrics in my configuration.
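To make the idea concrete, here is a sketch of how this might look wired into a collector config. The pipeline wiring (a connector acting as an exporter for one pipeline and a receiver for another) follows the collector's existing connector pattern; the `dimensions` key is purely hypothetical and stands in for whatever configuration shape is ultimately chosen:

```yaml
# Hypothetical sketch only: "dimensions" is the requested (not yet
# existing) configuration for breaking counts out by attributes.
connectors:
  count:
    spans:
      trace.span.count:
        dimensions:
          - environment

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [count]
    metrics:
      receivers: [count]
      exporters: [prometheus]
```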

Describe alternatives you've considered

We've considered building a processor that does this and emits the metrics to the collector's /metrics endpoint. Because you can order processors and have multiple instances of the same processor in a pipeline, you can count the logs, metrics and traces at different stages of a pipeline.

Additional context

We have an established multi-region SaaS application that we are uplifting to use OpenTelemetry automatic instrumentation.

We know from experimentation in a test environment that the automatic instrumentation emits more telemetry than our previous instrumentation. However, this doesn't help us estimate or understand upfront the volume of telemetry that our application will emit in production without incurring the full cost of transporting and storing that telemetry.

Our current thinking is to deploy and configure the OpenTelemetry collector in our production environment as a measuring device that counts the logs, metrics and traces our application emits and then either discards that telemetry or passes it through to the next stage.

@hannahchan hannahchan added enhancement New feature or request needs triage New item requiring triage labels Mar 7, 2023
@github-actions
Contributor

github-actions bot commented Mar 7, 2023

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@djaglowski
Member

This makes sense to me to an extent, but it's important that we keep in mind the extent to which data is already being dimensioned. Specifically, counts are already separated based on resource and scope. Therefore, some of the examples you've provided would not require any additional code.

service.name is typically a resource attribute, so we should already expect to see counts that are separated by this dimension. Similarly, Instrumentation library should be a scope-level attribute, so again I would expect counts to be already dimensioned accordingly.

We are not automatically dimensioning by lower-level attributes (specifically span, span event, data point, and log record attributes), so it makes sense to me that we could support these. What I am wondering is whether we need to support anything other than attributes.

Here's how I'm imagining the config - does this look right to you?

count:
  spans:
    my.span.count:
      conditions:
        - 'resource.attributes["environment"] != nil'
      attributes: # span attributes to use as dimensions
        - environment
  spanevents:
    my.spanevent.count:
      conditions:
        - 'resource.attributes["environment"] != nil'
      attributes: # span event attributes to use as dimensions
        - environment
  logs:
    my.log.count:
      conditions:
        - 'resource.attributes["environment"] != nil'
      attributes: # log record attributes to use as dimensions
        - environment
  datapoints:
    my.datapoint.count:
      conditions:
        - 'resource.attributes["environment"] != nil'
      attributes: # data point attributes to use as dimensions
        - environment

  # metrics do not have attributes
  metrics:
    my.metric.count:

@djaglowski
Member

djaglowski commented Mar 7, 2023

What is the behavior when an attribute that we are counting by does not exist? I can think of two options:

  1. Don't increment any count
  2. An autogenerated default value e.g. counts["_none"]++

@atoulme atoulme removed the needs triage New item requiring triage label Mar 7, 2023
@hannahchan
Author

An autogenerated default value e.g. counts["_none"]++

My default thought around this is to emit the count without that dimension.

@hannahchan
Author

What I am wondering is if we need to support anything other than attributes.

In my mind, it's only whatever is accessible in attributes. Can't think of a use case of anything otherwise. Your proposed configuration structure looks fine to me.

@djaglowski
Member

An autogenerated default value e.g. counts["_none"]++

My default thought around this is to emit the count without that dimension.

I'm not sure this is in the spirit of our metrics data model. As I understand it, a data point should represent a "total" value, except where dimensions are specified. If we are emitting a count that omits an attribute, this implies that it is the total count observed regardless of the attribute value.

There is more conversation about this here. Although the spec issue is not resolved, it seems to me there is consensus that a single emitter of telemetry should not emit metrics with multiple sets of attributes.

{ "name": "my.log.count", "value": 2, "attributes": { "direction": "up" } }
{ "name": "my.log.count", "value": 3, "attributes": { "direction": "down" } }

// equivalent to above
{ "name": "my.log.count", "value": 5, "attributes": { } }

// not equivalent to above, but generated from logs without "direction" attribute
{ "name": "my.log.count", "value": 1, "attributes": { } }
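A small illustrative sketch of the collision (the values are hypothetical, taken from the example above):

```python
from collections import Counter

# Counts emitted with the "direction" dimension.
dimensioned = Counter({"up": 2, "down": 3})

# Dropping the dimension and summing gives the total over all values.
total = sum(dimensioned.values())
print(total)  # 5

# A count built only from logs that *lacked* "direction" is a different
# quantity, yet both would be emitted with empty attributes and would
# therefore collide as the same time series.
missing_only = 1
print(total == missing_only)  # False
```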

@djaglowski
Member

What is the behavior when an attribute that we are counting by does not exist? I can think of two options:

  1. Don't increment any count
  2. An autogenerated default value e.g. counts["_none"]++

Having thought about this more, I think the correct behavior is neither of these options. I suggest that we should provide an optional setting in the attributes configuration:

logs:
  my.log.count:
    conditions:
      - 'resource.attributes["environment"] != nil'
    attributes:
      - key: environment
        default: other # logs observed that did not contain the "environment" attribute

If the default is not specified, we should ignore data that does not contain the attribute. On the other hand, when it is specified, we can count and assign to an unambiguous value.

It could be argued that conditions can be used to filter the data appropriately to obtain a count for "other", e.g. generate a metric for resource.attributes["environment"] == nil. However, this would require that the count be associated with a different metric. It also requires a more complex configuration compared to the above:

logs:
  my.log.count:
    conditions:
      - 'resource.attributes["environment"] != nil'
    attributes:
      - environment
  my.log.missing_environment:
    conditions:
      - 'resource.attributes["environment"] == nil'

@djaglowski
Member

I'm going to assign this to myself and will try to have a PR soon.

@djaglowski djaglowski self-assigned this Mar 8, 2023
@github-actions
Contributor

github-actions bot commented May 8, 2023

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label May 8, 2023
@hannahchan
Author

This is still relevant to me. I've been on leave and haven't kept up to date with development on this feature, but I can see some changes in the code. I'll try to find time to explore this again for myself.

@djaglowski
Member

I believe this is effectively resolved by #19432. @hannahchan, thanks for taking an interest in this. Please let me know if you still need additional enhancements.
