Service Bus Aggregation missing one interval #1563

mtarantino · 2021-03-17T11:49:19Z

Report

Using the latest Docker image, I'm scraping Service Bus metrics and counting messages in a Queue/Topic. The aggregated values are not what I expect.
Details:
I'm using metricDefaults.aggregation.interval of 00:05:00 and metricDefaults.scraping.schedule of 0 * * ? * * .

Expected Behavior

My queue holds a constant 4 messages. I expected the aggregation to give:

Average: 4
Minimum: 4
Total: 20

Actual Behavior

My queue holds a constant 4 messages. I see the following aggregation:

Average: 3.2
Min: 0
Total: 16
Even after running the scraper for 10 minutes, the results don't change. It's as if the aggregation were missing one value (e.g. [4-4-4-4-0] instead of [4-4-4-4-4]). When I use Maximum, the value is correct.
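
A quick sanity check of that hypothesis, as a minimal sketch. It assumes the 5-minute window is made of five one-minute samples and that one of them is reported as 0; neither assumption is confirmed by the API output at this point.

```python
# Hypothetical samples for one 5-minute window (one value per minute).
expected = [4, 4, 4, 4, 4]
suspected = [4, 4, 4, 4, 0]  # assumption: one sample comes back as 0


def aggregate(samples):
    """Compute the four aggregation types over one window."""
    return {
        "Average": sum(samples) / len(samples),
        "Minimum": min(samples),
        "Maximum": max(samples),
        "Total": sum(samples),
    }


print(aggregate(expected))
print(aggregate(suspected))
```

The second window reproduces the observed Average, Minimum and Total, and it also leaves Maximum untouched, which would explain why only Maximum looks correct.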

Steps to Reproduce the Problem

Publish 4 messages in an Azure Service Bus queue (without consuming them)
Configure Promitor to scrape metrics from the queue
Run the docker image

docker run -p 8080:8080 \
   --env-file az-mon-auth.creds \
   --volume /mypath/metrics-declaration.yaml:/config/metrics-declaration.yaml \
   --volume /mypath/runtime.yaml:/config/runtime.yaml \
   ghcr.io/tomkerkhove/promitor-agent-scraper:2.1.1

Component

Scraper

Version

2.1.1

Configuration

runtime.yaml

---
server:
  httpPort: 8080 # Optional. Default: 80
metricSinks:
  prometheusScrapingEndpoint:
    metricUnavailableValue: NaN # Optional. Default: NaN
    enableMetricTimestamps: false # Optional. Default: true
    baseUriPath: /metrics # Optional. Default: /metrics
metricsConfiguration:
  absolutePath: /config/metrics-declaration.yaml # Optional. Default: /config/metrics-declaration.yaml
telemetry:
  containerLogs:
    isEnabled: true # Optional. Default: true
    verbosity: trace # Optional. Default: N/A
  defaultVerbosity: Trace # Optional. Default: error
azureMonitor:
  logging:
    informationLevel: Basic # Optional. Default: Basic
    isEnabled: true # Optional. Default: false

metrics-declaration.yaml

metricDefaults:
  aggregation:
    interval: 00:05:00
  scraping:
    schedule: "0 * * * * *"
metrics:
  - name: azure_service_bus_messages
    description: "Count of messages in a Queue/Topic."
    resourceType: ServiceBusNamespace
    labels:
      env: dev
    azureMetricConfiguration:
      metricName: Messages
      aggregation:
        type: Average
    resources:
      - namespace: myNamespace
        topicName: myTopic

Logs

Unfortunately, despite enabling azureMonitor logging in my runtime configuration, I was not able to get the Azure Monitor API call logged (not sure why...)
For Average:

[11:40:00 INF] Scraping Azure Monitor - 03/17/2021 11:40:00 +00:00
[11:40:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[11:40:00 INF] Adding Prometheus sink to expose on /metrics
[11:40:00 DBG] Failed to locate the development https certificate at 'null'.
[11:40:00 INF] Now listening on: http://[::]:8080
[11:40:00 INF] Application started. Press Ctrl+C to shut down.
[11:40:00 INF] Hosting environment: Production
[11:40:00 INF] Content root path: /app
[11:40:02 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
[11:41:00 INF] Scraping Azure Monitor - 03/17/2021 11:41:00 +00:00
[11:41:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[11:41:01 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
[11:42:00 INF] Scraping Azure Monitor - 03/17/2021 11:42:00 +00:00
[11:42:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[11:42:00 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
[11:43:00 INF] Scraping Azure Monitor - 03/17/2021 11:43:00 +00:00
[11:43:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[11:43:00 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
[11:44:00 INF] Scraping Azure Monitor - 03/17/2021 11:44:00 +00:00
[11:44:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[11:44:00 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
[11:45:00 INF] Scraping Azure Monitor - 03/17/2021 11:45:00 +00:00
[11:45:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[11:45:00 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
[11:46:00 INF] Scraping Azure Monitor - 03/17/2021 11:46:00 +00:00
[11:46:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[11:46:00 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
[11:47:00 INF] Scraping Azure Monitor - 03/17/2021 11:47:00 +00:00
[11:47:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[11:47:01 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
[11:48:00 INF] Scraping Azure Monitor - 03/17/2021 11:48:00 +00:00
[11:48:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[11:48:01 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00

For Total

[11:16:05 INF] Scraping Azure Monitor - 03/17/2021 11:16:05 +00:00
[11:16:05 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[11:16:05 INF] Adding Prometheus sink to expose on /metrics
[11:16:05 DBG] Failed to locate the development https certificate at 'null'.
[11:16:05 INF] Now listening on: http://[::]:8080
[11:16:05 INF] Application started. Press Ctrl+C to shut down.
[11:16:05 INF] Hosting environment: Production
[11:16:05 INF] Content root path: /app
[11:16:06 INF] Found value 16 for metric azure_service_bus_messages with aggregation interval 00:05:00
[11:17:00 INF] Scraping Azure Monitor - 03/17/2021 11:17:00 +00:00
[11:17:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[11:17:00 INF] Found value 16 for metric azure_service_bus_messages with aggregation interval 00:05:00
[11:18:00 INF] Scraping Azure Monitor - 03/17/2021 11:18:00 +00:00
[11:18:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[11:18:00 INF] Found value 16 for metric azure_service_bus_messages with aggregation interval 00:05:00
[11:19:00 INF] Scraping Azure Monitor - 03/17/2021 11:19:00 +00:00
[11:19:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[11:19:00 INF] Found value 16 for metric azure_service_bus_messages with aggregation interval 00:05:00
[11:20:00 INF] Scraping Azure Monitor - 03/17/2021 11:20:00 +00:00
[11:20:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[11:20:00 INF] Found value 16 for metric azure_service_bus_messages with aggregation interval 00:05:00
[11:21:00 INF] Scraping Azure Monitor - 03/17/2021 11:21:00 +00:00
[11:21:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[11:21:00 INF] Found value 16 for metric azure_service_bus_messages with aggregation interval 00:05:00
[11:22:00 INF] Scraping Azure Monitor - 03/17/2021 11:22:00 +00:00
[11:22:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[11:22:00 INF] Found value 16 for metric azure_service_bus_messages with aggregation interval 00:05:00
[11:23:00 INF] Scraping Azure Monitor - 03/17/2021 11:23:00 +00:00
[11:23:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[11:23:00 INF] Found value 16 for metric azure_service_bus_messages with aggregation interval 00:05:00
[11:24:00 INF] Scraping Azure Monitor - 03/17/2021 11:24:00 +00:00
[11:24:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[11:24:00 INF] Found value 16 for metric azure_service_bus_messages with aggregation interval 00:05:00
[11:25:00 INF] Scraping Azure Monitor - 03/17/2021 11:25:00 +00:00
[11:25:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[11:25:00 INF] Found value 16 for metric azure_service_bus_messages with aggregation interval 00:05:00
[11:26:00 INF] Scraping Azure Monitor - 03/17/2021 11:26:00 +00:00
[11:26:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[11:26:01 INF] Found value 16 for metric azure_service_bus_messages with aggregation interval 00:05:00
[11:27:00 INF] Scraping Azure Monitor - 03/17/2021 11:27:00 +00:00
[11:27:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[11:27:01 INF] Found value 16 for metric azure_service_bus_messages with aggregation interval 00:05:00
[11:28:00 INF] Scraping Azure Monitor - 03/17/2021 11:28:00 +00:00
[11:28:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[11:28:00 INF] Found value 16 for metric azure_service_bus_messages with aggregation interval 00:05:00
[11:29:00 INF] Scraping Azure Monitor - 03/17/2021 11:29:00 +00:00
[11:29:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[11:29:00 INF] Found value 16 for metric azure_service_bus_messages with aggregation interval 00:05:00

Platform

Other

Contact Details

tarantino.dev@gmail.com

tomkerkhove · 2021-03-17T11:51:11Z

Thanks for reporting. Is it possible to add screenshot of the metrics explorer in the Azure portal for these metrics please?

mtarantino · 2021-03-17T13:39:59Z

@tomkerkhove, attached a screenshot of the metrics explorer on Azure portal:

Another example with the following metricDefaults:

metricDefaults:
  aggregation:
    interval: 00:05:00
  scraping:
    schedule: "0 * * ? * *"

Logs:

[13:28:57 INF] Scraping Azure Monitor - 03/17/2021 13:28:57 +00:00
[13:28:57 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[13:28:57 INF] Adding Prometheus sink to expose on /metrics
[13:28:57 DBG] Failed to locate the development https certificate at 'null'.
[13:28:57 INF] Now listening on: http://[::]:8080
[13:28:57 INF] Application started. Press Ctrl+C to shut down.
[13:28:57 INF] Hosting environment: Production
[13:28:57 INF] Content root path: /app
[13:29:00 INF] Found value 4 for metric azure_service_bus_messages with aggregation interval 00:05:00
[13:29:00 INF] Scraping Azure Monitor - 03/17/2021 13:29:00 +00:00
[13:29:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[13:29:01 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
[13:30:00 INF] Scraping Azure Monitor - 03/17/2021 13:30:00 +00:00
[13:30:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[13:30:01 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
[13:31:00 INF] Scraping Azure Monitor - 03/17/2021 13:31:00 +00:00
[13:31:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[13:31:01 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
[13:32:00 INF] Scraping Azure Monitor - 03/17/2021 13:32:00 +00:00
[13:32:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[13:32:01 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
[13:33:00 INF] Scraping Azure Monitor - 03/17/2021 13:33:00 +00:00
[13:33:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[13:33:01 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
[13:34:00 INF] Scraping Azure Monitor - 03/17/2021 13:34:00 +00:00
[13:34:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[13:34:01 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
[13:35:00 INF] Scraping Azure Monitor - 03/17/2021 13:35:00 +00:00
[13:35:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[13:35:01 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00

As you can see:

[13:28:57 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
...
[13:29:00 INF] Found value 4 for metric azure_service_bus_messages with aggregation interval 00:05:00
[13:29:00 INF] Scraping Azure Monitor - 03/17/2021 13:29:00 +00:00
[13:29:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[13:29:01 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00

The first scrape returns the correct value, 4, and every scrape after that returns 3.2.

I thought it might be because the two API calls are too close together, but when I use a fixed time in the schedule (so that more than 3 minutes pass between the first scrape, which I guess is part of some kind of initialisation, and the next one), the result is the same.

tomkerkhove · 2021-03-17T13:55:52Z

Hm odd, is the metric explorer also using 00:05:00 as aggregation?

Frankly, we can only scrape the API with the same configuration but there is no data manipulation that we are doing.

mtarantino · 2021-03-17T14:15:49Z

> Hm odd, is the metric explorer also using 00:05:00 as aggregation?

This is the metrics configuration:

metricDefaults:
  aggregation:
    interval: 00:05:00
  scraping:
    schedule: "0 * * ? * *"
metrics:
  - name: azure_service_bus_messages
    description: "Count of messages in a Queue/Topic."
    resourceType: ServiceBusNamespace
    labels:
      env: dev
    azureMetricConfiguration:
      metricName: Messages
      aggregation:
        type: Average
    resources:
      - namespace: myNamespace
        topicName: myTopic

I would need to dig into the API call. I saw in runtime/scraper#azure-monitor that it's possible to log the Azure Monitor API calls, but I didn't manage to make it work with my configuration. Is there something wrong with the following config?

server:
  httpPort: 8080 # Optional. Default: 80
metricSinks:
  prometheusScrapingEndpoint:
    metricUnavailableValue: NaN # Optional. Default: NaN
    enableMetricTimestamps: false # Optional. Default: true
    baseUriPath: /metrics # Optional. Default: /metrics
metricsConfiguration:
  absolutePath: /config/metrics-declaration.yaml # Optional. Default: /config/metrics-declaration.yaml
telemetry:
  containerLogs:
    isEnabled: true # Optional. Default: true
    verbosity: trace # Optional. Default: N/A
  defaultVerbosity: Trace # Optional. Default: error
azureMonitor:
  logging:
    informationLevel: BodyAndHeaders # Optional. Default: Basic
    isEnabled: true # Optional. Default: false

tomkerkhove · 2021-03-17T19:36:41Z

I was more referring to the aggregation in the metric explorer of the Azure Portal, can you check please?

mtarantino · 2021-03-18T07:54:12Z

> I was more referring to the aggregation in the metric explorer of the Azure Portal, can you check please?

@tomkerkhove : Ah, yes the aggregation in the Azure Portal Metric explorer (called "time granularity" on the interface) is set to 5 minutes.

I've done some additional tests and let Promitor run overnight with this metrics configuration:

metricDefaults:
  aggregation:
    interval: 00:05:00
  scraping:
    schedule: "0 * * * * *"
metrics:
  - name: azure_service_bus_messages
    description: "Count of messages in a Queue/Topic."
    resourceType: ServiceBusNamespace
    labels:
      env: dev
    azureMetricConfiguration:
      metricName: Messages
      aggregation:
        type: Total
    resources:
      - namespace: myNamespace
        topicName: myTopic

This is the Metric Explorer screenshot:

These are the logs. We can see that the total oscillates between 20 and 25 even though the number of messages is fixed at 5 (the correct total would be 25):

[07:20:00 INF] Scraping Azure Monitor - 03/18/2021 07:20:00 +00:00
[07:20:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:20:00 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:21:00 INF] Scraping Azure Monitor - 03/18/2021 07:21:00 +00:00
[07:21:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:21:00 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:22:00 INF] Scraping Azure Monitor - 03/18/2021 07:22:00 +00:00
[07:22:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:22:01 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:23:00 INF] Scraping Azure Monitor - 03/18/2021 07:23:00 +00:00
[07:23:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:23:01 INF] Found value 25 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:24:00 INF] Scraping Azure Monitor - 03/18/2021 07:24:00 +00:00
[07:24:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:24:01 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:25:00 INF] Scraping Azure Monitor - 03/18/2021 07:25:00 +00:00
[07:25:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:25:00 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:26:00 INF] Scraping Azure Monitor - 03/18/2021 07:26:00 +00:00
[07:26:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:26:00 INF] Found value 25 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:27:00 INF] Scraping Azure Monitor - 03/18/2021 07:27:00 +00:00
[07:27:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:27:01 INF] Found value 25 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:28:00 INF] Scraping Azure Monitor - 03/18/2021 07:28:00 +00:00
[07:28:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:28:01 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:29:00 INF] Scraping Azure Monitor - 03/18/2021 07:29:00 +00:00
[07:29:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:29:01 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:30:00 INF] Scraping Azure Monitor - 03/18/2021 07:30:00 +00:00
[07:30:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:30:01 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:31:00 INF] Scraping Azure Monitor - 03/18/2021 07:31:00 +00:00
[07:31:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:31:01 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:32:00 INF] Scraping Azure Monitor - 03/18/2021 07:32:00 +00:00
[07:32:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:32:01 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:33:00 INF] Scraping Azure Monitor - 03/18/2021 07:33:00 +00:00
[07:33:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:33:01 INF] Found value 25 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:34:00 INF] Scraping Azure Monitor - 03/18/2021 07:34:00 +00:00
[07:34:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:34:01 INF] Found value 25 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:35:00 INF] Scraping Azure Monitor - 03/18/2021 07:35:00 +00:00
[07:35:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:35:01 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:36:00 INF] Scraping Azure Monitor - 03/18/2021 07:36:00 +00:00
[07:36:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:36:01 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:37:00 INF] Scraping Azure Monitor - 03/18/2021 07:37:00 +00:00
[07:37:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:37:00 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:38:00 INF] Scraping Azure Monitor - 03/18/2021 07:38:00 +00:00
[07:38:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:38:00 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:39:00 INF] Scraping Azure Monitor - 03/18/2021 07:39:00 +00:00
[07:39:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:39:00 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:40:00 INF] Scraping Azure Monitor - 03/18/2021 07:40:00 +00:00
[07:40:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:40:00 INF] Found value 25 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:41:00 INF] Scraping Azure Monitor - 03/18/2021 07:41:00 +00:00
[07:41:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:41:01 INF] Found value 25 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:42:00 INF] Scraping Azure Monitor - 03/18/2021 07:42:00 +00:00
[07:42:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:42:01 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:43:00 INF] Scraping Azure Monitor - 03/18/2021 07:43:00 +00:00
[07:43:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:43:01 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00

tomkerkhove · 2021-03-18T07:57:27Z

That is true, but the metrics explorer is using average, while you are scraping totals?

Next to that, can you compare it without filter on the subscription? Maybe that's causing an issue that I need to fix.

mtarantino · 2021-03-18T08:15:54Z

> That is true, but the metrics explorer is using average, while you are scraping totals?

@tomkerkhove: The issue is the same with Average, but I wanted to try different aggregation types (in the explorer only Average is available, whereas you can retrieve Maximum, Total and Minimum from Promitor).

As I mentioned earlier, the results are also wrong with the other aggregations, e.g. with 4 messages in the queue:

Average: 3.2 instead of 4
Min: 0 instead of 4
Total: 16 instead of 20
Max: 4 which is the right value

> Next to that, can you compare it without filter on the subscription? Maybe that's causing an issue that I need to fix.

I guess you mean something like removing the filter on the topic/queue, like this:

metrics:
  - name: azure_service_bus_messages
    description: "Count of messages in a Queue/Topic."
    resourceType: ServiceBusNamespace
    labels:
      env: dev
    azureMetricConfiguration:
      metricName: Messages
      dimension:
        name: EntityName
      aggregation:
        type: Average
    resources:
      - namespace: myNamespace

The issue is the same (I have 5 messages in the topic):

[08:04:35 INF] Scraping Azure Monitor - 03/18/2021 08:04:35 +00:00
[08:04:35 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[08:04:35 INF] Adding Prometheus sink to expose on /metrics
[08:04:35 DBG] Failed to locate the development https certificate at 'null'.
[08:04:35 INF] Now listening on: http://[::]:8080
[08:04:35 INF] Application started. Press Ctrl+C to shut down.
[08:04:35 INF] Hosting environment: Production
[08:04:35 INF] Content root path: /app
...
[08:04:39 INF] Found value 5 for metric azure_service_bus_messages with dimension myTopic as part of entity_name dimension with aggregation interval 00:05:00
[08:05:00 INF] Scraping Azure Monitor - 03/18/2021 08:05:00 +00:00
[08:05:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
...
[08:05:03 INF] Found value 4 for metric azure_service_bus_messages with dimension myTopic as part of entity_name dimension with aggregation interval 00:05:00
[08:06:00 INF] Scraping Azure Monitor - 03/18/2021 08:06:00 +00:00
[08:06:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
...
[08:06:02 INF] Found value 4 for metric azure_service_bus_messages with dimension myTopic as part of entity_name dimension with aggregation interval 00:05:00
[08:07:00 INF] Scraping Azure Monitor - 03/18/2021 08:07:00 +00:00
[08:07:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
...
[08:07:02 INF] Found value 4 for metric azure_service_bus_messages with dimension myTopic as part of entity_name dimension with aggregation interval 00:05:00
[08:08:00 INF] Scraping Azure Monitor - 03/18/2021 08:08:00 +00:00
[08:08:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
...
[08:08:02 INF] Found value 4 for metric azure_service_bus_messages with dimension myTopic as part of entity_name dimension with aggregation interval 00:05:00
[08:09:00 INF] Scraping Azure Monitor - 03/18/2021 08:09:00 +00:00
[08:09:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
...
[08:09:01 INF] Found value 5 for metric azure_service_bus_messages with dimension myTopic as part of entity_name dimension with aggregation interval 00:05:00

It would be interesting to see the API calls made to Azure Monitor. This should be the configuration that prints the calls to stdout, right?

server:
  httpPort: 8080 # Optional. Default: 80
metricSinks:
  prometheusScrapingEndpoint:
    metricUnavailableValue: NaN # Optional. Default: NaN
    enableMetricTimestamps: false # Optional. Default: true
    baseUriPath: /metrics # Optional. Default: /metrics
metricsConfiguration:
  absolutePath: /config/metrics-declaration.yaml # Optional. Default: /config/metrics-declaration.yaml
telemetry:
  containerLogs:
    isEnabled: true # Optional. Default: true
    verbosity: trace # Optional. Default: N/A
  defaultVerbosity: Trace # Optional. Default: error
azureMonitor:
  logging:
    informationLevel: BodyAndHeaders # Optional. Default: Basic
    isEnabled: true # Optional. Default: false

mtarantino · 2021-03-18T08:22:01Z

I also noticed that for the other queues and topics in my namespace, all values oscillate randomly, at the same time, between 2 states:

state1: wrong value (if we assume the queue/topic sizes are constant, it looks like value = avg - avg/interval)
state2: correct value

I didn't find any pattern in the oscillation; it looks random:

state1
state2
state1
state1
state1
state2
state1
state2
state1
state1

tomkerkhove · 2021-03-18T08:34:42Z

I'll have to take a look, sorry!

Just FYI, you should only rely on Average as that's the only supported aggregation: https://docs.microsoft.com/en-us/azure/azure-monitor/essentials/metrics-supported#microsoftservicebusnamespaces

mtarantino · 2021-03-18T12:58:08Z

> Just FYI, you should only rely on Average as that's the only supported aggregation: https://docs.microsoft.com/en-us/azure/azure-monitor/essentials/metrics-supported#microsoftservicebusnamespaces

Sure, this is what I initially did, but as the values were not what I expected, I tested the other aggregations to understand what could be wrong.

> I'll have to take a look, sorry!

Thanks :)

mtarantino · 2021-03-18T16:17:46Z

I did a bit more investigation at the Azure Monitor API level and ran some tests using https://docs.microsoft.com/en-us/rest/api/monitor/metrics/list#code-try-0.

I'm assuming the Aggregation Interval time you're using is within the following format

Those are the parameters I used:

resourceUri: /subscriptions/XXXX/resourceGroups/XXXX/providers/Microsoft.ServiceBus/namespaces/myNamespace/
api-version: 2018-01-01
aggregation: average
metricnames: Messages
timespan: 2021-03-17T11:05:00.000Z/2021-03-17T11:10:00.000Z
$filter: EntityName eq 'scc-lt-deadletter'

With these parameters I received 5 results, and all values are correct.
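
For anyone who wants to repeat the call outside the "Try it" page, here is a rough sketch of how the parameters above could be assembled into a request URL. The route is the documented Metrics - List route on management.azure.com; the helper name `build_metrics_url` is made up for illustration, the XXXX placeholders are left as elided, and actually sending the request additionally needs a bearer token in the Authorization header, which is not shown.

```python
from urllib.parse import quote, urlencode

# Values copied from the parameter list above; XXXX stays elided.
resource_uri = (
    "/subscriptions/XXXX/resourceGroups/XXXX"
    "/providers/Microsoft.ServiceBus/namespaces/myNamespace/"
)

params = {
    "api-version": "2018-01-01",
    "aggregation": "average",
    "metricnames": "Messages",
    "timespan": "2021-03-17T11:05:00.000Z/2021-03-17T11:10:00.000Z",
    "$filter": "EntityName eq 'scc-lt-deadletter'",
}


def build_metrics_url(resource_uri, params):
    """Hypothetical helper: build the Metrics - List URL for a resource."""
    base = "https://management.azure.com" + resource_uri.rstrip("/")
    query = urlencode(params, quote_via=quote)  # percent-encode spaces, quotes, '$'
    return base + "/providers/Microsoft.Insights/metrics?" + query


print(build_metrics_url(resource_uri, params))
```

Changing only the `timespan` entry is enough to repeat the test with a shorter window.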

If I change the timespan to 2021-03-17T11:05:00.000Z/2021-03-17T11:09:59.000Z, I receive only 4 results.

Thus I was wondering whether the issue could be related to AzureMonitorClient#GetClosestAggregationInterval(), which might sometimes provide a wrong closest aggregation time, so that instead of 5 results there are only 4, or something like that.
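
The 5-versus-4 observation is what you would get if only one-minute buckets that fit entirely inside the timespan were returned. That is an assumption that matches the two tests above, not documented behaviour. A small sketch with the two timespans:

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%dT%H:%M:%S.%fZ"


def complete_buckets(start, end, granularity=timedelta(minutes=1)):
    """Count granularity-sized buckets that fit entirely inside [start, end)."""
    return (end - start) // granularity


start = datetime.strptime("2021-03-17T11:05:00.000Z", FMT)
full_end = datetime.strptime("2021-03-17T11:10:00.000Z", FMT)
short_end = datetime.strptime("2021-03-17T11:09:59.000Z", FMT)

print(complete_buckets(start, full_end))   # 5
print(complete_buckets(start, short_end))  # 4
```

Under this model, being one second short of the full window is enough to lose the last bucket.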

I also performed a test using the following configuration:

metricDefaults:
  aggregation:
    interval: 00:01:00
  scraping:
    schedule: "0 * * * * *"

The results also show unexpected values:

[16:12:45 INF] Content root path: /app
[16:12:49 INF] Found value 8 for metric azure_service_bus_messages with aggregation interval 00:01:00
[16:13:00 INF] Scraping Azure Monitor - 03/18/2021 16:13:00 +00:00
[16:13:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[16:13:01 INF] Found value 0 for metric azure_service_bus_messages with aggregation interval 00:01:00
[16:14:00 INF] Scraping Azure Monitor - 03/18/2021 16:14:00 +00:00
[16:14:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[16:14:01 INF] Found value 0 for metric azure_service_bus_messages with aggregation interval 00:01:00
[16:15:00 INF] Scraping Azure Monitor - 03/18/2021 16:15:00 +00:00
[16:15:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[16:15:02 INF] Found value 0 for metric azure_service_bus_messages with aggregation interval 00:01:00
[16:16:00 INF] Scraping Azure Monitor - 03/18/2021 16:16:00 +00:00
[16:16:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[16:16:03 INF] Found value 0 for metric azure_service_bus_messages with aggregation interval 00:01:00
[16:17:00 INF] Scraping Azure Monitor - 03/18/2021 16:17:00 +00:00
[16:17:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[16:17:02 INF] Found value 0 for metric azure_service_bus_messages with aggregation interval 00:01:00

tomkerkhove · 2021-03-18T17:39:07Z

Thanks for digging into this!

> Thus I was wondering if the issue could be related to the AzureMonitorClient#GetClosestAggregationInterval() which is sometime providing a wrong Closest Aggregation time and thus instead of having 5 results there are only 4 or something like that.

Can you talk a bit more about that please?

You're suggesting that the issue might be in https://github.com/tomkerkhove/promitor/blob/master/src/Promitor.Integrations.AzureMonitor/AzureMonitorClient.cs#L122?

mtarantino · 2021-03-18T20:45:38Z

I noticed the following behaviour with an aggregation interval of 5 min and a scraping schedule of 1 min, for a queue with a constant number of messages (e.g. 4 in our case):

average = avg-avg/interval (e.g. 4-4/5 = 3.2)
total = total-total/interval (e.g. 20-20/5 = 16)
minimum = 0
maximum = 4
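
The same pattern can be written as a small function. This is a sketch of the hypothesis only: it assumes the window holds `buckets` one-minute samples of a constant value and that exactly one of them is reported as 0. The function name is made up for illustration.

```python
def with_one_zero_bucket(value, buckets):
    """Aggregations over `buckets` samples of `value` when one sample is 0."""
    good = buckets - 1  # samples that carry the real value
    return {
        "average": value * good / buckets,  # equals value - value/buckets
        "total": value * good,              # equals total - total/buckets
        "minimum": 0,
        "maximum": value if good > 0 else 0,
    }


print(with_one_zero_bucket(4, 5))  # 4 messages, 5-minute interval
print(with_one_zero_bucket(5, 5))  # 5 messages, 5-minute interval
print(with_one_zero_bucket(8, 1))  # 1-minute interval: the only bucket is the zero one
```

With 5 messages this gives a total of 20 instead of 25 and an average of 4 instead of 5, which matches the "wrong" state in the earlier logs. With a single bucket it gives 0, which would be consistent with the run that used a 00:01:00 aggregation interval.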

When querying a metric with an aggregation interval of 5 min, I expect Promitor to send an AggregationInterval timespan such as 2021-03-17T11:05:00.000Z/2021-03-17T11:10:00.000Z and to receive the following data:

{
      "id": "/subscriptions/XXXX/resourceGroups/xxx/providers/Microsoft.ServiceBus/namespaces/xxx/providers/Microsoft.Insights/metrics/Messages",
      "type": "Microsoft.Insights/metrics",
      "name": {
        "value": "Messages",
        "localizedValue": "Count of messages in a Queue/Topic."
      },
      "displayDescription": "Count of messages in a Queue/Topic.",
      "unit": "Count",
      "timeseries": [
        {
          "metadatavalues": [
            {
              "name": {
                "value": "entityname",
                "localizedValue": "entityname"
              },
              "value": "myTopic"
            }
          ],
          "data": [
            {
              "timeStamp": "2021-03-17T11:05:00Z",
              "average": 4
            },
            {
              "timeStamp": "2021-03-17T11:06:00Z",
              "average": 4
            },
            {
              "timeStamp": "2021-03-17T11:07:00Z",
              "average": 4
            },
            {
              "timeStamp": "2021-03-17T11:08:00Z",
              "average": 4
            },
            {
              "timeStamp": "2021-03-17T11:09:00Z",
              "average": 4
            }
          ]
        }
      ],
      "errorCode": "Success"
    }
  ],

To summarise: this gives an array T1 = [4, 4, 4, 4, 4], which is then aggregated according to the aggregation type (e.g. sum(T1), avg(T1), min(T1), max(T1)).

My assumption is that Promitor is somehow sending the timespan AggregationInterval 2021-03-17T11:05:00.000Z/2021-03-17T11:09:59.000Z (i.e. a time span shorter than 5 minutes), so Azure Monitor returns only 4 values. The array then becomes something like T2 = [4, 4, 4, 4, null]; maybe the null is then replaced by 0 (I don't know exactly), and all the computations use this array to calculate the results.

To confirm this behaviour, it would be good to see the API calls in the console using the configuration below, but this is not working properly :( and I'm not a .NET developer, so I can't easily run and debug it on my machine.

azureMonitor:
  logging:
    informationLevel: BodyAndHeaders # Optional. Default: Basic
    isEnabled: true # Optional. Default: false

mtarantino · 2021-03-19T09:06:50Z

@tomkerkhove I've set up the project to debug the application in depth, and I think I found the issue. The problem lies in the last interval returned by the Azure Monitor call. Azure Monitor computes the last minute on a best-effort basis. As you can see in the diagram below, the last minute drops to 0 because Azure Monitor has not yet had time to compute the last interval:

Searching for the issue online, I found this page, azure-monitor/metrics-troubleshoot#chart-shows-unexpected-drop-in-values, which states:

Depending on the service, the latency of processing metrics can be within a couple minutes range. For charts showing a recent time range with a 1- or 5- minute granularity, a drop of the value over the last few minutes becomes more noticeable:

And this is by design.

There are two possible workarounds:

Query with the record time shifted back by one minute:

private async Task<IMetric> GetRelevantMetric(string metricName, AggregationType metricAggregation, TimeSpan metricInterval,
    string metricFilter, string metricDimension, IMetricDefinition metricDefinition, DateTime recordDateTime)
{
    // Proposed change: shift the record time back by one minute so that
    // Azure Monitor has had more time to compute the last 1-minute bucket.
    recordDateTime = recordDateTime.AddMinutes(-1);
    var metricQuery = CreateMetricsQuery(metricAggregation, metricInterval, metricFilter, metricDimension, metricDefinition, recordDateTime);

Adapt the scraping.schedule configuration so that the scrape starts as late as possible within the minute, for example:

  scraping:
    schedule: "55 * * * * *"

For example, if the scrape starts at 11:00:55, Azure Monitor has had 55 seconds to compute the data for 10:59:00-11:00:00.

Neither solution is guaranteed to work for all metrics: as stated in azure-monitor/metrics-troubleshoot#chart-shows-unexpected-drop-in-values, the latency of processing metrics can be within a couple of minutes, and here we are only giving Azure Monitor about 1 minute or less.
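
To illustrate the first workaround, the sketch below computes a query window whose end is pulled back from the current time by a configurable lag. It is written in Python for illustration only and is not how Promitor builds its queries; `query_window`, `to_timespan` and the `lag` parameter are hypothetical names, and the timespan format simply mirrors the start/end strings shown earlier in this thread.

```python
from datetime import datetime, timedelta, timezone

def query_window(now, interval, lag=timedelta(minutes=1)):
    """Return (start, end) of an aggregation window that ends `lag` before
    the start of the current minute."""
    # Truncate to the start of the current minute, then step back by the lag.
    end = now.replace(second=0, microsecond=0) - lag
    start = end - interval
    return start, end

def to_timespan(start, end):
    """Format the window as 'start/end' using UTC ISO 8601 timestamps."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return f"{start.strftime(fmt)}/{end.strftime(fmt)}"

# Assumed scrape time: two seconds past 11:10 UTC.
now = datetime(2021, 3, 17, 11, 10, 2, tzinfo=timezone.utc)
start, end = query_window(now, interval=timedelta(minutes=5))
timespan = to_timespan(start, end)
print(timespan)  # 2021-03-17T11:04:00Z/2021-03-17T11:09:00Z
```

A larger `lag` trades freshness for completeness: since the documented processing latency can be a couple of minutes, a lag of one minute may still include a bucket that Azure Monitor has not finished computing.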

tomkerkhove · 2021-03-22T19:33:29Z

I'm still looking into this but have to finish #444 first.

In the meantime, I've checked and the Azure Monitor insights work but require trace verbosity. I've verified this, and the docs state it as well: https://promitor.io/configuration/v2.x/runtime/scraper#azure-monitor

tomkerkhove · 2021-04-24T08:16:08Z

This relates to #1290 as well. I would stick with an aggregation of 5 minutes because of Azure Monitor, but I'll think about this issue and whether you could configure Promitor to ignore a time series (but when should we do that, and what if the value really is 0?).

tomkerkhove · 2022-09-03T12:35:51Z

Are you still seeing this?

mtarantino added the bug Something isn't working label Mar 17, 2021

mtarantino assigned tomkerkhove Mar 17, 2021

tomkerkhove added this to the Scraper - v2.3.0 milestone Apr 24, 2021

tomkerkhove mentioned this issue Apr 24, 2021

Promitor exports metric with name Size to Prometheus that does not match the one in Azure Monitor #1603

Closed

tomkerkhove modified the milestones: Scraper - v2.3.0, Scraper - v2.4.0 May 7, 2021

tomkerkhove modified the milestones: Scraper - v2.4.0, Scraper - v2.5.0 Jul 15, 2021

tomkerkhove modified the milestones: Scraper - v2.5.0, Scraper - v2.6.0 Aug 24, 2021

tomkerkhove modified the milestones: Scraper - v2.6.0, Scraper - v2.7.0 Oct 12, 2021

tomkerkhove added the prio:P0 All issues that are top priority label Oct 12, 2021

tomkerkhove modified the milestones: Scraper - v2.7.0, Scraper - v2.8.0 Dec 17, 2021
