Service Bus Aggregation missing one interval #1563
Comments
Thanks for reporting. Could you add a screenshot of the metrics explorer in the Azure portal for these metrics, please? |
@tomkerkhove, attached is a screenshot of the metrics explorer in the Azure portal:
Another example with the following
Logs:
[13:28:57 INF] Scraping Azure Monitor - 03/17/2021 13:28:57 +00:00
[13:28:57 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[13:28:57 INF] Adding Prometheus sink to expose on /metrics
[13:28:57 DBG] Failed to locate the development https certificate at 'null'.
[13:28:57 INF] Now listening on: http://[::]:8080
[13:28:57 INF] Application started. Press Ctrl+C to shut down.
[13:28:57 INF] Hosting environment: Production
[13:28:57 INF] Content root path: /app
[13:29:00 INF] Found value 4 for metric azure_service_bus_messages with aggregation interval 00:05:00
[13:29:00 INF] Scraping Azure Monitor - 03/17/2021 13:29:00 +00:00
[13:29:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[13:29:01 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
[13:30:00 INF] Scraping Azure Monitor - 03/17/2021 13:30:00 +00:00
[13:30:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[13:30:01 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
[13:31:00 INF] Scraping Azure Monitor - 03/17/2021 13:31:00 +00:00
[13:31:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[13:31:01 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
[13:32:00 INF] Scraping Azure Monitor - 03/17/2021 13:32:00 +00:00
[13:32:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[13:32:01 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
[13:33:00 INF] Scraping Azure Monitor - 03/17/2021 13:33:00 +00:00
[13:33:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[13:33:01 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
[13:34:00 INF] Scraping Azure Monitor - 03/17/2021 13:34:00 +00:00
[13:34:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[13:34:01 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
[13:35:00 INF] Scraping Azure Monitor - 03/17/2021 13:35:00 +00:00
[13:35:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[13:35:01 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
As you can see:
The first scrape returns the proper value.
I thought it might be because the two API calls are too close together, but when I tried a fixed time in the schedule (so that the gap between the first scrape, which I guess is part of some kind of initialisation, and the next one is greater than 3 minutes), the result was the same. |
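A quick arithmetic check of the numbers in these logs, as a sketch only. It assumes the 5-minute window is built from five one-minute samples and that exactly one of them is reported as 0, which is the reading the original report suggests. The aggregate helper is hypothetical and is not part of Promitor.

```python
def aggregate(samples):
    """Compute the four aggregations discussed in this thread for one window."""
    return {
        "Average": sum(samples) / len(samples),
        "Minimum": min(samples),
        "Maximum": max(samples),
        "Total": sum(samples),
    }

# Assumption: five one-minute samples per 5-minute aggregation interval.
expected = aggregate([4, 4, 4, 4, 4])  # queue constantly holds 4 messages
observed = aggregate([4, 4, 4, 4, 0])  # same window with one sample read as 0

print("expected:", expected)
print("observed:", observed)
```

Under that assumption, Average, Minimum and Total come out as 3.2, 0 and 16, while Maximum stays at 4, which fits the values described in the report.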
Hm odd, is the metric explorer also using 00:05:00 as aggregation? Frankly, we can only scrape the API with the same configuration but there is no data manipulation that we are doing. |
This is the metrics configuration:
I would need to dig into the API call. I saw in runtime/scraper#azure-monitor that it's possible to log the Azure Monitor API call, but I didn't manage to make it work with my configuration. Is there something wrong with the following config:
|
I was referring more to the aggregation in the metric explorer of the Azure Portal, can you check please? |
@tomkerkhove : Ah, yes, the aggregation in the Azure Portal Metric Explorer (called "time granularity" in the interface) is set to 5 minutes. I've done some additional tests and let Promitor run overnight with the metrics configuration
This is the Metric Explorer screenshot:
And these are the logs, we can see that the
|
That is true, but the metrics explorer is using average, while you are scraping totals? Next to that, can you compare it without filter on the subscription? Maybe that's causing an issue that I need to fix. |
@tomkerkhove: The issue is the same with Average, but I wanted to try different aggregation types (in the explorer only Average is available, whereas you can retrieve Maximum, Total and Minimum from Promitor). As I mentioned earlier, when trying the other aggregations the results are also wrong, e.g. with 4 messages in the queue:
I guess you mean something like removing the filter on the topic/queue like
The issue is the same (I have 5 messages in the topic).
It would be interesting to see the API call made to Azure Monitor. This should be the configuration that prints the calls to stdout, right:
|
I also noticed that, for the other queue and topic in my namespace, all values oscillate at the same time between 2 states:
I didn't find any pattern in the oscillation; it looks random:
|
I'll have to take a look, sorry! Just FYI, you should only rely on Average as that's the only supported aggregation: https://docs.microsoft.com/en-us/azure/azure-monitor/essentials/metrics-supported#microsoftservicebusnamespaces |
Sure, this is what I initially did, but as the values were not what I expected, I tested the other aggregations to understand what could be wrong.
Thanks :) |
I did a bit more investigation at the Azure Monitor API level and ran some tests using https://docs.microsoft.com/en-us/rest/api/monitor/metrics/list#code-try-0. I'm assuming the aggregation interval you're using is in the following format
Those are the parameters I used:
If I change the
Thus I was wondering whether the issue could be related to AzureMonitorClient#GetClosestAggregationInterval(), which might sometimes return a wrong closest aggregation interval, so that instead of 5 results there are only 4, or something like that.
I also performed a test using the following configuration:
The results are also showing unexpected metrics:
|
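For readers who want to repeat this kind of check against the Azure Monitor metrics list endpoint, here is a minimal sketch of how the query parameters could be assembled. It is not the set of parameters used in the comment above, and it is not how Promitor builds its request. The helper names, the metric name Messages and the 5-minute window are assumptions made for illustration; timespan, interval, metricnames and aggregation are the query parameter names of that endpoint.

```python
from datetime import datetime, timedelta, timezone

def to_iso8601_duration(hhmmss: str) -> str:
    """Convert an hh:mm:ss interval such as 00:05:00 to an ISO 8601 duration (PT5M)."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    duration = "PT"
    if hours:
        duration += f"{hours}H"
    if minutes:
        duration += f"{minutes}M"
    if seconds or duration == "PT":
        duration += f"{seconds}S"
    return duration

def build_query(end: datetime, window: timedelta, interval: str,
                metric: str, aggregation: str) -> dict:
    """Assemble query parameters for a metrics list call (hypothetical helper)."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    start = end - window
    return {
        "timespan": f"{start.strftime(fmt)}/{end.strftime(fmt)}",
        "interval": to_iso8601_duration(interval),
        "metricnames": metric,
        "aggregation": aggregation,
    }

# Assumed values: a 5-minute window ending at one of the scrape times in the logs.
query = build_query(
    end=datetime(2021, 3, 17, 13, 35, tzinfo=timezone.utc),
    window=timedelta(minutes=5),
    interval="00:05:00",
    metric="Messages",
    aggregation="Average",
)
print(query)
```

Printing the parameters this way makes it easier to compare the window and interval of a manual query with what a scraper is expected to send.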
Thanks for digging into this!
Can you talk a bit more about that please? You're suggesting that the issue might be in https://github.com/tomkerkhove/promitor/blob/master/src/Promitor.Integrations.AzureMonitor/AzureMonitorClient.cs#L122? |
I notice that for an aggregation interval of
When querying for a metric with an aggregation interval of
To summarise an array
My assumption is that somehow it's sending the timespan AggregationInterval
That's why, to confirm the behaviour, it would be good to be able to see the API call in the console using
but this is not working properly :( and I'm not a .NET developer, so I can't easily run it on my machine and debug it.
|
@tomkerkhove I've set up the project to debug the application in depth, and I think I found the issue. The problem lies in the last interval returned by the Azure Monitor call. Azure Monitor computes the last minute on a best-effort basis. As you can see in the diagram below, the last minute drops to 0 because Azure Monitor has not had time to compute the last interval:
Searching online, I found this page, azure-monitor/metrics-troubleshoot#chart-shows-unexpected-drop-in-values, which states:
And this is by design. As a solution, there are two ways:
e.g. if the scrape starts at
Both solutions might not work for all metrics, as it is written in azure-monitor/metrics-troubleshoot#chart-shows-unexpected-drop-in-values that |
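One possible way to act on this finding, sketched as an assumption rather than as either of the two options listed above: leave the most recent, possibly incomplete sample out of the aggregation. The function name and the drop_last parameter are hypothetical and do not exist in Promitor.

```python
def average_ignoring_latest(samples, drop_last=1):
    """Average a window while skipping the newest drop_last samples.

    Assumption: the newest sample may still be incomplete on the
    Azure Monitor side and can therefore be reported as 0.
    """
    if drop_last < 0:
        raise ValueError("drop_last must not be negative")
    complete = samples[: len(samples) - drop_last]
    if not complete:
        raise ValueError("no samples left after dropping the latest ones")
    return sum(complete) / len(complete)

window = [4, 4, 4, 4, 0]  # last minute not yet computed, reported as 0
print(average_ignoring_latest(window))               # uses the first four samples
print(average_ignoring_latest(window, drop_last=0))  # uses all five samples
```

The trade-off is that this cannot tell an incomplete sample from a queue that really is empty in the last minute, which is the concern raised later in the thread.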
I'm still looking into this but have to finish #444 first. In the meantime, I've checked and the Azure Monitor insights works but requires |
This relates to #1290 as well. I would stick with an aggregation of 5 minutes due to Azure Monitor, but I'll think about this issue, where you could configure that you want to ignore a time series (but when should we do that? What if it's really 0?) |
Are you still seeing this? |
Report
Using the latest Docker image, I'm scraping Service Bus metrics and counting messages in a Queue/Topic. The aggregation is not computed the way I expect.
Details:
I'm using metricDefaults.aggregation.interval of 00:05:00 and metricDefaults.scraping.schedule of 0 * * ? * *.
Expected Behavior
My queue has a constant size of 4 messages. I expected the aggregation to be:
Average: 4
Minimum: 4
Total: 20
Actual Behavior
My queue has a constant size of 4 messages. I see the following aggregation:
Average: 3.2
Min: 0
Total: 16
Even after running the scraper for 10 minutes, the results don't change. It's as if the aggregation were missing one value (e.g. [4-4-4-4-0] instead of [4-4-4-4-4]). When I'm using Maximum, the value is correct.
Steps to Reproduce the Problem
4 messages in an Azure Service Bus queue (without consuming them)
Component
Scraper
Version
2.1.1
Configuration
runtime.yaml
metrics-declaration.yaml
Logs
Unfortunately, despite enabling azureMonitor in my runtime configuration, I was not able to get the Azure Monitor API call logged (not sure why...).
For Average:
For Total:
Platform
Other
Contact Details
tarantino.dev@gmail.com