Null values for metrics if interval is set to 1 minute #1290
Comments
Unfortunately, Azure Monitor can take up to 5 minutes to surface metrics, which is what you are seeing here. We fully rely on the Azure API and, as you point out in the issue, we don't get the values yet, so there is nothing we can do about it afaik. What you can do, although you would have to verify the outcome, is set the scraping interval to 1 minute with an aggregation of 5 minutes.
@tomkerkhove OK, thanks for the feedback. Will try that out.
Sorry for the bad news :/
No worries. At least with the 5 min aggregation you get the values ;) Nevertheless, I have one point that could make sense in my opinion. As the 1 minute interval will more or less always lead to null values, which never add anything, why not walk back up the value ladder until Promitor finds the newest valid value? With that, the 1 minute interval would work and you would have a more accurate result. What do you think? Example:
In this case, Promitor would use 3.4 from timestamp 2020-09-22T12:37:00Z instead of null from 2020-09-22T12:39:00Z.
We could do that, but if you want 1 min metric updates, should we report the one from 3 min ago? This will lead to stale metric information, which is dangerous/confusing. What's your use-case? We could, and I'm not committing to it yet, give you a flag that says "give me the latest metric with a value", but that can be tricky as well: some metrics are null because nothing is reported, so it would go all the way back to the last measured metric value from last week and report that today. I don't think that's the intent here?
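To make the trade-off above concrete, here is a minimal sketch of the "newest value that is not null" idea with a staleness cap, so it can never walk back to last week's measurement. This is not Promitor code: `latest_value_within`, `parse_utc` and the `max_age` parameter are hypothetical names, and the null at 12:38 in the usage note is an assumption (the thread only states 3.4 at 12:37 and null at 12:39).

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Sequence, Tuple

# One data point: (timestamp, value), where value is None when Azure reported nothing.
Point = Tuple[datetime, Optional[float]]


def parse_utc(text: str) -> datetime:
    """Parse timestamps in the form used in this thread, e.g. 2020-09-22T12:37:00Z."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def latest_value_within(
    points: Sequence[Point], now: datetime, max_age: timedelta
) -> Optional[Point]:
    """Return the newest point that has a value and is not older than max_age.

    Returns None when every point inside the window is null, so the caller
    still reports "no value" instead of a stale measurement.
    """
    for timestamp, value in sorted(points, key=lambda p: p[0], reverse=True):
        if now - timestamp > max_age:
            break  # everything from here on is too old to report as current
        if value is not None:
            return timestamp, value
    return None
```

With points `12:37 -> 3.4`, `12:38 -> None`, `12:39 -> None` and `now` at 12:39, a `max_age` of 5 minutes yields the 12:37 point, while a `max_age` of 1 minute yields `None`. Returning the timestamp alongside the value lets a caller expose how old the reported value is.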
I think it's not really tied to a specific use case. It's more that you would like metrics from Azure that are as close to real time and as little aggregated as possible. But I fully see your concern about showing older metrics as "new". This would become a big issue if Azure did not provide metrics for more than, let's say, 4-5 minutes. Sure, you could also catch such cases, but I fully agree it's not a nice way to do it. What would be nice is to have this "issue" documented somehow: setting an interval of 1 minute will more or less always lead to null/NaN values. From my point of view we can close it for now. Maybe I'll come up with a better idea at some point ;)
There are still data gaps; for example, querying Azure Cosmos DB tends to be slow. (FYI @SudhakarNandigam-TomTom) I'm querying the Azure Monitor API at 3:13 PM with an aggregation of 5 min and find the following 2 time series:
Today, Promitor will report
Would this be something you would enable @bluepixbe @SudhakarNandigam-TomTom @adamconnelly @adam-resdiary ?
@tomkerkhove I've actually left ResDiary now and I'm not in a position to use this, but @ResDiaryLewis or @elliot-resdiary might be interested.
Sorry to hear and best of luck @adamconnelly!
@tomkerkhove thanks for asking!
This morning I was thinking about two things:
Hey Tom! I don't think that we'd enable this at ResDiary. We rarely see any gaps with our current configuration (1 minute scrape interval, 5 minute aggregation). So it's partly because we're not really affected by this issue and don't plan on changing our configuration anytime soon. However, I also think that this behaviour would be pretty confusing on a dashboard - surely seeing two occurrences of Anyway, if it was an opt-in then I can see no reason why not to implement it if you'd like to!
May I ask which Azure services you are scraping? From what I've seen, this depends heavily on the Azure service being used, as some do not provide consistent/fast metrics.
Sure, we're scraping:
We experience the same: metrics sometimes return null, resulting in gaps in our time series. Not sure if
Yes, but I have not had time yet, so I'm open to contributions!
If we set the interval to 1 minute, we often (>90% of the cases) see null metric values. I then started to play a bit with the Azure metrics API and I think I have found out why.
// 5 minutes interval
GET: https://management.azure.com/subscriptions/SID/resourceGroups/RG/providers/Microsoft.Compute/virtualMachines/VMname/providers/microsoft.insights/metrics?api-version=2018-01-01&interval=PT5M
-> is fine
// 1 minute interval
GET: https://management.azure.com/subscriptions/SID/resourceGroups/RG/providers/Microsoft.Compute/virtualMachines/VMname/providers/microsoft.insights/metrics?api-version=2018-01-01&interval=PT1M
Body:
As you can see, the most recent entries don't contain the average attribute. Sometimes the last 2 are missing, sometimes only the most recent one. It's pretty rare that all of them are there.
I'm not at all familiar with the Azure metrics API. Do you know if this is the usual behavior of Azure, which should be covered in Promitor, or is it more of an Azure issue?
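For anyone who wants to reproduce the check, here is a rough sketch that builds the same request URL and counts how many of the most recent data points lack the aggregation. The helper names `metrics_url` and `count_trailing_missing` are made up for illustration. The sketch assumes the response follows the usual Azure Monitor metrics shape (`value[].timeseries[].data[]` with `timeStamp` plus the aggregation name) and that data points are ordered oldest first. Authentication and the HTTP call itself are left out.

```python
from typing import Any, Dict, List
from urllib.parse import urlencode

BASE_URL = "https://management.azure.com"


def metrics_url(resource_id: str, interval: str, api_version: str = "2018-01-01") -> str:
    """Build the metrics URL for a resource, e.g. interval "PT1M" or "PT5M"."""
    query = urlencode({"api-version": api_version, "interval": interval})
    return f"{BASE_URL}{resource_id}/providers/microsoft.insights/metrics?{query}"


def first_series_data(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the data points of the first time series, or an empty list."""
    for metric in response.get("value", []):
        for series in metric.get("timeseries", []):
            return series.get("data", [])
    return []


def count_trailing_missing(response: Dict[str, Any], aggregation: str = "average") -> int:
    """Count the most recent data points that have no value for the aggregation."""
    missing = 0
    for point in reversed(first_series_data(response)):
        if point.get(aggregation) is not None:
            break  # first point with a value: everything older is not "trailing"
        missing += 1
    return missing
```

Called with the placeholder resource ID from the requests above and `"PT1M"`, `metrics_url` reproduces the 1 minute URL. A result of 1 or 2 from `count_trailing_missing` corresponds to the "only the most recent one" and "last 2 are missing" cases described here.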
Expected Behavior
Found value 3.4 for metric azure_virtual_machine_percentage_cpu with aggregation interval 00:01:00
Actual Behavior
Found value null for metric azure_virtual_machine_percentage_cpu with aggregation interval 00:01:00
Steps to Reproduce the Problem
Configuration
Provide insights in the configuration that you are using:
Used scraping configuration
Specifications