Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NaN metric is not reported #65

Closed
tomkerkhove opened this issue May 10, 2019 · 32 comments
Closed

NaN metric is not reported #65

tomkerkhove opened this issue May 10, 2019 · 32 comments
Assignees
Milestone

Comments

@tomkerkhove
Copy link

Not A Number (NaN) is a supported metric according to EXPOSITION FORMATS

value is a float represented as required by Go's ParseFloat() function. In addition to standard numerical values, Nan, +Inf, and -Inf are valid values representing not a number, positive infinity, and negative infinity, respectively.

However, this is not tracked when doing the following:

var nanGauge = Prometheus.Client.Metrics.CreateGauge("nan", "metric with nan value");
nanGauge.Set(double.NaN);

Expected Outcome

# HELP nan metric with nan value
# TYPE nan gauge
Nan

Actual Outcome

# HELP nan metric with nan value
# TYPE nan gauge

Repro

Pull https://github.com/tomkerkhove/prometheus-nan-repro and consume /repro endpoint.

@sanych-sun
Copy link
Member

Hi @tomkerkhove

Thank you for reporting the problem, but could you please provide some more details about you use-case? We suppress NaN value in Gauge for the reason: it's hard to define predictable behavior for the Gauge in such scenario. What if value was set to NaN and then try to Inc? Should we consider NaN as 0 and result to current value be 1? But NaN + 1 results to NaN in .net. Or should we work with NaN according to .net rules? Than gauge will "not accept" any changes unless user's code explicitly set it to some other value.

We should define behavior before make any changes to the library. Let's discuss your use-case and create some proposal.

Thanks

@tomkerkhove
Copy link
Author

If you are doing this deliberately I think an exception would'be been better rather than supressing as this is a bit strange.

My use case is simple - I have a metric and sometimes I need to report NaN 🤷‍♂ I only use Set, never use Inc/Dec.

@sanych-sun
Copy link
Member

sanych-sun commented May 10, 2019

Ok, so here is my suggestion:

  1. gauge should be changed to throw on NaN value to provide better experience
  2. untyped metric, which is suppress NaN for the moment should accept it

So you can use untyped instead of gauge for your needs. I can create PR tonight if the suggestion works for you.

@tomkerkhove
Copy link
Author

Well, in theory, is it still a gauge 🤔

Can we have 2 gauges? Or be able to define a baseline for NaN so you can inc/dec?
What are you doing today when somebody passes NaN and then calls Inc? You store the previous value then or?

@tomkerkhove
Copy link
Author

Where I'm going is that there should maybe be 2 gauges - One which only has set and supports NaN, and one which has Inc/Dec?

@tomkerkhove
Copy link
Author

What if the Gauge creation allows me to say that it's acceptable to pass NaN and it's up to me to only use Set?

Example:

var nanGauge = Prometheus.Client.Metrics.CreateGauge("nan", "metric with nan value", suppressNaN: false);
nanGauge.Set(double.NaN);

@sanych-sun
Copy link
Member

It's not a gauge, but untyped which is exported as untyped metric.

The main difference that there is Set only, no Inc or Dec methods.

@tomkerkhove
Copy link
Author

In our case, we need a gauge - https://prometheus.io/docs/concepts/metric_types/#gauge.

Based on this our scenario is still valid as far as I see:
We have a metric that can either go up or down. If we don't have a value, and 0 is not representable, then we should use NaN

https://prometheus.io/docs/practices/instrumentation/#avoid-missing-metrics

Gauges can be set, go up, and go down. They are useful for snapshots of state, such as in-progress requests, free/total memory, or temperature. You should never take a rate() of a gauge.

Gauges can be set, go up, and go down. They are useful for snapshots of state, such as in-progress requests, free/total memory, or temperature. You should never take a rate() of a gauge.

@tomkerkhove
Copy link
Author

Example:

var nanGauge = Prometheus.Client.Metrics.CreateGauge("nan", "metric with nan value", suppressNaN: false);
nanGauge.Set(double.NaN);

I think this is the best approach, personally. It tracks as a Gauge & allows me to use NaN when I don't have a decent metric.

@sanych-sun
Copy link
Member

The problem with NaN for gauge that if you set it's value to NaN, and then use Inc... the current value will be NaN... it might be unexpected and hard to predict behavior.

Why can't you use untyped instead? Currently Prometheus server ignores type information at all:

The Prometheus client libraries offer four core metric types. These are currently only differentiated in the client libraries (to enable APIs tailored to the usage of the specific types) and in the wire protocol. The Prometheus server does not yet make use of the type information and flattens all data into untyped time series.

Untyped metric has no such logical problem in work with NaN as Gauge has, because there are no methods except Set.

@sanych-sun
Copy link
Member

Or after all you can implement and register into the CollectorsRegistry you own metric, which will work as you want and reports itself as a Gauge (It's easy after the V3 release by using IMetricWriter). I can create an example for you if you prefer this way.

@tomkerkhove
Copy link
Author

The reason for a gauge is that human consumers know what to expect but I get your point, although I'm still not convinced (sorry 😅)

Feel free to post a sample, maybe that's a good alternative!

@sanych-sun
Copy link
Member

Ok, so do you need labels support for the metric or just a single unlabeled value?

@tomkerkhove
Copy link
Author

We are using labels indeed! But take your time, it's not urgent.

@sanych-sun sanych-sun self-assigned this May 13, 2019
@sanych-sun
Copy link
Member

sanych-sun commented May 14, 2019

Hi @tomkerkhove !

Here is example code for custom metric which uses stub as a datasource. All you are need - just register instance of RaceMetric class in your collectors registry (Metrics.DefaultCollectorRegistry if you are using the default one).

Custom collector should implement ICollector interface, where:

  • Configuration property, should have all configuration data needed to create instance of your collector, useful if you use ICollectorRegistry.GetOrAdd for registration
  • MetricNames property, should return all metric names produced by the collector
  • Collect method, which is called on each scrape to produce actual output via the IMetricsWriter
using System.Collections.Generic;
using System;

using Prometheus.Client;
using Prometheus.Client.Collectors;
using Prometheus.Client.Collectors.Abstractions;
using Prometheus.Client.MetricsWriter;
using Prometheus.Client.MetricsWriter.Abstractions;

namespace CustomMetricExample
{
    public class RaceMetric : ICollector
    {
        private readonly IEnumerator<(string Car, double Speed)[]> src = new RaceDataSource().Get();

        private const string MetricName = "race_car_speed";

        public ICollectorConfiguration Configuration { get; } = new CollectorConfiguration("my_custom_race_metric");

        public IReadOnlyList<string> MetricNames { get; } = new[] { MetricName };

        public void Collect(IMetricsWriter writer)
        {
            src.MoveNext();

            writer.WriteMetricHeader(MetricName, MetricType.Gauge);
            foreach (var item in src.Current)
            {
                writer.WriteSample(item.Speed, string.Empty, new[] { new KeyValuePair<string, string>("car", item.Car) });
            }
        }
    }

    public class RaceDataSource
    {
        public IEnumerator<(string Car, double Speed)[]> Get()
        {
            yield return new[] { ("red", 0d), ("green", 0d), ("blue", 0d) };

            yield return new[] { ("red", 10d), ("green", 12d), ("blue", 20d) };

            yield return new[] { ("red", 30d), ("green", 24d), ("blue", 50d) };

            yield return new[] { ("red", 50d), ("green", 45d), ("blue", 80d) };

            yield return new[] { ("red", 100d), ("green", 90d), ("blue", 120d) };

            yield return new[] { ("red", 50d), ("green", 75d), ("blue", double.NaN) };

            yield return new[] { ("red", 75d), ("green", double.NaN), ("blue", double.NaN) };

            while (true)
            {
                yield return new[] { ("red", double.NaN), ("green", double.NaN), ("blue", double.NaN) };
            }
        }
    }
}

Please let me know if this helps.

@sanych-sun
Copy link
Member

@tomkerkhove Could you please let us know if the suggestion above helps?

Thanks

@tomkerkhove
Copy link
Author

Thanks for the sample, I'll have a look at soon.

@sanych-sun
Copy link
Member

Hi @tomkerkhove

Is this still actual or we can close the issue?

Thanks

@tomkerkhove
Copy link
Author

I still need to give it a spin but will close for now. Thanks!

@tomkerkhove
Copy link
Author

tomkerkhove commented Aug 4, 2019

I've had a look at the sample and makes it a lot more complex for consumers, just to track a NaN on a Gauge. How would it work if I want to have the same experience as the following, but for a custom metric:

var gauge = Metrics.CreateGauge("gauge", "help text");
gauge.Set(5.3);

I still believe that, if it's opt-in, and I call Inc/Dec on a NaN it's up to me to fix it or just use Set.

I tried to extend the current gauge but the factory method makes it a bit more cumbersome.

However, if you really think it's best not to remove the check then that's ok for me. Will go with an untyped metric then, unfortunately.

@sanych-sun
Copy link
Member

I'll reopen the issue so we can discuss the changes.

Here is new suggestion:
We can allow NaN as a valid value for Gauge via method Set, but throw when user's code try to use Inc or Dec if current value is NaN. This approach allow us to support NaN and still have predictable behavior.

@tomkerkhove and @phnx47 does such suggestion works/makes sense?

Thanks

@sanych-sun sanych-sun reopened this Aug 7, 2019
@sanych-sun sanych-sun added this to the 3.1 milestone Aug 7, 2019
@tomkerkhove
Copy link
Author

That would be prefect for me! Thanks for considering!

@tomkerkhove
Copy link
Author

We are holding off https://github.com/tomkerkhove/promitor v1.0 until this is sorted out.

Feel free to let me know what you think of this and if you have the bandwidth. If not, I might be able to contribute this if you want.

@sanych-sun
Copy link
Member

Wow! Sounds like we have to change our plans a little to make this live ASAP. I think it will take day or two to get development build so you can check everything on your side. And then we can cut the production build.

I'll keep you informed on the progress.

@tomkerkhove
Copy link
Author

Thanks and don't rush it, if we have to wait then that's it - I don't expect you to drop everything and implement this :)

Here is new suggestion:
We can allow NaN as a valid value for Gauge via method Set, but throw when user's code try to use Inc or Dec if current value is NaN. This approach allow us to support NaN and still have predictable behavior.

In terms of your suggestion, don't feel obliged to roll this out for everybody perse, if you want to use a supressNaN which is true by default, then that's ok for me!

sanych-sun added a commit that referenced this issue Aug 14, 2019
phnx47 added a commit that referenced this issue Aug 14, 2019
@tomkerkhove
Copy link
Author

Pulled in the MyGet package and it works:

# HELP demo_servicebusqueue_queue_size Amount of active messages of the 'myqueue' queue (determined with ServiceBusQueue provider)
# TYPE demo_servicebusqueue_queue_size gauge
demo_servicebusqueue_queue_size{resource_group="promitor",subscription_id="0f9d7fea-99e8-4768-8672-06a28514f77e",resource_uri="subscriptions/0f9d7fea-99e8-4768-8672-06a28514f77e/resourceGroups/promitor/providers/Microsoft.ServiceBus/namespaces/promitor-messaging",instance_name="promitor-messaging",entity_name="orders"} NaN 1565779025042

Thank you!

@sanych-sun
Copy link
Member

@tomkerkhove, sounds good!

Do we need to create a prod package ASAP or you OK for now and it can wait week or two (for other issues in scope of 3.1 to be sorted)?

@tomkerkhove
Copy link
Author

I won't ship v1.0 on the preview package but it's ok to wait a week or two, no worries!

@sanych-sun
Copy link
Member

OK. I'll close the issue and let's plan it to be included into upcoming 3.1 release. If you need the functionality to be available earlier - just let us know and we will cut the build.

Thanks.

@tomkerkhove
Copy link
Author

Just out of curiosity - When are you planning to ship 3.1?

@sanych-sun
Copy link
Member

sanych-sun commented Aug 27, 2019

Hi @tomkerkhove !

I've just finished working on performance optimization. Now validating the changes and testing the library. I think we will be ready to release early next week or so.

Is it OK with your release or should we rush the release?

@tomkerkhove
Copy link
Author

Sure that's fine - Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

2 participants