Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: introduce metrics based on system diagnostics #84

Conversation

aygalinc
Copy link
Contributor

@aygalinc aygalinc commented Nov 3, 2024

Hello !
Here is the start of a PR for : #58
Here is the try to implement the metrics reporting scope on message publishing counter to see if it cope to the thing you have planned.

I have base the impl of system diagnostic metric reporter on the one available on Npgsql : https://github.com/npgsql/npgsql/blob/main/src/Npgsql/MetricsReporter.cs

I m not sure on why you need the NoOpImplem and where the reporter should be instanciated.

@aygalinc aygalinc changed the title Feat/introduce metrics based on system diagnostics feat: introduce metrics based on system diagnostics Nov 3, 2024
Signed-off-by: Gabriele Santomaggio <G.santomaggio@gmail.com>
@Gsantomaggio
Copy link
Member

Thank you for the PR. Please format the code based on the project's rule

Copy link
Member

@Gsantomaggio Gsantomaggio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR. There are some changes to make.
May I ask to add:

  • some test
  • an example inside the directory examples with a project ?

RabbitMQ.AMQP.Client/Impl/IMetricsReporter.cs Outdated Show resolved Hide resolved
RabbitMQ.AMQP.Client/Impl/MetricsReporter.cs Outdated Show resolved Hide resolved
RabbitMQ.AMQP.Client/Impl/AmqpConnection.cs Outdated Show resolved Hide resolved
RabbitMQ.AMQP.Client/Impl/AmqpPublisherBuilder.cs Outdated Show resolved Hide resolved
@aygalinc
Copy link
Contributor Author

aygalinc commented Nov 4, 2024

Yup I will add tests when I get some bandwidth

@aygalinc
Copy link
Contributor Author

aygalinc commented Nov 4, 2024

@Gsantomaggio : About the test stack you use, is there any entry point that i can read to make it work loccaly ? Or there is just the code ?

Do you have consider to use things like TestContainer that reduce the burden to have a local manual setup and keep the thing under test fixture control ?

@Gsantomaggio
Copy link
Member

@aygalinc, to run the tests, please refer to the following:
https://github.com/rabbitmq/rabbitmq-amqp-dotnet-client?tab=readme-ov-file#how-to-run
You need a RabbitMQ running.

Do you have consider to use things like TestContainer

We use the same way to test all the clients. We have yet to consider TestContainer.

Thank you

@aygalinc
Copy link
Contributor Author

aygalinc commented Nov 4, 2024

@Gsantomaggio : i have had one test to check the test stack you use. Testing consumer is pretty cumbersome because metrics records happened after consumer is invoked and their is not mean to directly wait that the consumer register metrics before going to the assert part.

I need to check the doc to be able to run this test in parallel, I think i cannot use the static factory pattern of the metrics because it impacts all the test suite so result will not be very consistent.

=> I work on macos and the sh script fails (but i make it work by commenting some part).

@Gsantomaggio
Copy link
Member

=> I work on macos and the sh script fails (but i make it work by commenting some part).

There is some problem with the TLS test on mac. So you can skip them. Don't worry.

Testing consumer is pretty cumbersome

Feel free to add some dependency on the Test project like Prometheus or everything you need to read the metrics externally.

thank you

@aygalinc aygalinc force-pushed the feat/introduce_metrics_based_on_system_diagnostics branch from 27244f7 to 1ca625e Compare November 5, 2024 21:08
@aygalinc aygalinc force-pushed the feat/introduce_metrics_based_on_system_diagnostics branch from 6f82dc7 to bb4e334 Compare November 5, 2024 21:54
@aygalinc
Copy link
Contributor Author

aygalinc commented Nov 5, 2024

@Gsantomaggio : i have add some tests for publisher and consumer metrics.
feel free to check if its good for you.

Tests/Tests.csproj Outdated Show resolved Hide resolved
aygalinc and others added 3 commits November 7, 2024 09:15
Signed-off-by: Gabriele Santomaggio <G.santomaggio@gmail.com>
Signed-off-by: Gabriele Santomaggio <G.santomaggio@gmail.com>
@Gsantomaggio
Copy link
Member

Gsantomaggio commented Nov 7, 2024

@aygalinc CI is finally green. :) Thank you! I will do some tests.
Before merging this PR, we (probably) need to merge this #81.
cc @lukebakken

@aygalinc
Copy link
Contributor Author

aygalinc commented Nov 7, 2024

@Gsantomaggio : I want to add the MetricsReporter available in the public API for dotnet 8 in order to have one impleme that impelment OTEL guidelines and so a MetricReporter that throw not implemented exception for lower version or that extend the NoOp implem.
Does this make sens to you ?

@Gsantomaggio
Copy link
Member

I want to add the MetricsReporter available in the public API for dotnet 8

It is ok with me.

@Gsantomaggio
Copy link
Member

@aygalinc I am working to resolve the conflicts.

Signed-off-by: Gabriele Santomaggio <G.santomaggio@gmail.com>
@Gsantomaggio
Copy link
Member

Gsantomaggio commented Nov 8, 2024

@aygalinc Ok, done. The branch is merged with main

Notes:

  • With this merge, you can run all the tests locally. We fixed the TLS problem.
  • The docker image used is now: pivotalrabbitmq/rabbitmq.
  • If you have an M* series, run ./.ci/ubuntu/one-node/gha-setup.sh start pull arm to run the correct docker image.
  • If you have an Intel series, run ./.ci/ubuntu/one-node/gha-setup.sh start pull to run the correct docker image.

Request:

  • May I ask you to add an example (doc/examples/) of how we can integrate the metrics with OpenTelemetry and/or Prometheus?

Thank you for your effort on this PR.

@aygalinc
Copy link
Contributor Author

@lukebakken thx.
The semantic convention specify the label : db.client.connection.pool.name for db but it s really that they do not the same with messaging

I have add an issue yesterday in sem conv repo to have an explanation on this : open-telemetry/semantic-conventions#1575

@lukebakken
Copy link
Contributor

Yep, I plan on using the same metrics and naming as the Java client (https://github.com/rabbitmq/rabbitmq-amqp-java-client/blob/main/src/main/java/com/rabbitmq/client/amqp/metrics/MicrometerMetricsCollector.java)

@aygalinc
Copy link
Contributor Author

The only issue with the design in npgsql is the fact that all is static in the MetricReporter so poorly testable. I think this is due to the fact that they release it in .net version 7 where IMeterFactory was not available.

@lukebakken
Copy link
Contributor

OK I'll take that into consideration. I don't expect users of this library to have to configure an IMeterFactory implementation, however.

@lukebakken
Copy link
Contributor

lukebakken commented Nov 14, 2024

@aygalinc thank you for the sample application - it makes configuring all of this much more obvious.

Would it be possible to add a metrics summary output? If you'd like to point me in the right direction, that would be great.

UPDATE - I will try this tomorrow: https://learn.microsoft.com/en-us/dotnet/core/diagnostics/metrics-collection#view-metrics-with-dotnet-counters

@aygalinc
Copy link
Contributor Author

@lukebakken Where do you want have the summary ?
image

@lukebakken
Copy link
Contributor

Interesting, how did you get that summary? I was thinking of having that information (or similar) printed out at the end of the demo program's run. I haven't had time yet to try out the dotnet-counters tool (I'm busy working on customer support today).

@aygalinc
Copy link
Contributor Author

If you use Rider you can use the monitorng view, add the desidered MeterName t otrack in the advance settings (link the one we indicate in the otel example) and see the counter live. (Note that they do not print label) :
image

@aygalinc
Copy link
Contributor Author

And on the example i set up the console exporter so periodically it print on stdout some info on measurement (like this one where you can see actual label which help to contextualize the metrics in observability framework):
image

@lukebakken
Copy link
Contributor

Ah! I just didn't let it run long enough 🤦

@lukebakken
Copy link
Contributor

lukebakken commented Nov 18, 2024

For reference: https://opentelemetry.io/docs/specs/semconv/messaging/messaging-spans/

@aygalinc I'm not exactly sure if OTel attributes should be added in this PR, or in a similar manner to what the AMQP 0.9.1 .NET client does (https://github.com/rabbitmq/rabbitmq-dotnet-client/pull/1717/files, https://github.com/rabbitmq/rabbitmq-dotnet-client/blob/main/projects/RabbitMQ.Client/Impl/RabbitMQActivitySource.cs).

I think I'll wrap up this PR without any tags or OTel attributes, and we can add those in another PR. How about that?

@aygalinc
Copy link
Contributor Author

Seems more than reasonable

@aygalinc
Copy link
Contributor Author

@lukebakken I think it is the usage in your histogram to use seconds instead of ms.
There has been lot of discussion like here on this topic : open-telemetry/opentelemetry-specification#2977 (comment) and its recommanded in the doc : https://opentelemetry.io/docs/specs/semconv/general/metrics/#instrument-units .

Do you want to stay on ms for histogram ?
I think the main driver of this choice it s the power of sameness, if everyone implement duration unit as second you never think of the unit of the instrument.

@lukebakken
Copy link
Contributor

Thanks for pointing that out. Seems like you'd want to use ms, but I'll change the PR to follow the guidelines.

Copy link
Contributor

@lukebakken lukebakken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aygalinc thank you very much!

Copy link
Member

@Gsantomaggio Gsantomaggio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job @aygalinc @lukebakken!

Thanks a lot for your effort @aygalinc in this PR

@Gsantomaggio Gsantomaggio merged commit f47744a into rabbitmq:main Nov 19, 2024
2 checks passed
@aygalinc aygalinc deleted the feat/introduce_metrics_based_on_system_diagnostics branch November 25, 2024 14:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants