Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add some more metrics #620

Merged
merged 37 commits into from
Sep 23, 2022
Merged

Add some more metrics #620

merged 37 commits into from
Sep 23, 2022

Conversation

CHr15F0x
Copy link
Member

@CHr15F0x CHr15F0x commented Sep 20, 2022

This PR adds more counters. The vast majority of the PR content is tests for those, I tried to keep the production code delta as small as possible. Any ideas for more metrics will be put into a separate PR.

The number of commits is pretty big but I recommend going through the code on a per-file basis and skipping the tests first.

RPC counters

rpc_method_calls_total, uses a method label to distinguish between RPC methods

Sequencer counters

sequencer_requests_total
sequencer_requests_failed_total

EDIT: failure reasons are labels now
sequencer_requests_failed_starknet_total - requests failed due to StarkNet specific errors
sequencer_requests_failed_decode_total - requests failed due to deserialization errors
sequencer_requests_failed_rate_limited_total - requests failed due to rate limiting

  1. All of the above use a method label to distinguish between request types (get_block, get_state_updates, etc.).
  2. Additionally get_block, get_state_updates can use a tag label which is either latest or pending.
  3. Failure counters can also be filtered using the reason label, with the following values:
    a. decode - requests failed due to deserialization errors
    b. starknet - requests failed due to StarkNet specific errors
    c. rate_limiting - requests failed due to rate limiting
  4. The rationale for sequencer_requests_failed_decode_total decode is in the readme too, but long story short - once in a while these happen in bursts because of a cairo lang update and I imagine users would like to filter those out to avoid obfuscating the failures that happen on a daily basis.

Middleware

I think that wrapping reqwest into any existing or custom made middleware crate would be an overkill so I went with a simple macro and some simple wrappers. The macro frees developers from having to remember which and if all counters were registered. Additionally it produces a list of all the methods as & 'static str which increases the performance a bit, if we for example compare it to the rpc middleware passing non-static &str.

Testing

metrics uses a singleton Recorder, which is a bit problematic when trying to test metrics

TL;DR:

  • if a test asserts the value of counter x, use RecorderGuard::lock(MyLocalMockRecorderInstance)
  • other tests that touch counter x should use RecorderGuard::lock_as_noop()
  • other tests that don't touch counter x don't care about RecorderGuard

I updated the RecorderGuard to decrease contention between tests that do not assert any counters. Unfortunately if one wishes to have repetitive results in a test which does assert some counter values, all other tests that could be triggering those counters concurrently have to use RecorderGuard::lock_as_noop().

@CHr15F0x CHr15F0x requested a review from a team as a code owner September 20, 2022 09:48
@CHr15F0x CHr15F0x force-pushed the CHr15F0x/more_metrics branch from 9a4f307 to e6e6681 Compare September 21, 2022 11:10
@Mirko-von-Leipzig
Copy link
Contributor

Meta: should we be testing these metrics? It adds a lot of extra code; is it worth it?

@CHr15F0x
Copy link
Member Author

Meta:

  1. My first thought was: nope, it's just counters.
  2. My second thought was: implicitly or explicitly metrics are a public API so it's better to have some tests in case we refactor around the sequencer client, especially if a lot of labels come into play. It's fairly easy to break something there.

Copy link
Contributor

@kkovaacs kkovaacs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM other than that small naming nitpick.

@CHr15F0x CHr15F0x force-pushed the CHr15F0x/more_metrics branch from 368216a to 6eda060 Compare September 23, 2022 09:25
@CHr15F0x CHr15F0x merged commit ada7544 into main Sep 23, 2022
@CHr15F0x CHr15F0x deleted the CHr15F0x/more_metrics branch September 23, 2022 09:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants