Add some more metrics #620

CHr15F0x · 2022-09-20T09:48:35Z

This PR adds more counters. Most of the PR content is tests for them; I tried to keep the production code delta as small as possible. Any ideas for more metrics will go into a separate PR.

The number of commits is pretty big, so I recommend going through the code on a per-file basis and skipping the tests at first.

RPC counters

rpc_method_calls_total uses a method label to distinguish between RPC methods.
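
For illustration, the effect of the method label can be modeled with a toy, std-only counter. This is a sketch and not the PR's code: MethodCounter and render are hypothetical names, the method names passed in are only examples, and the real counters go through the metrics crate rather than a hand-rolled map.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a metrics recorder (hypothetical): one counter name,
/// one series per value of the `method` label.
#[derive(Default)]
struct MethodCounter {
    series: BTreeMap<&'static str, u64>,
}

impl MethodCounter {
    /// Counts one call to `method`.
    fn increment(&mut self, method: &'static str) {
        *self.series.entry(method).or_insert(0) += 1;
    }

    /// Renders every series in a Prometheus-like text form, sorted by label value.
    fn render(&self, name: &str) -> Vec<String> {
        self.series
            .iter()
            .map(|(method, n)| format!("{name}{{method=\"{method}\"}} {n}"))
            .collect()
    }
}
```

Each distinct label value becomes its own series under the same counter name, which is what lets a dashboard query filter or group by RPC method.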

Sequencer counters

sequencer_requests_total
sequencer_requests_failed_total

EDIT: failure reasons are labels now
~~sequencer_requests_failed_starknet_total - requests failed due to StarkNet specific errors~~
~~sequencer_requests_failed_decode_total - requests failed due to deserialization errors~~
~~sequencer_requests_failed_rate_limited_total - requests failed due to rate limiting~~

All of the above use a method label to distinguish between request types (get_block, get_state_updates, etc.).
Additionally, get_block and get_state_updates can use a tag label, which is either latest or pending.
Failure counters can also be filtered using the reason label, which takes the following values:
a. decode - requests failed due to deserialization errors
b. starknet - requests failed due to StarkNet specific errors
c. rate_limiting - requests failed due to rate limiting
The rationale for ~~sequencer_requests_failed_decode_total~~ decode is in the readme too. Long story short: once in a while these failures happen in bursts because of a cairo lang update, and I imagine users would like to filter them out so they don't obscure the failures that happen on a daily basis.
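
The label scheme above can be sketched as a small mapping from a failure category to the label set attached to sequencer_requests_failed_total. The Failure enum, the Other variant, and both function names are assumptions made for this example; only the label keys and values come from the description.

```rust
/// Hypothetical failure categories for a sequencer request.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Failure {
    Decode,
    StarkNet,
    RateLimited,
    /// Assumption: any other failure is counted without a `reason` label.
    Other,
}

/// Maps a failure to the value of the `reason` label, if it has one.
fn reason_label(failure: Failure) -> Option<&'static str> {
    match failure {
        Failure::Decode => Some("decode"),
        Failure::StarkNet => Some("starknet"),
        Failure::RateLimited => Some("rate_limiting"),
        Failure::Other => None,
    }
}

/// Builds the label set for one failed request: `method` is always present,
/// `tag` and `reason` only when they apply.
fn failure_labels(
    method: &'static str,
    tag: Option<&'static str>,
    failure: Failure,
) -> Vec<(&'static str, &'static str)> {
    let mut labels = vec![("method", method)];
    if let Some(tag) = tag {
        labels.push(("tag", tag));
    }
    if let Some(reason) = reason_label(failure) {
        labels.push(("reason", reason));
    }
    labels
}
```

With the reason carried as a label instead of being baked into the counter name, bursts of decode failures can be excluded with a label filter while the total stays a single counter.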

Middleware

I think that wrapping reqwest in any existing or custom-made middleware crate would be overkill, so I went with a simple macro and some simple wrappers. The macro frees developers from having to remember which counters to register and whether all of them were registered. Additionally, it produces a list of all the methods as &'static str, which improves performance a bit compared to, for example, the rpc middleware passing a non-static &str.
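
A rough sketch of the macro idea, under the assumption that the methods are declared once and everything else is derived from that list. declare_methods, METHODS, and register_all are hypothetical names and do not reflect the macro in the PR; registration is modeled as returning pairs instead of calling into the metrics crate.

```rust
/// Declares the request methods once and derives the rest from that list.
macro_rules! declare_methods {
    ($($method:ident),+ $(,)?) => {
        /// Every method name as a `&'static str`, usable as a label value
        /// without allocating.
        const METHODS: &[&'static str] = &[$(stringify!($method)),+];

        $(
            /// Returns the label value for this method; a misspelled method
            /// at a call site fails to compile.
            #[allow(dead_code)]
            fn $method() -> &'static str {
                stringify!($method)
            }
        )+
    };
}

declare_methods!(get_block, get_state_updates);

/// Stand-in for counter registration: returns every (counter, method) pair
/// that would be registered, so no combination can be forgotten.
fn register_all(counters: &[&'static str]) -> Vec<(&'static str, &'static str)> {
    counters
        .iter()
        .flat_map(|counter| METHODS.iter().map(move |method| (*counter, *method)))
        .collect()
}
```

Because the method list lives in one place, adding a method to the macro invocation is enough to have it covered by every counter.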

Testing

metrics uses a singleton Recorder, which is a bit problematic when trying to test metrics.

TL;DR:

if a test asserts the value of counter x, use RecorderGuard::lock(MyLocalMockRecorderInstance)
other tests that touch counter x should use RecorderGuard::lock_as_noop()
other tests that don't touch counter x don't care about RecorderGuard

I updated the RecorderGuard to decrease contention between tests that do not assert any counters. Unfortunately, if one wishes to have repeatable results in a test that does assert some counter values, all other tests that could be triggering those counters concurrently have to use RecorderGuard::lock_as_noop().
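
One way such a guard could work is sketched below. This is a guess at the general shape and not the implementation in this PR: it assumes asserting tests take an exclusive lock while no-op tests share a read lock, and MockRecorder, increment_counter, TEST_ORDER, and SLOT are all hypothetical stand-ins for the metrics crate's global recorder.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex, RwLock, RwLockReadGuard, RwLockWriteGuard};

/// Hypothetical mock recorder that only counts increments.
#[derive(Default)]
struct MockRecorder {
    hits: AtomicU64,
}

impl MockRecorder {
    fn hits(&self) -> u64 {
        self.hits.load(Ordering::Relaxed)
    }
}

/// Orders tests: asserting tests write-lock, no-op tests read-lock.
static TEST_ORDER: RwLock<()> = RwLock::new(());
/// Stand-in for the process-wide singleton recorder.
static SLOT: Mutex<Option<Arc<MockRecorder>>> = Mutex::new(None);

/// What production code would call; a no-op when no recorder is installed.
fn increment_counter() {
    if let Some(recorder) = SLOT.lock().unwrap().as_ref() {
        recorder.hits.fetch_add(1, Ordering::Relaxed);
    }
}

#[allow(dead_code)]
enum RecorderGuard {
    Exclusive(RwLockWriteGuard<'static, ()>),
    Noop(RwLockReadGuard<'static, ()>),
}

impl RecorderGuard {
    /// For a test that asserts counter values: runs alone with its own recorder.
    fn lock(recorder: Arc<MockRecorder>) -> Self {
        let guard = TEST_ORDER.write().unwrap();
        *SLOT.lock().unwrap() = Some(recorder);
        RecorderGuard::Exclusive(guard)
    }

    /// For a test that only touches the counters: may overlap with other
    /// no-op tests, never with an asserting test.
    fn lock_as_noop() -> Self {
        RecorderGuard::Noop(TEST_ORDER.read().unwrap())
    }
}

impl Drop for RecorderGuard {
    fn drop(&mut self) {
        // Uninstall the recorder while the exclusive lock is still held.
        if matches!(self, RecorderGuard::Exclusive(_)) {
            *SLOT.lock().unwrap() = None;
        }
    }
}
```

In this sketch a panicking test poisons the locks, so a real helper would likely recover from poisoning instead of calling unwrap.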

Mirko-von-Leipzig · 2022-09-21T15:17:25Z

Meta: should we be testing these metrics? It adds a lot of extra code; is it worth it?

CHr15F0x · 2022-09-21T15:33:03Z

Meta:

My first thought was: nope, it's just counters.
My second thought was: implicitly or explicitly, metrics are a public API, so it's better to have some tests in case we refactor around the sequencer client, especially if a lot of labels come into play. It's fairly easy to break something there.

kkovaacs

LGTM other than that small naming nitpick.

CHr15F0x requested a review from a team as a code owner September 20, 2022 09:48

CHr15F0x force-pushed the CHr15F0x/more_metrics branch from 9a4f307 to e6e6681 Compare September 21, 2022 11:10

Mirko-von-Leipzig approved these changes Sep 21, 2022

kkovaacs approved these changes Sep 22, 2022

CHr15F0x added 21 commits September 23, 2022 11:24

feat: add rpc method failure counter

0248722

test: add a specific test for per method counters

3f7328d

refactor: deduplicate integration client creation

0c07258

feat: sequencer request call counter

0360600

feat: sequencer requests failed counter

c5ad64b

feat: add counter for rate limited requests

e663e28

feat: add more counters, count latest and pending

1d2b97f

refactor: move fake recorder to metrics/test

bbd5524

feat: register counters for last and pending

752f7f5

refactor: move all sequencer metrics stuff to one module

2e55568

test: add alternative setup to serve metrics tests

8751238

test: add a basic test cas

b222cc1

test: rename utility fn

191cd74

test: add counter getter with arbitrary label support

403c8d3

test: add cases for latest and pending tags

6faeabb

refactor: move tests to a separate module

4ef083b

test: reduce contention on recorder guard

2089bb0

And protect from test interference.

doc: add missing comments

cde5085

test: fix chain_id test after rebase

67599e7

test: add a missing guard

9846d89

Because this test calls `sequencer::Client::block`.

doc: polish docs of RecorderGuard

dd97b20

CHr15F0x added 16 commits September 23, 2022 11:24

test: add response_owned

41f6b83

test: revert response to &'static str

9e1920b

test: remove generics from setup_with_varied_responses

ac58b07

doc: add missing docs for test helpers

fbbd0a1

chore: clippy

5f21438

doc: update readme

ce1ff13

refactor: rename wrap_with_metrics

ffbc7fa

refactor: failure reason into a label

71a0eb3

test: fix the test to follow new labels

5f76d3e

refactor: introduce some mod-local consts

a9bc436

doc: update readme

bf8e4aa

refactor: rename local var

49f0f2c

refactor: rename sequencer_* metrics into gateway_*

adc7afe

refactor: remove underscore from used arg

4c24fb2

chore: clippy

da0cd75

chore: fmt

6eda060

CHr15F0x force-pushed the CHr15F0x/more_metrics branch from 368216a to 6eda060 Compare September 23, 2022 09:25

CHr15F0x merged commit ada7544 into main Sep 23, 2022

CHr15F0x deleted the CHr15F0x/more_metrics branch September 23, 2022 09:38

CHr15F0x mentioned this pull request Oct 25, 2022

Add node metrics with exporter #258

Closed
