Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: track persistence metrics #10250

Merged
merged 11 commits into from
Aug 20, 2024
Merged

feat: track persistence metrics #10250

merged 11 commits into from
Aug 20, 2024

Conversation

malik672
Copy link
Contributor

@malik672 malik672 commented Aug 10, 2024

closes #10220
closes #10221

Copy link
Collaborator

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some nits,

fn on_remove_blocks_above(
&self,
new_tip_num: u64,
sender: oneshot::Sender<()>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesn't need the sender as argument

fn on_save_blocks(
&self,
blocks: Vec<ExecutedBlock>,
sender: oneshot::Sender<Option<B256>>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, the function should return the Option and then caller should send it

@mattsse
Copy link
Collaborator

mattsse commented Aug 10, 2024

@mvares this also closes #10221 as a drive by

@mattsse mattsse added the A-observability Related to tracing, metrics, logs and other observability tools label Aug 10, 2024
@mvares
Copy link
Contributor

mvares commented Aug 10, 2024

@mvares this also closes #10221 as a drive by

All fine! Yesterday I had some problems in my computer, but I already fixed it. Thanks!

@malik672 malik672 requested a review from mattsse August 10, 2024 13:02
Copy link
Collaborator

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

last nits

crates/engine/tree/src/persistence.rs Outdated Show resolved Hide resolved
}
PersistenceAction::PruneBefore(block_num, sender) => {
let start_time = Instant::now();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's also move this into prune_before, so that we're consistent with where we track metrics

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mattsse done

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mattsse what's latency metrics for?

@malik672 malik672 requested a review from mattsse August 10, 2024 14:03
Copy link
Collaborator

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, pending @Rjected @fgimenez

crates/engine/tree/src/persistence.rs Outdated Show resolved Hide resolved
crates/engine/tree/src/tree/mod.rs Outdated Show resolved Hide resolved
Copy link
Member

@Rjected Rjected left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, just one comment on removing self.rx = None after take

crates/engine/tree/src/tree/mod.rs Outdated Show resolved Hide resolved
remove unnecessary code
Comment on lines 18 to 23
/// How long it took for blocks to be removed
pub(crate) remove_blocks_above_duration: Histogram,
/// How long it took for blocks to be saved
pub(crate) save_blocks_duration: Histogram,
/// How long it took for blocks to be pruned
pub(crate) prune_before_duration: Histogram,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// How long it took for blocks to be removed
pub(crate) remove_blocks_above_duration: Histogram,
/// How long it took for blocks to be saved
pub(crate) save_blocks_duration: Histogram,
/// How long it took for blocks to be pruned
pub(crate) prune_before_duration: Histogram,
/// How long it took for blocks to be removed
pub(crate) remove_blocks_above_duration_seconds: Histogram,
/// How long it took for blocks to be saved
pub(crate) save_blocks_duration_seconds: Histogram,
/// How long it took for blocks to be pruned
pub(crate) prune_before_duration_seconds: Histogram,

crates/engine/tree/src/metrics.rs Outdated Show resolved Hide resolved
@@ -15,5 +15,7 @@ pub(crate) struct EngineApiMetrics {
pub(crate) forkchoice_updated_messages: Counter,
/// The total count of new payload messages received.
pub(crate) new_payload_messages: Counter,
/// Histogram of persistence operation durations (in seconds)
pub(crate) persistence_duration: Histogram,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub(crate) persistence_duration: Histogram,
pub(crate) persistence_duration_seconds: Histogram,

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

measuring this in seconds is not useful

self.on_new_persisted_block();
} else {
error!("could not find persisted block with hash {last_persisted_block_hash} in memory");
if let Some((mut rx, start_time)) = self.persistence_state.rx.take() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this changes the behavior and we won't panic anymore, do we want this? @mattsse

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, added the expect back

self.last_persisted_block_number = last_persisted_block_number;
self.last_persisted_block_hash = last_persisted_block_hash;
duration
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need to return the duration here if we not used it for metrics?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're correct we've already used this

malik672 and others added 3 commits August 15, 2024 22:35
Co-authored-by: Alexey Shekhirin <a.shekhirin@gmail.com>
Copy link
Collaborator

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pending @shekhirin

self.last_persisted_block_number = last_persisted_block_number;
self.last_persisted_block_hash = last_persisted_block_hash;
duration
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're correct we've already used this

self.on_new_persisted_block();
} else {
error!("could not find persisted block with hash {last_persisted_block_hash} in memory");
if let Some((mut rx, start_time)) = self.persistence_state.rx.take() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, added the expect back

@mattsse mattsse added this pull request to the merge queue Aug 20, 2024
Merged via the queue into paradigmxyz:main with commit ec46fcc Aug 20, 2024
34 checks passed
fgimenez pushed a commit that referenced this pull request Aug 20, 2024
Co-authored-by: Matthias Seitz <matthias.seitz@outlook.de>
Co-authored-by: Alexey Shekhirin <a.shekhirin@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-observability Related to tracing, metrics, logs and other observability tools
Projects
None yet
Development

Successfully merging this pull request may close these issues.

extract persistence event handling into standalone functions track persistence duration
5 participants