feat(subscriber): resource instrumentation #77
Conversation
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
console-subscriber/src/aggregator.rs
Outdated
if let Some(mut task_stats) = self.task_stats.update(&id) {
    task_stats.poll_stats.current_polls -= 1;
    if task_stats.poll_stats.current_polls == 0 {
        if let Some(last_poll_started) = task_stats.poll_stats.last_poll_started {
            let elapsed = at.duration_since(last_poll_started).unwrap();
            task_stats.poll_stats.last_poll_ended = Some(at);
            task_stats.poll_stats.busy_time += elapsed;
            task_stats
                .poll_times_histogram
                .record(elapsed.as_nanos().try_into().unwrap_or(u64::MAX))
                .unwrap();
        }
    }
}

if let Some(mut async_op_stats) = self.async_op_stats.update(&id) {
    async_op_stats.poll_stats.current_polls -= 1;
    if async_op_stats.poll_stats.current_polls == 0 {
        if let Some(last_poll_started) = async_op_stats.poll_stats.last_poll_started {
            let elapsed = at.duration_since(last_poll_started).unwrap();
            async_op_stats.poll_stats.last_poll_ended = Some(at);
            async_op_stats.poll_stats.busy_time += elapsed;
        }
    }
}
I will make this a bit more DRY
console-subscriber/src/aggregator.rs
Outdated
if let Some(mut task_stats) = self.task_stats.update(&id) {
    if task_stats.poll_stats.current_polls == 0 {
        task_stats.poll_stats.last_poll_started = Some(at);
        if task_stats.poll_stats.first_poll == None {
            task_stats.poll_stats.first_poll = Some(at);
        }
        task_stats.poll_stats.polls += 1;
    }
    task_stats.poll_stats.current_polls += 1;
}

if let Some(mut async_op_stats) = self.async_op_stats.update(&id) {
    if async_op_stats.poll_stats.current_polls == 0 {
        async_op_stats.poll_stats.last_poll_started = Some(at);
        if async_op_stats.poll_stats.first_poll == None {
            async_op_stats.poll_stats.first_poll = Some(at);
        }
        async_op_stats.poll_stats.polls += 1;
and this as well...
console-subscriber/src/lib.rs
Outdated
} else if meta.name() == "runtime.resource" {
    self.resource_callsites.insert(meta);
} else if meta.name() == "runtime.async_op" {
    self.async_op_callsites.insert(meta);
} else if meta.target() == "tokio::resource::op" {
    self.resource_op_callsites.insert(meta);
}
If we want to do that, we probably need to change the way `Callsites` work and drop the 32 bound.
Are there likely more than 32 ops in Tokio? That doesn't sound crazy... Perhaps `Callsites` could get a const generic for the size of the array?
#87 makes `Callsites` have a const generic parameter for length, so we can make bigger callsite arrays for resource ops.

if there are a lot of resource ops, though, we may want to use a hashmap or something instead; there's probably a point at which linear search is no longer efficient...
I've so far reviewed all the files except for `aggregator.rs` and `lib.rs`. Those big 'uns will be tomorrow. But so far, the way you've separated things in the proto files makes sense to me! Just a couple questions inline.
console-subscriber/src/visitors.rs
Outdated
fn record_str(&mut self, field: &tracing_core::Field, value: &str) {
    let extract_partial_attr = || -> Option<MatchPart> {
        let parts: Vec<_> = field.name().split('_').collect();
This temporary vec is an allocation we can skip, just calling `next()` 3 times.
console-subscriber/src/visitors.rs
Outdated
}

fn record_u64(&mut self, field: &tracing_core::Field, value: u64) {
    let parts: Vec<_> = field.name().split('_').collect();
Same here.
proto/resources.proto
Outdated
// The resource's concrete rust type.
string concrete_type = 3;
// The kind of resource (e.g. timer, mutex)
string kind = 4;
We could save some allocations and wire size if we made `kind` a `oneof` with a bunch of "common" kinds, and another kind that is a string. Kinda like linkerd2's `HttpMethod`.

I don't mean this as a hard requirement now, but could be a good optimization maybe.
yes, that makes sense indeed
proto/resource_ops.proto
Outdated
//
// This uniquely identifies this op across all *currently live*
// ones.
common.MetaId id = 1;
I don't think I fully understand the difference of these 3 IDs.
I should have clarified this more.

`common.MetaId metadata` is the id of the metadata for the tracing event that describes this op.

`common.MetaId id` should really be the unique identifier for this resource op. Since this resource op is represented by an event in tracing, not a span, it does not have a span id. This is why I have used the metadata id for this purpose as well. Not sure whether this is correct, as I am not sure this will actually be unique: there could be two invocations of a resource op that have the exact same metadata and will produce the very same metadata id. Need to revisit that. Also, I am not too sure we actually need any unique identifier for these. Suggestions welcome.

`common.SpanId resource_id` is the id of the resource. Any resource operation should be attached to a resource. For example, if you call `reset` on a timer, this will produce a ResourceOp whose resource id will be the id of the main resource span.
> `common.MetaId id` should really be the unique identifier for this resource op. Since this resource op is represented by an event in tracing, not a span, it does not have a span id. This is why I have used the metadata id for this purpose as well. Not sure whether this is correct as I am not sure this will actually be unique. Namely, there could be two invocations of a resource op that have the exact metadata and will produce the very same metadata id. Need to revisit that. Also I am not too sure we actually need any unique identifier for these. Suggestions welcome.
This will not be unique; it will be shared across all `ResourceOp` events originating from the same `tracing::event!` macro. In fact, `ResourceOp.id` will always be equal to `ResourceOp.metadata`.

I think this field should probably just be removed entirely. Since these resource ops correspond to single events, and we won't emit more data about a specific resource op more than once, I don't think they need a unique identifier. If we do want them to have unique identifiers, we should probably just generate them with a global counter in the subscriber.
Got through aggregator and lib. On the whole, looks good to me.
console-subscriber/src/aggregator.rs
Outdated
task_stats: IdData::default(),
resources: IdData {
    data: HashMap::<span::Id, (Resource, bool)>::new(),
},
Can each of these just be `IdData::default()`? It would make these lines less busy.
console-subscriber/src/aggregator.rs
Outdated
@@ -543,6 +787,21 @@ impl<T> TaskData<T> {
    fn get(&self, id: &span::Id) -> Option<&T> {
        self.data.get(id).map(|(data, _)| data)
    }

    fn as_proto(&mut self, updated_only: bool) -> HashMap<u64, T::Result>
I'd suggest not using a boolean argument; when I was reading the usage elsewhere in the file, it's hard to know what `tasks.as_proto(false)` means. Perhaps an enum would make it clearer, like `enum Include { All, UpdatedOnly }` or something.

Also, since this creates a new hashmap, I'd suggest the name `fn to_proto` instead of `as_proto`. The `as` prefix sounds like it's getting a reference and thus would be cheap.
seems clippy is not happy about the `to_proto` with `&mut self`:
error: methods with the following characteristics: (`to_*` and `self` type is not `Copy`) usually take `self` by reference
--> console-subscriber/src/aggregator.rs:823:17
|
823 | fn to_proto(&mut self, include: Include) -> HashMap<u64, T::Result>
| ^^^^^^^^^
|
= note: `-D clippy::wrong-self-convention` implied by `-D warnings`
= help: consider choosing a less ambiguous name
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#wrong_self_convention
console-subscriber/src/lib.rs
Outdated
} else if meta.name() == "runtime.resource" { | ||
self.resource_callsites.insert(meta); | ||
} else if meta.name() == "runtime.async_op" { | ||
self.async_op_callsites.insert(meta); | ||
} else if meta.target() == "tokio::resource::op" { | ||
self.resource_op_callsites.insert(meta); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there likely more than 32 ops in Tokio? That doesn't sound crazy... Perhaps Callsites
could get a const generic for the size of the array?
console-subscriber/src/visitors.rs
Outdated
fn record_str(&mut self, field: &tracing_core::Field, value: &str) {
    let extract_partial_attr = || -> Option<MatchPart> {
        let mut parts = field.name().split('_');
ideally, we would want to use `.` instead of `_`, but the tracing macro had some problem and did not allow me to emit events that have fields that are both valid and invalid rust identifiers (the invalid ones being quoted `""`)
sorry, can you provide details on the issues with the tracing macro? that seems like a bug we should fix...
Will put together an example shortly
@hawkw did you want to take a look? It's not required, I'm happy to merge this, I think the direction is largely right, and we can address things in future patches.

I do really want to finish giving this a review before it's merged. Sorry, I've been busy; will try to look at it today. If I don't get through it by EOD we can just merge and roll forward.
okay, sorry this took me such a long time to review --- there's a lot of code in this change, so I wanted to read over it thoroughly.
i commented on a bunch of minor style nits. most of those are not hard blockers, but since this is such a large change, it would be nice to simplify the new code where possible. but, feel free to ignore those suggestions, or address them in follow-up branches when this lands.
the main things i want to address before this merges are:
- i had a number of questions about the new proto types, and i want to make sure i completely understand them before this merges --- either through talking about them in the PR discussion or (preferably) by adding new comments to the proto definitions.
- the way numeric values associated with resources are sent over the wire as diffs feels unnecessarily complex (provided that i'm understanding it correctly?). can we simplify this at all? i understand that the events being recorded don't have a way of tracking the previous data and summing them, but it would be better, IMO, not to include this in the wire proto. instead, can we make the aggregator handle adding/subtracting/replacing tracked attribute values and only send the current value on the wire? that would be more consistent with the way we record all other numeric values.
- in general, anything we can simplify or anywhere we can reuse existing code (such as for field names) would be great.
console-subscriber/src/aggregator.rs
Outdated
first_poll: Option<SystemTime>,
last_poll_started: Option<SystemTime>,
last_poll_ended: Option<SystemTime>,
busy_time: Duration,
}

// Represent static data for resources
-// Represent static data for resources
+/// Represents static data for resources
console-subscriber/src/aggregator.rs
Outdated
}
struct PollStats {
nit: prefer whitespace between definitions

}

struct PollStats {
console-subscriber/src/aggregator.rs
Outdated
current_polls: u64, | ||
created_at: Option<SystemTime>, | ||
polls: u64, |
what is the difference between `current_polls` and `polls`? I would probably add a comment to this, since it's not immediately obvious
console-subscriber/src/aggregator.rs
Outdated
@@ -1,7 +1,10 @@ | |||
use crate::WatchRequest; | |||
use crate::{AttributeUpdate, WatchRequest}; |
this is an unimportant nitpick, but...wow, there's a lot of code in this file now. I might want to consider breaking some of this up into separate modules. but, we can do that in a follow-up branch.
console-subscriber/src/aggregator.rs
Outdated
let attributes = self
    .attributes
    .values()
    .cloned()
nit/tioli: i think we should be able to remove the `cloned()` here, since `to_proto` takes a reference:

-    .cloned()
console-subscriber/src/aggregator.rs
Outdated
fn serialize_histogram(histogram: &Histogram<u64>) -> Result<Vec<u8>, V2SerializeError> {
    let mut serializer = V2Serializer::new();
    let mut buf = Vec::new();
    serializer.serialize(histogram, &mut buf)?;
    Ok(buf)
}

fn total_time(created_at: Option<&SystemTime>, closed_at: Option<&SystemTime>) -> Option<Duration> {
nit/tioli: `SystemTime` implements `Copy`, so it's probably more important to pass these by value, rather than by reference:

-fn total_time(created_at: Option<&SystemTime>, closed_at: Option<&SystemTime>) -> Option<Duration> {
+fn total_time(created_at: Option<SystemTime>, closed_at: Option<SystemTime>) -> Option<Duration> {
console-subscriber/src/aggregator.rs
Outdated
fn serialize_histogram(histogram: &Histogram<u64>) -> Result<Vec<u8>, V2SerializeError> {
    let mut serializer = V2Serializer::new();
    let mut buf = Vec::new();
    serializer.serialize(histogram, &mut buf)?;
    Ok(buf)
}

fn total_time(created_at: Option<&SystemTime>, closed_at: Option<&SystemTime>) -> Option<Duration> {
    closed_at.and_then(|end| created_at.and_then(|start| end.duration_since(*start).ok()))
nit, take it or leave it: this might be clearer, what do you think?

-closed_at.and_then(|end| created_at.and_then(|start| end.duration_since(*start).ok()))
+let end = closed_at?;
+let start = created_at?;
+end.duration_since(start).ok()
console-subscriber/src/aggregator.rs
Outdated
tasks.len = entities_len_1,
stats.dropped = dropped_stats,
stats.tasks = stats_len_1,
"dropped closed entities"
nit/tioli: should maybe also be

-"dropped closed entities"
+"dropped closed {}",
+std::any::type_name::<R>(),
console-subscriber/src/aggregator.rs
Outdated
tracing::trace!(
    entities.len = entities_len_1,
    stats.len = stats_len_1,
    "no closed entities were droppable"
"no closed entities were droppable" | |
"no closed {} were droppable", | |
std::any::type_name::<T>(), |
alternatively, we could put something like this at the top of the method:
let _span = tracing::debug_span!(
    "drop_closed",
    entity = %std::any::type_name::<T>(),
    stats = %std::any::type_name::<R>(),
).entered();
and then not have to add the type names to the individual trace events.
up to you!
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
This is looking very close to ready. I had some additional comments.
The main things I think are important to address before merging this are:
- my comment about potentially simplifying the way `StateUpdate` attributes are recorded as `tracing` fields
- using targets rather than event names to recognize events (since `tracing` doesn't currently allow overriding the name of an event)
The remaining suggestions are mostly style nits or otherwise not blockers.
@@ -188,3 +189,31 @@ impl From<&dyn std::fmt::Debug> for field::Value {
        field::Value::DebugVal(format!("{:?}", val))
    }
}

#[allow(clippy::derive_hash_xor_eq)]
Ah, this is required because `prost` generates a struct with a `PartialEq` impl, but we can't add `derive(Hash)` to that struct. It might be worth adding a comment explaining why we allow this lint:

+// Clippy warns when a type derives `PartialEq` but has a manual `Hash` impl,
+// or vice versa. However, this is unavoidable here, because `prost` generates
+// a struct with `#[derive(PartialEq)]`, but we cannot add `#[derive(Hash)]` to
+// the generated code.
 #[allow(clippy::derive_hash_xor_eq)]
console-subscriber/Cargo.toml
Outdated
@@ -25,4 +24,4 @@
[dev-dependencies]

tokio = { version = "^1.7", features = ["full", "rt-multi-thread"]}
futures = "0.3"
nit: unrelated trailing whitespace change?
impl<T> Default for IdData<T> {
    fn default() -> Self {
        IdData {
            data: ShrinkMap::<Id, (T, bool)>::new(),
        }
    }
}
style nit: in general, we tend to put type definitions above method implementations...can we move the `Default` impl below the `// === impl IdData ===` comment on line 25?
/// *New* PollOp events that have occurred since the last update
///
/// This is emptied on every state update.
new_poll_ops: ShrinkVec<proto::resources::PollOp>,
because this is emptied every time we send an update, this really doesn't need to be a `ShrinkMap`. every time we send an update, we will replace the vector with a new empty `Vec`, so it doesn't grow constantly over the entire lifetime of the program.

This is why `new_metadata` is a regular `Vec` rather than a `ShrinkMap`.
/// This is emptied on every state update.
new_poll_ops: ShrinkVec<proto::resources::PollOp>,

task_ids: Ids,
style nit: not a blocker, but I have a slight preference for grouping this with the other task related fields, so that the struct is ordered like
Aggregator {
...
tasks,
task_stats,
task_ids,
resources,
resource_stats,
async_ops,
async_op_stats,
...
}
Yes, sorry I missed renaming this. I will rename this to just `ids`, because this is now used not just for task ids but also for resource and async op ids
Personally, I would probably give those separate lists of IDs, but we can figure out whether we want to do that or not later...
console-subscriber/src/visitors.rs
Outdated
/// tracing::trace!(
///     target: "tokio::resource::state_update",
///     duration = "attribute_name",
///     value = 10,
///     unit = "ms",
///     op = "ovr",
/// );
The way that state updates are recorded currently feels a little overly complex to me. We are looking for four separate fields named `attribute_name`, `value`, `op`, and `unit`, but `tracing` already has a mechanism for representing key-value pairs. Why not simply look for fields named `unit` and `op`, and then treat any remaining field as the value, and use its name as the name of the attribute?

We could potentially even do something a bit fancier to allow a single event to update multiple resource states, if we looked for clusters of fields named `<NAME>`, `<NAME>.unit`, and `<NAME>.op`? That way, we could record one event like

tracing::trace!(
    target: "runtime::resource::state_update",
    read = 256, read.unit = "bytes", read.op = "add",
    to_write = 1024, to_write.unit = "bytes", to_write.op = "sub",
);

or similar?

This might be slightly overengineered, but it seems like it would be nice to reuse the existing mechanism in `tracing`, and in the wire proto, for representing field names. In particular, if we do it this way, we can just use the existing proto format for representing field names as metadata indices, instead of having to add a new concept of "attributes", which are like tracing fields but also not the same...
console-subscriber/src/visitors.rs
Outdated
fn record_bool(&mut self, field: &field::Field, value: bool) {
    if field.name() == Self::OP_STATE_FIELD_TYPE_VALUE {
        self.val = Some(PbFieldValue::BoolVal(value));
    }
}
if the value is a bool, we should assume the `op` is "overwrite", correct? since "add" and "sub" don't really make sense for booleans. perhaps we should only require an "op" field when the value is numeric? i notice we already allow recording state updates without `unit` fields...
console-subscriber/src/visitors.rs
Outdated
} | ||
|
||
impl Visit for StateUpdateVisitor { | ||
fn record_debug(&mut self, _: &field::Field, _: &dyn std::fmt::Debug) {} |
should we also allow recording `Debug` fields for state updates? I can imagine a resource wanting to include some random bit of internal data as part of its tracked state. if we support `Debug` here we should probably assume that there's no `unit` and that the `op` is always "overwrite", the way we do for bool values...
match stats.attributes.get_mut(&upd_key) {
    Some(attr) => update_attribute(attr, update),
    None => {
        stats.attributes.insert(upd_key, update.into());
    }
style nit: i think it might be slightly nicer if we did this using the `HashMap::entry` API:

stats.attributes
    .entry(upd_key)
    .and_modify(|attr| update_attribute(attr, update))
    .or_insert_with(|| update.into());

This has the advantage of only doing a single hashmap lookup, instead of two (one for `get_mut` and a second one for `insert` if the key is not present).
Seems that requires cloning the `update` as it is moved into the `and_modify` closure.
Okay, I believe we could alternatively still use a `match` with the entry API, like this:
match stats.attributes.entry(upd_key) {
    Entry::Occupied(ref mut attr) => {
        update_attribute(attr.get_mut(), update);
    }
    Entry::Vacant(attr) => {
        attr.insert(update.into());
    }
}
This still avoids performing a second lookup.
Obviously, not a blocker!
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
okay, this looks good to me, I think we can address the remaining things in subsequent branches.
thanks a lot for all the time you've spent on this, I know it was a big change. :)
console-subscriber/src/visitors.rs
Outdated
fn record_debug(&mut self, field: &field::Field, value: &dyn std::fmt::Debug) {
    self.field = Some(proto::Field {
        name: Some(field.name().into()),
        value: Some(value.into()),
        metadata_id: Some(self.meta_id.clone()),
    });
}

fn record_i64(&mut self, field: &field::Field, value: i64) {
    if field.name() == Self::OP_STATE_FIELD_TYPE_VALUE {
        self.val = Some(PbFieldValue::I64Val(value));
    }
    self.field = Some(proto::Field {
        name: Some(field.name().into()),
        value: Some(value.into()),
        metadata_id: Some(self.meta_id.clone()),
    });
}

fn record_u64(&mut self, field: &field::Field, value: u64) {
    if field.name() == Self::OP_STATE_FIELD_TYPE_VALUE {
        self.val = Some(PbFieldValue::U64Val(value));
    }
    self.field = Some(proto::Field {
        name: Some(field.name().into()),
        value: Some(value.into()),
        metadata_id: Some(self.meta_id.clone()),
    });
}

fn record_bool(&mut self, field: &field::Field, value: bool) {
    if field.name() == Self::OP_STATE_FIELD_TYPE_VALUE {
        self.val = Some(PbFieldValue::BoolVal(value));
    }
    self.field = Some(proto::Field {
        name: Some(field.name().into()),
        value: Some(value.into()),
        metadata_id: Some(self.meta_id.clone()),
    });
these should probably all check that the field doesn't end with the `unit` or `op` suffixes?
Good point, just pushed an update
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
Based on some [earlier feedback][1], this PR changes the proto definition so readiness of a poll op is represented via a bool. The relevant changes have been made to: tokio-rs/tokio#4072 [1]: #77 (comment) Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
This branch instruments the `Sleep` resource to allow the tokio-console to consume data about resources usage. The corresponding console branch is here: tokio-rs/console#77 Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
The current logic present in `IdData::drop_closed` marks an item (task, resource, and async op stats) to be dropped in the case that the item **is** dirty and there **are** watchers: `(dirty && has_watchers)`. This causes a case where if an item is first received and then completes in between the aggregator push cycle, it will be discarded immediately and never sent. This logic has been in place since the concepts of watchers and dirty items were introduced in #77. However, since an item that is created and then dropped within a single update cycle isn't likely to be missed in the UI, it may never have been noticed. Instead, the logic should be to **retain** an item if **any** of the following is true:
* there are watchers and the item is dirty: `(dirty && has_watchers)`
* the item has been dropped for less time than the retention period: `dropped_for <= retention`
This PR adds the pieces needed to get resource instrumentation data
into the console. This includes:
The UI part will come as a follow-up PR. This branch uses a patched tokio that emits these tracing spans and events for the Sleep resource. You can look at the raw data by running:

cargo run --example app
cargo run --example dump
The information piped through includes:
Signed-off-by: Zahari Dichev zaharidichev@gmail.com