-
Notifications
You must be signed in to change notification settings - Fork 527
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
service metrics: add numeric outcome #5243
Comments
|
@axw yes, sorry, that is the idea. I don't really have a good suggestion for the name of the field though 😀 |
@dgieselaar I updated the original description to what we have planned now for |
@felixbarny @dgieselaar would either of you be able to help clarify the definition a bit please? What values do we populate, and when? It might help if we knew how the fields would be aggregated, for each of transaction events, transaction metrics, and service metrics. |
done, let me know if something is missing or unclear |
For service metrics, are we using the same field |
correct
That's also correct. This lets us re-use the same query across all three. That works because the But please validate that assumption by trying out the avg aggregation on |
@axw mentioned that since transaction metrics and service metrics are sharing the same |
@felixbarny A question about the Count in aggregate_metric_double:
Should value_count be number of success + failure, or success + failure + unknown? I assume it is the former (which is what you wrote) but just wanted to confirm. |
Correct, it's the former. See https://github.com/elastic/apm/blob/main/specs/agents/tracing-transactions.md#error-rate for more details. |
@dgieselaar IIANM the main reason for storing as a I would strongly prefer that service metrics remain a subset of transaction metrics so that we could later derive them in Elasticsearch by just dropping dimensions. So I'm +1 on storing an WDYT @felixbarny? |
I definitely agree with this statement. I just wasn't sure if it's feasible to use I played around with the mapping a bit and it seems like it is feasible and actually better than mapping it to byte. We can map it like this:
Click to expand example
As the metrics will be so similar in structure, we need to take care that the UI isn't accidentally querying both transaction metrics and service metrics at the same time. Did you already think about what's the best way to do that? Just a query for the |
Thanks @felixbarny, I think that looks good. The other thing that's been making me feel uncomfortable is the field name. It feels a bit too abstract, and implies that it is equivalent to Seems similar to
WDYT? Not my favourite name, but I think a little better...
I would prefer to split them out, but I would query on |
@axw @felixbarny we will likely use metricset.name but it'd be great if that can be a constant keyword. |
@felixbarny and I just had a chat about the metrics. We're thinking that we should skip adding them to transaction events and transaction metrics for now, and only add them to service metrics. That's because we would still need a fallback for old documents, and we don't yet have a solution for schema evolution. @dgieselaar does that sound reasonable? Regarding the name:
So for a concrete suggestion, I'll amend the above from |
@axw There's already a success_count (as well as a failure_count) under transaction which is an int: https://github.com/elastic/apm-server/blob/main/apmpackage/apm/data_stream/internal_metrics/fields/fields.yml#L132 Do we need a new name or a migration for the existing field? Am I missing anything here? |
@carsonip that field is only used by the existing service metrics implementation, which is technical preview and off by default. We can make a breaking change to it if we decide on that name. (If we don't decide on that name, then we would just remove the field definition.) |
Thanks for the explanation. I'll proceed to implement it with |
@carsonip yes, we would remove |
I just had a chat with @sqren about this and we concluded that it might be better to add the new fields consistently now, even if we don't have a good solution for schema evolution. The sooner we add the changes, the sooner we're able to deprecate the old schema. I think there is a potential path for schema evolution with runtime fields, however, |
Do we have enough information to do the same for transaction events and transaction metrics now? Will we also use an |
My understanding is that everybody was fine with using |
The name should be the same for transaction events, transaction metrics, and service metrics. So it would be
I think the name should be optimized for the usage on individual events. IMHO, it's fine if the name is a little weird in the context of a metric that represent multiple individual events. When you configure rollups in ES, it would also use the Sorry for the back-and-forth with the naming, Carson. For now, I'd say stick to |
Thanks @simitt @felixbarny. No problem. It should be straightforward to implement given that the rest of it is similar to #9791. |
As pointed out by @axw, |
During APM server weekly, @felixbarny and the team made the decision that we will be using |
Add `event.success_count` to transaction event and transaction metrics. Change `transaction.success_count` to `event.success_count` for service metrics. closes #5243
Add a numeric outcome to
byte
)byte
as it has anevent.outcome
dimension)aggregate_metric_double
, as it has noevent.outcome
dimension).All fields should be mapped as doc-value only fields.
The concrete field name needs to be defined. The working title is
event.outcome_numeric
.This will help with reducing dimensions while keeping the structure of transaction metrics and service metrics similar.
Mapping of
event.outcome
toevent.outcome_numeric
event.outcome
event.outcome_numeric
unknown
null
failure
success
For the
aggregate_metric_double
, the following sub-fields should be populated:sum
: the number of success eventsvalue_count
: the number of success + error eventsWe can then simply do an average aggregation over the
event.outcome_numeric
to get the success rate. We need to validate that the avg aggregation works correctly on both thebyte
andaggregate_metric_double
mapped fields.Original request:
Right now we use a terms aggregation or multiple filter aggregations on event.outcome to calculate transaction error rate. If we store failure as 0 and success in 1 in a single field (and not populate it for unknown), we can get the percentage by doing an avg aggregation on that single field. This would simplify our queries and the post processing that we are doing, will probably be a small performance improvement, and would make it easier for users to build custom dashboards.
The text was updated successfully, but these errors were encountered: