This repository has been archived by the owner on Oct 28, 2024. It is now read-only.

Stop storing derived metrics #183

Merged
merged 1 commit into elastic:master from the no-derived-metrics branch on Jul 30, 2020

Conversation

@axw (Member) commented Jul 30, 2020

We can derive them in Elasticsearch instead; delete the "numbers" package and use inline arithmetic.

Also fixed a bug with the "total events sent - success %" metric reported to stdout, which mistakenly used an errors count instead of an events count.

@apmmachine
💚 Build Succeeded



Build stats

  • Build Cause: [Pull request #183 opened]

  • Start Time: 2020-07-30T05:44:09.738+0000

  • Duration: 4 min 44 sec

Test stats 🧪

  • Failed: 0
  • Passed: 9
  • Skipped: 0
  • Total: 9

@axw axw merged commit 799c1df into elastic:master Jul 30, 2020
@axw axw deleted the no-derived-metrics branch July 30, 2020 06:03
@jalvz (Contributor) commented Aug 4, 2020

Thinking it through, I think I disagree with this change.
It is true that we can get all of that from Elasticsearch; that was actually considered initially. But given the (IMO negligible) complexity of indexing those fields and their storage cost, it pays off at query time.
A trivial _search query would give me all the data I want without having to craft those calculations in a query body (accounting for optional values, zero divisions, etc).
Now if I need to dig beyond the dashboards, I need to write arbitrarily complicated queries instead.
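To make this concrete, here is a minimal sketch, not taken from the hey-apm code base, of the kind of per-document query that becomes necessary once the derived fields are no longer stored. The index name, field names (report_id, events_sent, events_indexed), and report ID are all hypothetical placeholders; it recomputes an "index success %" in a script field while guarding against missing values and division by zero:

```go
// Hypothetical sketch: fetch one benchmark document and recompute a derived
// percentage in a script field. Index and field names are invented.
package main

import (
	"context"
	"fmt"
	"log"
	"strings"

	elasticsearch "github.com/elastic/go-elasticsearch/v7"
)

// _search body: filter to a single report and derive "index success %",
// handling absent fields and a zero denominator explicitly.
const body = `{
  "query": { "term": { "report_id": "example-report-id" } },
  "_source": ["events_sent", "events_indexed"],
  "script_fields": {
    "index_success_pct": {
      "script": {
        "lang": "painless",
        "source": "long sent = doc['events_sent'].size() > 0 ? doc['events_sent'].value : 0; long indexed = doc['events_indexed'].size() > 0 ? doc['events_indexed'].value : 0; return sent > 0 ? 100.0 * indexed / sent : 0;"
      }
    }
  }
}`

func main() {
	es, err := elasticsearch.NewDefaultClient()
	if err != nil {
		log.Fatal(err)
	}
	res, err := es.Search(
		es.Search.WithContext(context.Background()),
		es.Search.WithIndex("hey-apm"),
		es.Search.WithBody(strings.NewReader(body)),
	)
	if err != nil {
		log.Fatal(err)
	}
	defer res.Body.Close()
	fmt.Println(res.Status())
}
```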

@simitt (Contributor) commented Aug 4, 2020

I might not be the heaviest user of the hey-apm benchmarking, but I have barely used the derived metrics; when I did, I had to look up how they were calculated, as it wasn't necessarily clear to me what the difference/time unit etc. for the metrics was.

@jalvz could you share where/how you were using these values?

@jalvz (Contributor) commented Aug 4, 2020

Any time I queried the actual docs instead of using the dashboards. That is, every time I want granular data or am looking for something specific. The last time was during the recent spike caused by enabling the instrumentation.

@axw (Member, Author) commented Aug 4, 2020

A trivial _search query would give me all the data I want without having to craft those calculations in a query body (accounting for optional values, zero divisions, etc).
...
Any time I queried the actual docs instead of using the dashboards. That is, every time I want granular data or am looking for something specific. The last time was during the recent spike caused by enabling the instrumentation.

I've been thinking of these metrics as only being useful for comparisons, rather than as something to look at in isolation. Can you describe how you were using them in isolation? Or, if you were using them for comparison, where the dashboards fell short and looking at the raw docs helped.

Just trying to understand why we can't use dashboards + kuery bar.

@jalvz (Contributor) commented Aug 4, 2020

how you were using them in isolation?

If I have a single run from a PR I can't just use the dashboards because results are averaged. So I'd query individual fields from an individual document (I grab a reportId from the log to filter by) and then compare it with the latest average from the related dashboard.

why we can't use dashboards + kuery bar

I think we can do that the same as before this change. The difference is that now we can't query individual documents (or rather we can, but it is quite a bit more difficult).

I also don't understand what specific problem this is solving, besides doing the same thing in a different way (and with fewer lines of code, but lines of code are not a problem per se).

@axw (Member, Author) commented Aug 5, 2020

If I have a single run from a PR I can't just use the dashboards because results are averaged.
...
I think we can do that the same as before this change. The difference is that now we can't query individual documents (or rather we can, but it is quite a bit more difficult).

We can't do it the exact same way (which implies no change at all), but we can do this with a data table visualisation and scripted fields (until Lens supports bucket_script - elastic/kibana#67405). I've added one to the dashboard, let me know if this does what you need or what's missing.
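For context, a bucket_script pipeline aggregation is the aggregation-side way to recompute such a ratio per bucket, which is what the referenced Lens limitation (elastic/kibana#67405) is about. The sketch below uses invented index and field names, not the actual hey-apm mappings, and only illustrates the shape of such a query:

```go
// Hypothetical sketch: recompute "index success %" per report bucket with a
// bucket_script pipeline aggregation. Field names are invented.
package main

import "fmt"

// Body for POST /<index>/_search.
const derivedMetricAgg = `{
  "size": 0,
  "aggs": {
    "per_report": {
      "terms": { "field": "report_id" },
      "aggs": {
        "sent":    { "sum": { "field": "events_sent" } },
        "indexed": { "sum": { "field": "events_indexed" } },
        "index_success_pct": {
          "bucket_script": {
            "buckets_path": { "sent": "sent", "indexed": "indexed" },
            "script": "params.sent > 0 ? 100.0 * params.indexed / params.sent : 0"
          }
        }
      }
    }
  }
}`

func main() {
	fmt.Println(derivedMetricAgg)
}
```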

I also don't understand what specific problem this is solving, besides doing the same thing in a different way (and with fewer lines of code, but lines of code are not a problem per se).

There were a couple of things that set this off:

  • Non-idiomatic code, e.g. func Div(i1, i2 interface{}) *float64, and unnecessary code, e.g. func Sum(xs ...uint64) uint64 (see the sketch after this list)
  • Many metrics with no clear value, making it more difficult (for me) to identify the ones that provide the greatest signal
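As a hedged illustration of the first point, here is a short Go sketch of what replacing such helpers with inline arithmetic looks like; the function and field names are invented for illustration and are not taken from hey-apm:

```go
// Invented example: instead of a nil-tolerant numbers.Div(i1, i2 interface{}) *float64,
// compute the ratio inline with concrete types and an explicit zero guard.
package main

import "fmt"

func indexSuccessPct(eventsIndexed, eventsSent uint64) float64 {
	if eventsSent == 0 {
		return 0
	}
	return 100 * float64(eventsIndexed) / float64(eventsSent)
}

func main() {
	fmt.Printf("index success: %.2f%%\n", indexSuccessPct(950, 1000))
}
```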

If I knew you were using the derived metrics at the individual report level (code comments explaining why we're not using ES aggs would have helped) then I wouldn't have gone and removed them; I would have just cleaned up the way they were calculated.

Anyway, as above, please let me know what's missing. We can add them back in if really needed.

@jalvz (Contributor) commented Aug 5, 2020

we can do this with a data table visualisation and scripted fields

I think that will work - and if not, we will figure something else out. Thanks.

v1v added a commit to v1v/hey-apm that referenced this pull request Dec 17, 2021
* upstream/main: (37 commits)
  Update to assume use of data streams (elastic#198)
  JJBB: disable periodic trigger (elastic#195)
  es: only count workload events (elastic#193)
  Use specific docker compose context (elastic#191)
  define failure threshould from average instead of max (elastic#190)
  update go agent version (elastic#189)
  Add destination service to spans (elastic#188)
  worker: reinstate Events{Generated,Sent,Indexed} (elastic#187)
  Remove types, conv, strcoll packages (elastic#185)
  agent/tracer cleanup (elastic#184)
  Stop storing derived metrics (elastic#183)
  Wait for active output events before proceeding (elastic#181)
  worker: send spans in a separate goroutine (elastic#180)
  benchmark: drop duplicate benchmark (elastic#179)
  Refactoring, context propagation, graceful stop (elastic#177)
  Keep self instrumentation (elastic#174)
  Disable apm-server instrumentation (elastic#173)
  Upgrade go version and dependencies (elastic#172)
  Allow configuring multiple worker instances (elastic#171)
  Adds a nil check before querying libbeat metrics. (elastic#169)
  ...