Handle date histogram scaling for table vis and avg_buckets metric #11929
Conversation
Can one of the admins verify this patch?
Another possibility is to get rid of the scaling when using the table or avg_bucket metric. I don't know if the scaling exists because of ES issues or just because of browser rendering issues.
jenkins, test this
There is a test that is failing in master as well. It is the confirm_modal test. I've fixed the other ones.
jenkins, test this
@trevan I thought that is the current scenario? (that we don't do scaling for these?)
Also, if there is an issue with avg_bucket, I bet the same applies to the other bucket aggs (min, max, sum).
I just checked the bucket aggs: I set the aggregation to date histogram, the interval to millisecond, clicked play, and then in the spy panel I checked the request and I can see:
So the scaling does apply even for bucket aggs (without this PR)?
Also, in the data table the scaling seems to be applied correctly (before this PR)?
@ppisljar, I'm not talking about modifying the query to send a different bucket interval. I'm talking about scaling the results so that if I request bytes/second, I get bytes/second even if the bucket interval has changed to 30m. If you do a line chart where the interval has to be scaled, the resulting line will show the originally requested value (bytes/second instead of bytes/fixed_time_interval). But the table chart doesn't do that, nor do the pipeline aggregations. In #4646, someone requested to see the average bytes/second over a large timespan that causes the bucket size to change; instead of getting average bytes/second, you get average bytes/fixed_time_interval.
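For illustration, a minimal sketch of the normalization being described above (the intervals and counts here are hypothetical, not taken from the PR):

// Requested a 1-second interval, but Kibana widened it to 30 minutes (1800 s)
// to keep the number of buckets manageable.
const requestedIntervalMs = 1000;
const actualIntervalMs = 30 * 60 * 1000;
const scale = requestedIntervalMs / actualIntervalMs; // 1 / 1800

// A count of 90,000 hits in a 30-minute bucket is then displayed as 50 hits/second,
// so the value still answers the question that was actually asked.
const bucketCount = 90000;
const displayedValue = bucketCount * scale; // 50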
Thanks for the clarification @trevan, I will try to play with this today.
@@ -61,14 +63,17 @@ export function AggResponseTabifyProvider(Private, Notifier) {
}
break;
case 'metrics':
const value = agg.getValue(bucket);
let value = agg.getValue(bucket);
if (aggScale !== 1) {
I'd remove the if-statement.
I should probably put a comment here, but the if statement is there to prevent "scaling" non-numbers. Since only "count" and "sum" are allowed to scale, if you pick a different metric (such as top_hits) and don't have this if statement, then it could multiply a string by 1 and get a NaN. By only scaling if the scale has changed, it is guaranteed that the value to be scaled is a number.
Hope that makes sense :)
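For reference, a self-contained sketch of the guard being discussed, with the reasoning from the comment above spelled out inline (the scaleMetricValue wrapper is hypothetical; in the PR the logic lives directly in the tabify loop):

// aggScale is 1 unless every metric in the request is a count or sum,
// in which case the values are guaranteed to be numeric.
function scaleMetricValue(value, aggScale) {
  if (aggScale !== 1) {
    // Safe: non-numeric metrics such as top_hits never reach this branch,
    // so we never multiply a string and end up with NaN.
    return value * aggScale;
  }
  return value;
}

console.log(scaleMetricValue(90000, 1 / 1800)); // 50 (a scalable count)
console.log(scaleMetricValue('top hit doc', 1)); // returned unchanged, never NaN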
now it does :)
thanks @trevan, this is addressing a gnarly issue
Overall, I'm in favor of this change. It works around a limitation we introduced because we don't want to overload ES with too large of a query or Kibana with too large of a result. This is a technical limitation that isn't all that relevant to the end-user.
But it is somewhat of an obscure feature. And Kibana can already surprise users with all the normalization that goes on in the background; the auto-correct of the interval is one example. With this PR, Kibana actually tries to "correct" this after Kibana already "corrected" it, but only for aggregation types where we can reasonably do so (e.g. taking an average of counts is something we can do, but not an average of averages). Just typing this makes my head spin ;)
So when we go down this road, we need to communicate this clearly.
I think the easiest way to do this would be to make the label more explicit and have the denominator in the default title.
So the Kibana table's column title or y-axis label would be something like:
Overall Average of count per second
or whatever metric it is that Kibana has normalized.
I am not quite sure what you are referring to when talking about this happening in line charts/bar charts. Could you expand on this?
e.g. below is a before and after of a line chart doing an average count per second.
Given the changes in this PR, it'd also need some tests.
],
getValue: function (agg, bucket) {
  const customMetric = agg.params.customMetric;
  const scalingMetric = customMetric.type && (customMetric.type.name === 'count' || customMetric.type.name === 'sum');
This is dense. It needs an inline comment to explain why we will scale only for counts and sums.
I copied that line from https://github.com/elastic/kibana/blob/master/src/ui/public/agg_types/buckets/date_histogram.js#L116. I'm not sure why Kibana only scales for counts and sums so I just left it as is.
Could you extract this to a function? It will show the relationship.
I created an "isScalable" function on the metric type; count and sum return true.
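A rough, simplified sketch of what that could look like (the object shapes here are assumptions for illustration, not quotes from the PR):

// Simplified sketch: a metric agg type exposing an isScalable flag.
// Only count and sum opt in; everything else defaults to not scalable.
const countMetricType = {
  name: 'count',
  isScalable: function () {
    return true;
  }
};

// The dense condition from the diff can then become:
const customMetric = { type: countMetricType };
const scalingMetric = Boolean(customMetric.type && customMetric.type.isScalable());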
  const scalingMetric = customMetric.type && (customMetric.type.name === 'count' || customMetric.type.name === 'sum');
  let value = bucket[agg.id] && bucket[agg.id].value;
  if (value && scalingMetric) {
You probably can just remove the value from the check (?). This sort of works due to the falsy nature of 0, and 0 doesn't need scaling.
@thomasneirynck, here's a line chart showing how it works on an unmodified Kibana. If you look at the chart, the maximum it shows is 10, but if you look at the spy table at the bottom, it shows values that go over 10. That is because I asked for 1 minute buckets but Kibana requested 30 minute buckets. If I do this same visualization as a bar or area chart, it also does the conversion. But if I do a table visualization, it shows the value for the 30 minute buckets while saying that it is showing per minute (compare the column header to the actual data). I totally agree that the metric name for sibling aggregations that are using a date histogram bucket should show what timespan is being used. I think that is a separate issue from this, but I could try and fix it.
Thx @trevan, apologies for the delay on this one. I think this is a good addition to the table vis. @ppisljar, do you want to take a second look?
As for improving the labeling, OK, we can wait on that. It's a similar problem to #12816, so let's go through that one first before making any more changes to that.
I rebased against master.
Seems to work well; I really like that the spy panel values match the chart values when scaling happens.
LGTM
@thomasneirynck @trevan should we backport?
@ppisljar, I don't need it backported.
The line/area/bar vis handles time scaling when you select an interval that creates too many buckets. But the table vis and the new avg_metric with a date histogram don't.
I moved the scaling code to the tabify code to handle both the table vis and the line vis. I'm not sure how to handle two date histograms in a table that both cause scaling; in that case, the scaling multiplier would be wrong, since it is based on the main timespan. I'm not sure if it is worth handling, though.
For the avg_metric, I overrode getValue to check if scaling is required by the customBucket. I couldn't use the "metricScale" value on the customBucket the way the tabify code does, because that checks that all metrics are count or sum; since the avg_metric isn't count or sum, it will always cause it to be false. Instead, I check that the customMetric is count or sum and then add the scaling. This won't work for nested sibling metrics where the avg_metric is nested inside of a max or min bucket, but I think that would require a bit more work to get the scaling to take place, since you'd have to ignore the result from elasticsearch and almost recalculate the pipeline.
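Putting the excerpts from the diff together, the getValue override roughly takes the shape sketched below. The dateHistogramScaleFor helper is hypothetical and added only for illustration; how the PR actually derives the multiplier from the custom bucket is not shown in the excerpts above.

getValue: function (agg, bucket) {
  const customMetric = agg.params.customMetric;
  // Only count and sum can be meaningfully rescaled to the requested interval.
  const scalingMetric = customMetric.type &&
    (customMetric.type.name === 'count' || customMetric.type.name === 'sum');
  let value = bucket[agg.id] && bucket[agg.id].value;
  if (value && scalingMetric) {
    // Hypothetical helper: derive the multiplier from the sibling date histogram bucket.
    value *= dateHistogramScaleFor(agg.params.customBucket);
  }
  return value;
}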