[WIP] Add stats to tasks #323

yosiat · 2016-03-12T17:49:31Z

See issue - #321

So far I have implemented the input nodes. I am waiting for feedback on this and for a suggestion on how to implement metrics fetching for the outputs.

nathanielc · 2016-03-14T16:05:46Z

@yosiat This is great work, thanks!

A few administrative things: can you sign the CLA and add an entry to the CHANGELOG about this change? Thanks.

I like the structure that you have added but I think it would be easy and valuable to expose more stats.

First, what stats are you after?

I see two paths we could go down:

- Provide just summary stats about the task, similar to the throughput stat that already exists.
- Expose all stats from all nodes so that the DOT data does not need to be parsed.

We can even do both, and I think we should. I'll comment directly in the code where I think we could update what you have already.

nathanielc · 2016-03-14T16:08:46Z

task.go

+	// Input nodes and their emitted points
+	// where input nodes - are batch/stream node
+	// for example: stream1 -> 12
+	Inputs map[string]int64


We should update the struct to have both summary stats for the task and a map of all stats for all the nodes in the task.

Maybe a structure like:

type ExecutionStats struct {
    // stat name -> value, i.e. throughput -> 4.3 points/sec
    TaskStats map[string]interface{}
    // node name -> stat name -> value, i.e. influxdb_out -> point_written -> 1000
    NodeStats map[string]map[string]interface{}
}

Then use the statsMap that each node has to populate the node stats maps.

Thoughts?

yosiat · 2016-03-14T17:41:36Z

@nathanielc Thanks!

I already signed the CLA (https://influxdb.com/community/cla.html) and I will add an entry to the change log once I finish the output nodes as well.

About the structure of the stats: I will change it, but first I want to clarify the reason for this API. I want to expose general stats, "stats at a glance", for all tasks. I don't want to extract all the data from the DOT (for example, how many points each node processed and how much time it took).

I think it should contain:

Task stats
- Throughput of the task: currently the throughput covers all of Kapacitor; I want to know how many points the whole pipeline gets (only those selected by the measurement, db & rp)
- Filtered points count: how many points we filtered
- Total points we processed
Input and Output stats: this is the "glance" data of the nodes. I want to know how many points I processed and how many I output (alerts/written to influx), and this is why I did a direct mapping from node name to its output value.

But in the overall sense I agree this structure is more acceptable:

type ExecutionStats struct {
    TaskStats map[string]interface{}
    NodeStats struct {
        Input  map[string]map[string]interface{}
        Output map[string]map[string]interface{}
    }
}

I changed NodeStats to be a struct that contains only Input and Output, to make clear that for stats on all nodes there is the DOT graph, while general data (like input & output) belongs here.

About the values: why is this interface{} and not float? I think these metrics should be raw values like "789" and not "789 points/s"; the unit should be noted in the stat name, for example "PointsPerSecond" => 789.

Aside from the structure, I don't know how to implement the output node stats (as I wrote in the issue). Currently I plan to fetch all output nodes by a predicate: a node that has parents and doesn't have child nodes. But I don't know how to expose the output stats:

Should it be a function that exposes the "statsMap", or a map[string]float64?
And where should this function live: on "Node", or on another interface for "OutputNodes"?
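
The predicate described above (a node with parents and no children) could be sketched as follows. This is only an illustration: the Node type, the link helper and the outputNodes function are hypothetical stand-ins and are not Kapacitor's actual types or API.

```go
package main

import "fmt"

// Node is a hypothetical, simplified stand-in for a pipeline node.
// It is NOT Kapacitor's real Node type.
type Node struct {
	Name     string
	Parents  []*Node
	Children []*Node
}

// link connects parent -> child in both directions (hypothetical helper).
func link(parent, child *Node) {
	parent.Children = append(parent.Children, child)
	child.Parents = append(child.Parents, parent)
}

// outputNodes returns every node that has at least one parent and no children.
func outputNodes(nodes []*Node) []*Node {
	var outs []*Node
	for _, n := range nodes {
		if len(n.Parents) > 0 && len(n.Children) == 0 {
			outs = append(outs, n)
		}
	}
	return outs
}

func main() {
	src := &Node{Name: "srcstream0"}
	stream := &Node{Name: "stream1"}
	alert := &Node{Name: "alert2"}
	link(src, stream)
	link(stream, alert)

	for _, n := range outputNodes([]*Node{src, stream, alert}) {
		fmt.Println(n.Name)
	}
}
```

With the three-node chain in main, only the last node satisfies the predicate. A source node with no children would not count as an output under this rule, because it has no parents.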

yosiat · 2016-03-16T18:59:55Z

@nathanielc any comments? Do you want me to clarify my answer? I am waiting for your response before proceeding.

nathanielc · 2016-03-16T19:26:44Z

@yosiat Sorry for the delay.

I like the plan, stats at a glance is a good approach. I am a little confused about the Input and Output stats. Are input stats just counts of how many points a node has received? If so, why a map if it's just a single count per node? Same question for output stats.

About the values: why is this interface{} and not float? I think these metrics should be raw values like "789" and not "789 points/s"; the unit should be noted in the stat name, for example "PointsPerSecond" => 789.

Agreed. The interface is there so you can use int64 or float64, but each value should always be one or the other.

yosiat · 2016-03-16T19:38:31Z

@nathanielc it's ok :)
The "inputs" and "outputs" are plural because I want to handle edge cases where a user might write something like this:

var a = stream.from('a')
var b = stream.from('b')

a.join(b)
  // do some alerting logic

And the output might be both creating an alert and writing to influx.

What do you think about the output nodes implementation? Which do you prefer: a specific function or exposing the statsMap?

yosiat · 2016-03-18T14:25:20Z

Changed the ExecutionStats structure, waiting for @nathanielc to answer on how to implement the output nodes.

nathanielc · 2016-03-18T15:33:13Z

@yosiat I think simpler is better here, since we are doing stats at a glance.

// Stats about the execution of a task
type ExecutionStats struct {
    // Stats global to the task
    // throughput etc...
    TaskStats map[string]interface{}
    // Specific node stats,
    // Emitted, Collected counts plus anything in statsMap.
    NodeStats map[string]interface{}
}

NodeStats will be populated with whatever is in statsMap, using statsMap.Do, and then we add in the collected and emitted stats using node.collectedCount and node.emittedCount.

Populating TaskStats will be a bit more work since it doesn't already have a framework in place. We should probably add a statsMap to ExecutingTask too, if it makes sense. Otherwise we can populate TaskStats by aggregating other stats from nodes.

This should get all the stats we were looking for.
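
A minimal sketch of that population step, assuming each node's statsMap is a standard library *expvar.Map. The sketchNode type, collectStats and demoNodes are hypothetical names made up for illustration, and Kapacitor's own expvar types may differ from the standard ones used here. The type switch keeps values numeric instead of falling back to their string form.

```go
package main

import (
	"expvar"
	"fmt"
)

// ExecutionStats mirrors the structure proposed in this thread.
type ExecutionStats struct {
	TaskStats map[string]interface{}
	NodeStats map[string]map[string]interface{}
}

// sketchNode is a hypothetical stand-in for a Kapacitor node.
type sketchNode struct {
	name      string
	statsMap  *expvar.Map
	collected int64
	emitted   int64
}

// collectStats copies every entry of each node's statsMap into NodeStats
// and then adds the collected and emitted counts.
func collectStats(nodes []sketchNode) ExecutionStats {
	es := ExecutionStats{
		TaskStats: map[string]interface{}{}, // left empty in this sketch
		NodeStats: map[string]map[string]interface{}{},
	}
	for _, n := range nodes {
		stats := map[string]interface{}{}
		n.statsMap.Do(func(kv expvar.KeyValue) {
			switch v := kv.Value.(type) {
			case *expvar.Int:
				stats[kv.Key] = v.Value() // int64
			case *expvar.Float:
				stats[kv.Key] = v.Value() // float64
			default:
				stats[kv.Key] = v.String() // fallback: string form
			}
		})
		stats["collected"] = n.collected
		stats["emitted"] = n.emitted
		es.NodeStats[n.name] = stats
	}
	return es
}

// demoNodes builds one sample node with made-up values.
func demoNodes() []sketchNode {
	m := new(expvar.Map).Init()
	m.Add("alerts_triggered", 3)
	m.AddFloat("avg_exec_time_ns", 1500.5)
	return []sketchNode{{name: "alert2", statsMap: m, collected: 10, emitted: 2}}
}

func main() {
	es := collectStats(demoNodes())
	fmt.Println(es.NodeStats["alert2"])
}
```

TaskStats is deliberately left empty here, since the thread does not settle how task-level values such as throughput are computed.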

yosiat · 2016-03-18T15:35:15Z

@nathanielc So you want ExecutionStats to contain all nodes? If so, what is the difference between this and the DOT output?

nathanielc · 2016-03-18T15:39:18Z

@yosiat It's exposed via HTTP as JSON, so you do not have to parse DOT, which is not a good method for conveying numerical data.
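
To illustrate the point about JSON, here is a small sketch that encodes an ExecutionStats value with the standard encoding/json package. The sample values and the encode and sampleStats helpers are made up for illustration, and the HTTP handler that would serve the result is omitted.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ExecutionStats mirrors the structure proposed in this thread.
type ExecutionStats struct {
	TaskStats map[string]interface{}
	NodeStats map[string]map[string]interface{}
}

// sampleStats returns made-up values for illustration only.
func sampleStats() ExecutionStats {
	return ExecutionStats{
		TaskStats: map[string]interface{}{"throughput": 4.3},
		NodeStats: map[string]map[string]interface{}{
			"alert2": {"collected": int64(10), "emitted": int64(2)},
		},
	}
}

// encode marshals the stats to a JSON string.
func encode(es ExecutionStats) (string, error) {
	b, err := json.Marshal(es)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := encode(sampleStats())
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	fmt.Println(out)
}
```

Because the values are int64 and float64, they are encoded as JSON numbers, so a client can read them without any parsing beyond the JSON itself.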

yosiat · 2016-03-18T15:49:51Z

@nathanielc ok, I am doing this right now.
Just to make sure, you wrote:

NodeStats map[string]interface{}

instead of:

NodeStats map[string]map[string]interface{}

Was it by mistake?

nathanielc · 2016-03-18T15:50:48Z

@yosiat Yes, it should be NodeStats map[string]map[string]interface{}

yosiat · 2016-03-18T16:09:08Z

@nathanielc I finished.

Here is the final result:

{
    Name: "cpu_alert",
    Type: 0,
    DBRPs: [{
        db: "kapacitor_example",
        rp: "default"
    }],
    Enabled: true,
    Executing: true,
    ExecutionStats: {
        TaskStats: {
            throuput: 0
        },
        NodeStats: {
            alert2: {
                alerts_triggered: "0",
                avg_exec_time_ns: "0",
                collected: 0,
                emitted: 0
            },
            srcstream0: {
                avg_exec_time_ns: "0",
                collected: 0,
                emitted: 0
            },
            stream1: {
                avg_exec_time_ns: "0",
                collected: 0,
                emitted: 0
            }
        }
    }
}

Do you know how to change the statsMap representation to be numbers and not strings?

nathanielc · 2016-03-18T16:13:06Z

Do you know how to change the statsMap representation to be numbers and not strings?

Yes, have a look at https://github.com/influxdata/kapacitor/blob/master/global_stats.go#L132
Kapacitor has created a bunch of special expvar types just for that purpose.

Also a quick typo fix: throuput -> throughput. Looks great.

yosiat · 2016-03-18T16:22:33Z

@nathanielc done.
Do you want me to squash the commits?

nathanielc · 2016-03-18T16:24:42Z

@yosiat yes, please.

yosiat · 2016-03-18T16:58:33Z

@nathanielc done 👍

nathanielc · 2016-03-18T17:03:50Z

node.go

@@ -246,6 +250,30 @@ func (n *node) collectedCount() (count int64) {
 	return
 }

+func (n *node) emittedCount() (count int64) {
+	for _, out := range n.outs {
+		count += out.emittedCount()


This needs to read count += out.collectedCount(); see the comment in node.collectedCount.

Can you explain? I don't see any comment on node.collectedCount

Hmm, the comment is gone. What it said was:

node collected count is the sum of emitted counts of parent edges

The opposite is true for node emitted count:

node emitted count is the sum of collected counts of children edges

Edges can buffer so we care about the actual points this node has collected or emitted.
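
The relationship between node counts and edge counts can be sketched with simplified types. The edge and node structs below are hypothetical: their fields stand in for the counters behind the collectedCount and emittedCount methods in the diff, and the rest of Kapacitor's real types is omitted.

```go
package main

import "fmt"

// edge is a hypothetical buffered connection between two nodes.
type edge struct {
	collected int64 // points the edge has collected from its parent node
	emitted   int64 // points the edge has emitted to its child node
}

// node is a hypothetical, simplified node with parent and child edges.
type node struct {
	ins  []*edge // parent edges
	outs []*edge // child edges
}

// node collected count is the sum of emitted counts of parent edges
func (n *node) collectedCount() (count int64) {
	for _, in := range n.ins {
		count += in.emitted
	}
	return
}

// node emitted count is the sum of collected counts of children edges
func (n *node) emittedCount() (count int64) {
	for _, out := range n.outs {
		count += out.collected
	}
	return
}

func main() {
	// parent -> e -> child, with 3 points still buffered inside e.
	e := &edge{collected: 10, emitted: 7}
	parent := &node{outs: []*edge{e}}
	child := &node{ins: []*edge{e}}

	fmt.Println(parent.emittedCount(), child.collectedCount())
}
```

In this example the parent has emitted 10 points while the child has collected only 7, because 3 points are still sitting in the edge's buffer. Summing the edge's emitted count for the parent would under-report what the parent actually produced.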

nathanielc · 2016-03-18T17:50:36Z

node.go

@@ -246,6 +250,30 @@ func (n *node) collectedCount() (count int64) {
 	return
 }

+func (n *node) emittedCount() (count int64) {
+	for _, out := range n.outs {
+		count += out.collectedCount()


Could you add those comments in so it's not confusing for the next guy? Thanks

@nathanielc Just to make sure:
"node collected count is the sum of emitted counts of parent edges" goes above collectedCount, and
"node emitted count is the sum of collected counts of children edges" goes above emittedCount

?

Yep

nathanielc · 2016-03-18T19:55:29Z

@yosiat Thanks! merging...

Add stats to executing tasks

yosiat · 2016-03-18T19:58:42Z

@nathanielc I forgot to add a change log entry.

nathanielc · 2016-03-18T20:01:22Z

I added it, no worries.

yosiat changed the title ~~Add stats to tasks~~ [WIP] Add stats to tasks Mar 12, 2016

yosiat mentioned this pull request Mar 12, 2016

All tasks stats #321

Closed

nathanielc reviewed Mar 14, 2016

yosiat force-pushed the tasks-metrics branch from 10939b0 to 8fd727f Compare March 18, 2016 14:03

yosiat force-pushed the tasks-metrics branch from 4e3c4de to fd30eee Compare March 18, 2016 16:30

nathanielc reviewed Mar 18, 2016

yosiat force-pushed the tasks-metrics branch 2 times, most recently from ee88d5b to 19d0ae1 Compare March 18, 2016 17:48

nathanielc reviewed Mar 18, 2016

Adding ExecutionStats to task for finding node stats and task stats.

4a31c1b

yosiat force-pushed the tasks-metrics branch from 19d0ae1 to 4a31c1b Compare March 18, 2016 18:32

nathanielc pushed a commit that referenced this pull request Mar 18, 2016

Merge pull request #323 from yosiat/tasks-metrics

83906b4

Add stats to executing tasks

nathanielc merged commit 83906b4 into influxdata:master Mar 18, 2016

yosiat deleted the tasks-metrics branch March 18, 2016 19:57
