How to count alert event? #313

Closed
MartinSbs opened this issue Mar 11, 2016 · 2 comments

@MartinSbs

Good morning,
I need some help with a TICKscript.
I generate critical and warning alerts, and I want to count how many warning and critical alerts I get for each element I have.

{"id":"cpu:cpu=cpu2,","message":"cpu=cpu2, status = OK , value = 94.81037924134073 "}
{"id":"cpu:cpu=cpu3,","message":"cpu=cpu3, status = CRITICAL , value = 8.40000000020722 "}
{"id":"cpu:cpu=cpu-total,","message":"cpu=cpu-total, status = OK , value = 75.0125439038205 "}
{"id":"cpu:cpu=cpu3,","message":"cpu=cpu3, status = OK , value = 92.00000000006985 "}

For each cpu, I want to count how often it is critical and how often it is warning.

First I tried to use .message to generate a JSON message, but I can't use it because of the quotes inside .message(' ......')

What I want to generate with the TICKscript:

[root@localhost maintain]# cat testJson.json | jq '.message '
{
"value": "30.522088353723465",
"status": "CRITICAL",
"cpu": "cpu=cpu2"
}
{
"value": " 57.17131474106242 ",
"status": "WARNING",
"cpu": " cpu=cpu1"
}
{
"value": "65.73286643331711",
"status": "WARNING",
"cpu": "cpu=cpu-total"
}

And what I actually generate:

[root@localhost maintain]# cat alerts.json | jq '.message '
"{"cpu" : "cpu=cpu0," , "state" : "CRITICAL" , "value" : "43.42629482066862" }"
"{"cpu" : "cpu=cpu3," , "state" : "CRITICAL" , "value" : "45.40000000037253" }"
"{"cpu" : "cpu=cpu-total," , "state" : "WARNING" , "value" : "65.26946107789301" }"
"{"cpu" : "cpu=cpu0," , "state" : "OK" , "value" : "82.76553106188932" }"
"{"cpu" : "cpu=cpu3," , "state" : "WARNING" , "value" : "65.73146292595105" }"
"{"cpu" : "cpu=cpu-total," , "state" : "OK" , "value" : "78.8972431074465" }"
"{"cpu" : "cpu=cpu3," , "state" : "OK" , "value" : "97.1999999997206" }"
"{"cpu" : "cpu=cpu2," , "state" : "WARNING" , "value" : "69.1382765528853" }"
"{"cpu" : "cpu=cpu2," , "state" : "OK" , "value" : "94.01197604814769" }"
"{"cpu" : "cpu=cpu0," , "state" : "WARNING" , "value" : "63.27345309395036" }"
"{"cpu" : "cpu=cpu0," , "state" : "OK" , "value" : "96.16935483865645" }"
"{"cpu" : "cpu=cpu1," , "state" : "WARNING" , "value" : "68.80000000004657" }"
"{"cpu" : "cpu=cpu1," , "state" : "OK" , "value" : "95.59118236472362" }"

My alerts.tick script:

stream
    .from().measurement('cpu')
        .groupBy('cpu')

    .alert()
        //.id('{{.Group}}')
        .message('{"cpu" : "{{ .Group }}" , "state" : "{{ .Level }}" , "value" : "{{ index .Fields "usage_idle"}}" }')

        // Compare values to fixed warning and critical thresholds
        .warn(lambda: ("usage_idle") < 70)
        .crit(lambda: ("usage_idle") < 50)

        .log('/home/maintain/alerts.log')
        .log('/home/maintain/alerts.json')

         // Send alerts to slack
        .slack()

Is this possible directly with .message or within the TICKscript, or do I need to use another component to count my alerts?
Thanks

@nathanielc
Contributor

@MartinSbs

Your message looks fine; it's just that the value of the message is a string, so you will need to interpret it as JSON as well. Since I see you are using jq, let me try to show you what I mean:

cat alerts.json | jq '.message' -r | jq .

The -r flag tells jq to use raw output, so you can then pass the message JSON into another jq for more processing.
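If jq is not at hand, the same two-pass decoding can be done in a short script. The following is a minimal Python sketch, not part of Kapacitor: it assumes each log line is a JSON object whose "message" field is itself a JSON string with "cpu" and "state" keys, as produced by the .message template above. The names count_alerts, make_line, and sample are hypothetical.

```python
import json
from collections import Counter


def count_alerts(lines):
    """Count non-OK alert events per (cpu, state) pair."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)                # first pass: the log line itself
        message = json.loads(event["message"])  # second pass: the message string
        if message["state"] != "OK":
            counts[(message["cpu"], message["state"])] += 1
    return counts


def make_line(cpu, state, value):
    """Build a hypothetical log line shaped like the alerts.json entries."""
    message = json.dumps({"cpu": cpu, "state": state, "value": value})
    return json.dumps({"id": "cpu:" + cpu, "message": message})


# Hypothetical sample data; in practice, iterate over open('alerts.json').
sample = [
    make_line("cpu=cpu0,", "CRITICAL", "43.42629482066862"),
    make_line("cpu=cpu0,", "OK", "82.76553106188932"),
    make_line("cpu=cpu0,", "WARNING", "63.27345309395036"),
    make_line("cpu=cpu3,", "WARNING", "65.73146292595105"),
]

for (cpu, state), n in sorted(count_alerts(sample).items()):
    print(cpu, state, n)
```

This counts events after the fact from the log file; it does not replace counting inside Kapacitor itself.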

Once issue #93 is implemented you will be able to chain nodes off the alert node directly to perform these operations within TICKscript itself.

Until #93 is implemented, you can do this with a little bit of extra work:

var warn_threshold = 70
var crit_threshold = 50

var data = stream
    .from().measurement('cpu')
        .groupBy('cpu')

// Do normal alerting
data.alert()
    .warn(lambda: "usage_idle" < warn_threshold)
    .crit(lambda: "usage_idle" < crit_threshold)
    // Log alerts
    .log('/home/maintain/alerts.log')
    // Send alerts to slack
    .slack()

// Count warn and crit events
var warn_count = data
    .where(lambda: "usage_idle" < warn_threshold)
    .mapReduce(influxql.count('usage_idle'))
var crit_count = data
    .where(lambda: "usage_idle" < crit_threshold)
    .mapReduce(influxql.count('usage_idle'))

// Do whatever you want with the counts.
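
One thing to keep in mind with the two where filters above: any point below the crit threshold is also below the warn threshold, so the warn count includes the crit points. The following Python sketch only illustrates that arithmetic on plain numbers; it is not Kapacitor code, and count_levels and the sample values (rounded cpu0 readings from the question) are assumptions made for illustration.

```python
WARN_THRESHOLD = 70
CRIT_THRESHOLD = 50


def count_levels(values, warn=WARN_THRESHOLD, crit=CRIT_THRESHOLD):
    """Mirror the two where filters: count points below each threshold."""
    below_warn = sum(1 for v in values if v < warn)
    below_crit = sum(1 for v in values if v < crit)
    return {
        "below_warn": below_warn,             # includes the crit points
        "below_crit": below_crit,
        "warn_only": below_warn - below_crit  # warn but not crit
    }


# Hypothetical usage_idle samples for a single cpu group.
usage_idle = [43.4, 82.8, 63.3, 96.2]
print(count_levels(usage_idle))
```

If warnings should exclude critical points, subtracting the two counts as in warn_only is one option.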

Once #93 is finished, it would look something like this:

var warn_threshold = 70
var crit_threshold = 50

var data = stream
    .from().measurement('cpu')
        .groupBy('cpu')
    .alert()
        .warn(lambda: "usage_idle" < warn_threshold)
        .crit(lambda: "usage_idle" < crit_threshold)
        // Log alerts
        .log('/home/maintain/alerts.log')
        // Send alerts to slack
        .slack()
    .groupBy('level')
    .mapReduce(influxql.count('value'))
    // Do whatever you want with the counts grouped by level

@MartinSbs
Author

Thanks, it's perfect!
