-
Notifications
You must be signed in to change notification settings - Fork 492
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Snmp Trap service #929
Snmp Trap service #929
Conversation
d0c3519
to
49b2a5a
Compare
@nathanielc ready for first review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall the PR is organized well, I just have a few questions about the workflow and structure of traps. See my comments in-line.
// stream | ||
// |alert() | ||
// .snmp() | ||
// .trap('1.3.6.1.2.1.1.7', 'i', {{ Index field value }}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This example isn't valid template syntax. Can you update it to be accurate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed !
|
||
// List of email recipients. | ||
// tick:ignore | ||
TrapList [][]interface{} `tick:"Trap"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What information does a trap need?
Based on the configuration and the docs, it looks like traps have a these fields:
- Target IP
- Target Port
- OID
- Version
- Community
Anything else?
Maybe changing the syntax up a bit would make it more clear:
|alert()
.snmpTrap()
.ip('x.x.x.x')
.port(9863)
.oid('1.3.6.1.2.1.1.7')
.version('v3')
.community('asdf')
.snmpTrap()
.ip('x.x.x.x')
.port(2656)
.oid('1.3.6.1.2.1.1.7')
.version('v2')
.snmpTrap()
// defaults to config values if not specified
.oid('1.3.6.1.2.1.1.7')
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See last comment
// | ||
// Passing addresses to the `email` property directly or using the `email.to` property is the same. | ||
// tick:property | ||
func (h *SnmpHandler) Trap(trap ...interface{}) *SnmpHandler { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment needs to be updated to be SNMP related, its still the email comments.
Version string `toml:"version"` | ||
// Whether all alerts should automatically use stateChangesOnly mode. | ||
// Only applies if global is also set. | ||
StateChangesOnly bool `toml:"state-changes-only"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also add the Global
option like other handlers have? Or does that not work well in the context of SNMP alerts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see how this could be global. You need to add data to your snmp trap
|
||
var varBinds snmpgo.VarBinds | ||
for _, trap := range traps { | ||
fmt.Printf("\n\n\nREERRRR %s", trap) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this debug logging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done !
func (s *Service) Alert(traps [][]interface{}, level kapacitor.AlertLevel) error { | ||
address := s.targetIp + strconv.Itoa(s.targetPort) | ||
snmp, err := snmpgo.NewSNMP(snmpgo.SNMPArguments{ | ||
Version: snmpgo.V2c, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is there a version option in the config if it is being hardcoded here?
@@ -82,6 +82,10 @@ type TaskMaster struct { | |||
StateChangesOnly() bool | |||
Alert(channel, message string, level AlertLevel) error | |||
} | |||
SnmpService interface { | |||
StateChangesOnly() bool | |||
Alert(trap [][]interface{}, level AlertLevel) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do snmp traps support a message
concept? Meaning can they contain a short string of some short or do they only support the alert level? Also how do you know the context of the alert? For example, say you are triggering alerts on CPU usage per host, when the NMS receives the alert how does it know which host triggered the alert and that it is a CPU alert?
I am only marginally familiar with SNMP so if I am missing something obvious please enlighten me. Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See last comment
ee01b91
to
4a0e7b0
Compare
@nathanielc With all your comments, I understood that this service was not clear. So I renamed it to
The NMS will receive the trap and process it as ONE message with several fields. One field for one oid. I think those new names is more clear than the previous one. |
@nathanielc any news on that one ? |
OK so |
@nathanielc I didn't do that because the community is like a password, so I thought it's better to not expose it in TICK scripts |
I found a small issue in the trap sent, I will send an update today |
@nathanielc Issue fix with last commit: 28f0c16 |
@nathanielc this PR is ready |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add tests in the integrations
package?
You will find tests in there for the other alert handlers.
How can I test this locally? Is there a simple SNMP server I can stand up to test this works as expected?
// .data('1.3.6.1.4.1..1.7', 's', '{{ index .Fields "used_percent" }}' ) | ||
// | ||
// tick:property | ||
func (h *SnmpTrapHandler) Data(data ...interface{}) *SnmpTrapHandler { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these arguments always strings? And are there always three of them?
If so should the signature change to Data(oid, rawType, value string)
? Not sure what the names should be...
return errors.New("Version 3 not supported yet") | ||
default: | ||
return errors.New("Bad snmp version should be: '1', '2c' or '3'") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't add value to add a version
config option and then only support a single version. How hard is it to support the other versions? From what I saw it looked like the snmp package has support for various versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know, I can't test snmp v3 :/
// | ||
// tick:property | ||
func (h *SnmpTrapHandler) Data(data ...interface{}) *SnmpTrapHandler { | ||
// TODO check element validity |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can add a validate
method to the SnmpTrapHandler, and call if from the AlertNode validate method.
TrapOid string | ||
// List of trap data. | ||
// tick:ignore | ||
DataList [][]interface{} `tick:"Data"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the structure of the data? Is there a reason we need to use an interface{}
or can this be converted to a list of structs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello ! in fact, I'm just trying to get [][]string but I don't know how to get that
from alert.go
line 250
tpattrtmpl, err := text.New("trap").Parse(attr.(string))
How can I get string from this stuff ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nathanielc in fact, I think that the struct could be [][3]string
Example:
|alert()
.snmpTrap('MAINOID')
.data('SUBOID1', 's', '{{ .Message }}')
.data('SUBOID2', 'i', '{{if eq .Level "OK"}}0{{else}}{{if eq .Level "WARNING"}}3{{end}}{{end}}')
.data('SubOid3', 's', '{{ .Name }}')
Hello @nathanielc ! |
29306aa
to
a3c9193
Compare
@nathanielc I just added tests |
@titilambert Thanks! I am rounding the bend on a large refactor for the alert handlers system, ( I hope to have the majority of the work done by tomorrow 🙏 ). As a result I hope to take a look at this next week. It will take some work to refactor it a bit for the changes I am making , if you have time next week as well I can point you in the right direction, otherwise I can make the changes later. |
@nathanielc thanks for keeping a priority in this PR. Lets us know when the refactoring will be completed asap. we would absolutely need this in 1.1.2 if possible since we are also waiting for some bugs fixes from this release. |
@djsly @titilambert The Alert system work is basically complete at this point. I am going to start looking into this PR again. Can you give me push access to your fork so that I can make changes? See https://help.github.com/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork/ |
@nathanielc hello, Thanks for your reply ! |
@titilambert Thanks! Looks like it is working, I was able to rebase and push to your snmp branch. |
@titilambert @djsly I have most things migrated over to the new alerting system but I have a few questions for things that did not make sense to me as I dug around in the code.
For now I removed the logic around configuring the version since only one version is supported. When we need/add support for other versions we can add it back in. Once I am out of the weeds I'll push up my changes so you can take a look. |
@titilambert @djsly OK I have tests passing most of the time ;), but the bad news is the The issues I am running into only effect testing so they are not super critical but I we need to have stable tests. I can consistently get the TrapServer to panic when generating a request ID. Also the TrapServer type doesn't have a graceful shutdown (granted that is difficult to do with UDP) but the shutdown process is racy and causes data races like:
At this point we have a few options:
Again since I know you played around with other packages too I would like your input before moving forward. Thanks! |
I took a look around and the I also protected access to the snmp client since it is not designed for concurrent use either. At this point tests are passing locally without races. |
@nathanielc thanks for this work on the PR! Would there be anything you would want @titilambert or myself to look at to complete this merge process ? |
There are two things I would like you to look at before I merge:
And one question:
EDIT: I read back through all the comments on this PR and found:
So I'll make the change to treat it like a password. Other than that the PR to the snmpgo repo was merged over the weekend so we have no blockers. |
@@ -210,6 +210,17 @@ default-retention-policy = "" | |||
# meaning alerts will only be sent if the alert state changes. | |||
state-changes-only = false | |||
|
|||
[snmp] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[snmptrap]
// Example: | ||
// [snmptrap] | ||
// enabled = true | ||
// host = "127.0.0.1:9162" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addr = "127.0.0.1:9162"
// Example: | ||
// stream | ||
// |alert() | ||
// .snmpTrap() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You HAVE to put an OID in the snmpTrap Node
@nathanielc Big Thanks ! |
@titilambert Thanks for the review! I'll merge on green |
…com/k-sone/snmpgo subrepo: subdir: "vendor/github.com/k-sone/snmpgo" merged: "1cd4bf6f9" upstream: origin: "https://github.com/k-sone/snmpgo.git" branch: "master" commit: "1cd4bf6f9" git-subrepo: version: "0.3.0" origin: "???" commit: "???"
Required for all non-trivial PRs