-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New junos messages for bgp, ospf, and configuration #168
New junos messages for bgp, ospf, and configuration #168
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's awesome @lampwins!
messages: | ||
# 'error' should be unique and vendor agnostic. Currently we are using the JUNOS syslog message name as the canonical name. | ||
# This may change if we are able to find a more well defined naming system. | ||
- error: BPDU_BLOCK_INTERFACE_DISSABLED |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small typo here: BPDU_BLOCK_INTERFACE_DISABLED
@@ -0,0 +1 @@ | |||
<149>Jun 21 14:03:12 vmx01 rpd[2902]: RPD_BGP_NEIGHBOR_STATE_CHANGED: BGP peer 1.1.1.1 (External AS 2222) changed state from Active to Connect (event AnEvent) (instance master) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good if you could check the actual value the device sends for this PRI: this will be reflected in the expected output from yang.json
. Now, the documentation is auto-generated from these mock files, as examples, e.g., http://napalm-logs.readthedocs.io/en/latest/messages/NTP_SERVER_UNREACHABLE.html, so it would be good to have the facility
and severity
in the examples as close as possible to the real values, so the users know what to expect.
@@ -0,0 +1,15 @@ | |||
# This error tag corresponds to syslog messages notifying that the configured | |||
# interface has been dissabled due to a bpdu block |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
disabled
here again
@@ -0,0 +1,15 @@ | |||
# This error tag corresponds to syslog messages notifying that the configured |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, the filename of the profile is the value of the error
, so it's easier to check where is a feature defined; however, that's not mandatory.
Solid work here @lampwins, thank you! Besides my suggestions above, I'd recommend you to double check your git settings and amend the commit author (the email address in particular), so it will link to your github profile. |
2a3d0fc
to
560090a
Compare
I have changed the filenames in the profile and fixed the PRI numbers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few more minor changes please, and some food for thoughts :-)
area: (\d+\.\d+\.\d+\.\d+) | ||
reason_message: ([\w ]+) | ||
line: 'OSPF neighbor {neighbor} (realm ospf-v2 {interface} area {area}) state changed from Full to Down due to KillNbr (event reason: {reason_message})' | ||
model: openconfig-network-instance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed this previously: I think we should reference the openconfig-opsf
, as it's a standalone model (although it's just based on the network instance one). Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, except for why openconfig does this :)
"1.1.1.1": { | ||
"state": { | ||
"peer_as": "2222", | ||
"session-state-change-event": "AnEvent", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a genuine event message (AnEvent
)? If not, could you please reference a real example (same documentation reason).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, just me being lazy, will fix. Also this relates to my comments on the ospf events.
@@ -0,0 +1,33 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might revisit this structure later (I can't think of a better structure currently though), but I fear it won't be compatible with other platforms. That's just a FYI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree this will not last in the long run. Ideally I think it should map to actual openconfig models depending on the error. I can't even begin to imagine what that would look like in a mapping definition though (outside of some complex python version).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussion to follow under #175
"neighbor": { | ||
"1.1.1.1": { | ||
"state": { | ||
"adjacency-state-change-reason-message": "dead timer", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering how diverse is this message and if we could define a typeref as with the others, e.g., DEAD_TIMER
(so it will be standard across other platforms). We can leave this as an open discussion for later, but it would be nice if you could build up a list of the messages you've seen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was just me being lazy with the test and I should have fixed it in the last commit. The reason message is tough is because there are many different messages that relate to the INACTIVE_TIMER
event type. For example, whether of not BFD is in play.
This is the same problem in the BGP state changes. As you can see there are already a number mapping required to cover all state transitions in the BGP state machine. IMO it would be very unwieldy to then throw in all the human event messages because there are cases where several messages map to one state change.
I think it would be more feasible down the road if we could define these messages separate from the log mappings. These would allow us to map several different vender messages for an event to some model value while not having to keep up with tens/hundreds of log mappings for a single log type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good points. My opinion is that we should try to map them as much as possible (if they are not completely random and there's a decent amount of mappings needed).
I think it would be more feasible down the road if we could define these messages separate from the log mappings.
Good suggestion - after this is merged, I will open an issues so we can discuss further and see in time what's the best way to approach this, after we'll have more examples and feedback from the users / developers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's discuss this further and build up a list of possible messages: #176
1 similar comment
Any idea what's up with the py2.7 travis build? Just curious as I have also been having to exclude it locally. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me.
Would you have any feedback @loverend?
@lampwins I don't know either why's py27 failing - I will have a closer somewhere next week, certainly worth investigating. I suspect there was a recent change in one of the underlying dependencies; tox sometimes behaves very weird too, so there can be several possibilities. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really great, just a couple of suggestions.
neighbor: ([\w\d:\.]+) | ||
interface: ([\w\-\/\.\d]+) | ||
area: (\d+\.\d+\.\d+\.\d+) | ||
reason_message: ([\w ]+) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should not use special characters in the key name as we escape all special characters. This will cause a key error when parsing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@loverend Could you elaborate on this? Do you mean in the regex's or reason_message
itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean reason_message
, it should be reasonMessage
bgp//neighbors//neighbor//{peer}//state//peer_as: asn | ||
bgp//neighbors//neighbor//{peer}//state//session-state-change-event: event | ||
static: | ||
bgp//neighbors//neighbor//{peer}//state//session-state: IDLE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The repeat entries can be condensed into one entry by matching the old and new states and then converting the values into the appropriate openconfig names. You can do this by casting to a function. See https://github.com/napalm-automation/napalm-logs/blob/master/napalm_logs/config/junos/SYSTEM_ALARM.yml#L13 for an example.
In this case you could add a function to convert any changes like OpenSent to OPEN_SENT and others you could just return as uppercase, i.e:
def bgp_state_convert(state):
state_dict = {'OpenSent': 'OPEN_SENT',
'OpenConfirm': 'OPEN_CONFIRM'}
return state_dict.get(state, state.upper())
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is better than my current approach but IMO this is going to get unwieldy fast once you start to throw other vendors. For example, Juniper calls it OpenSent
while cisco calls it OPEN SENT
. Obviously these mappings have to be defined somewhere but I don't think hiding them in a function inside the utils package is the best idea in the long run.
Like I said, this is better than what I have for now, but I think it is worth further discussion, along with the vendor event message mappings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably have a functionality directly into the YAML profiles and help with this kind of 1:1 mappings, e.g.,
mapping:
replace:
state:
OpenSent: OPEN_SENT
OpenConfirm: OPEN_CONFIRM
variables:
bgp//neighbors//neighbor//{peer}//state//peer_as: asn
bgp//neighbors//neighbor//{peer}//state//session-state-change-event: event
bgp//neighbors//neighbor//{peer}//state//session-state: state
This way we also avoid any function utility, but the mapping is specific to a particular platform only.
Any thoughts on this @loverend @lampwins?
Anyway, for the moment I would say to get this PR in, and investigate this sort of optimisations later (before they'll grow more), but it's very good to have this discussion right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mirceaulinic I think this is headed in the right direction. Ultimately will take some playing around with but I like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe even something like:
preprocess:
state:
- replace:
OpenSent: OPEN_SENT
OpenConfirm: OPEN_CONFIRM
- upper
- lower
- upper
mapping:
variables:
bgp//neighbors//neighbor//{peer}//state//peer_as: asn
bgp//neighbors//neighbor//{peer}//state//session-state-change-event: event
bgp//neighbors//neighbor//{peer}//state//session-state: state
Just in case we'd even need more extensive preprocessing to some fields (that's maybe just not the best example, but I think you see what I mean: to a certain field you can apply several consecutive operations).
This covers bgp neighbor state changes, ospf state changes, and configuration commit and rollback life cycle.