-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DC] Meraki Device Connector #290
[DC] Meraki Device Connector #290
Conversation
Replacing underscore with actual description Co-Authored-By: Eduardo Lopez <elopez@chanzuckerberg.com>
Replacing underscore with actual description Co-Authored-By: Eduardo Lopez <elopez@chanzuckerberg.com>
…Alert into aku/meraki_connector
This reverts commit 26d1589.
src/connectors/meraki.py
Outdated
|
||
|
||
def connect(connection_name, options): | ||
landing_table_client = f'data.meraki_{connection_name}_connection_client' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The table names have to end in _CONNECTION or they're not considered a connection table (i.e. the connection runner won't pick them up and call the ingest method on the module for the table). So you'd want these to be f'data.meraki_{connection_name}_client_connection'
or '_device_connection'.
src/connectors/meraki.py
Outdated
return None | ||
|
||
|
||
def ingest(table_name, options): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So your connect method creates two tables, both of which I think you want to be connection tables. But the ingest method populates both of them every time it gets called, which means that you're going to be duplicating data in your tables; you'll populate both tables when the connection runner finds the client_connection table, and then you'll populate both tables again a second later when the connection runner finds the device_connection table.
For each call to the ingest method, it should populate one table: the table which ingest is called for.
src/connectors/meraki.py
Outdated
except HTTPError as http_err: | ||
log.error(f"Error GET: url={url}") | ||
log.error(f"HTTP error occurred: {http_err}") | ||
raise http_err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re-raise with just raise
, that way the stack trace is maintained rather than being reset to this point
src/connectors/meraki.py
Outdated
client.get('mdnsName','None'), | ||
client.get('dhcpHostname','None'), | ||
client.get('ip','None'), | ||
parse_number(client.get('vlan','None')), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure what the client value is, but agreed this looks buggy. right now it does:
client.get('vlan', 'None') or None
are you sure this is what you want?
src/connectors/meraki.py
Outdated
client.get('ip','None'), | ||
parse_number(client.get('vlan','None')), | ||
client.get('switchport','None'), | ||
parse_number(client.get('usage','None').get('sent','None')), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't .get
on a string (i.e. 'None'
), so this will raise an exception if 'usage' not in client
src/connectors/meraki.py
Outdated
@@ -0,0 +1,181 @@ | |||
"""Meraki |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make this name a little more descriptive -- there are a lot of things we could potentially collect from Meraki, and the titles should be distinct.
Co-Authored-By: Eduardo Lopez <elopez@chanzuckerberg.com>
Co-Authored-By: Eduardo Lopez <elopez@chanzuckerberg.com>
Co-Authored-By: Eduardo Lopez <elopez@chanzuckerberg.com>
log.error(e) | ||
continue | ||
|
||
db.insert( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry I wasn't more clear on this. could you please replace this db.insert expression with --
db.insert(
landing_table,
values=[(
timestamp,
client,
client.get('id'),
client.get('mac'),
client.get('description'),
client.get('mdnsName'),
client.get('dhcpHostname'),
client.get('ip'),
client.get('switchport'),
client.get('vlan') or None,
client.get('usage', {}).get('sent') or None,
client.get('usage', {}).get('recv') or None,
serial_number,
) for client in clients],
select=db.derive_insert_select(LANDING_TABLE_COLUMNS_CLIENT),
columns=db.derive_insert_columns(LANDING_TABLE_COLUMNS_CLIENT)
)
I think it will do the same thing but more succinctly.
to clarify x or y
will return y
if x
if "falsy", i.e. if bool(x) is False
, and dict
's .get
returns None as the default value if the key is not present
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, are you sure vlan
needs to be treated differently than ip
or switchport
? this seems off.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sometimes client.get('vlan')
will return ''
instead of None
, which cannot be inserted into the database since the VLAN
column requires a number. I like your suggested change though! I just can't treat vlan
the same way as ip
because that '' value is invalid in our case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, interesting. If you're sure that the others never return ''
then lgtm. Adding a comment like --
client.get('vlan') or None, # vlan sometimes set to ''
might help to future folks, as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've never seen that occur with usage:recv or usage:sent, but I think it'll be best if we used the same approach for those instances too. Thank you for this feedback!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sgtm
src/connectors/meraki_devices.py
Outdated
values=[( | ||
timestamp, | ||
device, | ||
device.get('serial',None), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please remove all of these ,None
, as they're vestigial (None
is the default value when key is not present), and per pep8, the comma in param lists needs to be followed by a space or newline.
src/connectors/meraki_devices.py
Outdated
comment = yaml_dump(module='meraki_devices', **options) | ||
|
||
db.create_table(name=landing_table_client, | ||
cols=LANDING_TABLE_COLUMNS_CLIENT, comment=comment) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a newline before "comment" please to adhere to PEP8
@edulop91 could you give me push access to this repo so I can help push minor cleanup a little faster than explaining it in comments? I'll still comment on major things that have value to see but whitespace stuff seems it might become tedious. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking close. seeing a couple of minor bugs and a few more fixes. looks like I can edit in GitHub so don't hesitate to ask questions if you have any.
src/connectors/meraki_devices.py
Outdated
|
||
def ingest(table_name, options): | ||
ingest_type = '' | ||
if (table_name.endswith('CONNECTION_CLIENT')) or (table_name.endswith('connection_client')): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what Greg was meaning to explain here earlier is that no table that ends with anything other than "_CONNECTION"
will ever be passed into the ingest function. the runner that runs these ingest function is pretty bare-bones, check it out here.
it's not super apparent, but it also guarantees that the connections will be upper case, so you here you can do something like
ingest_type = 'client' if table_name.endswith('_CLIENT_CONNECTION') else 'device'
but you do also need to make sure that the tables are created ending in _CONNECTION
above so that the runner knows they are "connection tables", i.e. should be populated by the connector runners.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I know the answer already, but does the table name have to be all-caps? (I'm guessing yes)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It depends. If you're working with the string in python, then yes; python is case-sensitive and so the strings '_connection' and '_CONNECTION' are not identical.
If you're working with the table in sql, then no; sql is generally case insensitive (more precisely, it will uppercase everything you say unless you explicitly tell it not to) and it will view select * from data.meraki_connection
the same as SELECT * FROM DATA.MERAKI_CONNECTION
.
Note that you can get around this by putting something in double quotes inside sql; that is,
select * from "DATA"."MERAKI_CONNECTION"
is not the same as select * from "data"."meraki_connection"
; one of those will look for a table with an uppercase name and one will look for a table with a lowercase name.
Does that answer your question? Happy to clarify if the above is confusing!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it be annoying if I put:
ingest_type = 'client' if (table_name.endswith('_CLIENT_CONNECTION')or table_name.endswith('_client_connection')) else 'device'
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What situation are you trying to protect against? Snowflake defaults to uppercase names, so you have to explicitly specify that you want an object to have a lowercase name by putting it in double-quotes. We don't do that in the connectors - do you have reason to think that somebody might create a connection table manually and explicitly name it "meraki_client_connection", and still want the connector to work with it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohh, I think I understand. Thank you for clarifying! I'm not worried about lower-case confusion anymore.
src/connectors/meraki_devices.py
Outdated
|
||
|
||
def connect(connection_name, options): | ||
landing_table_client = f'data.meraki_devices_{connection_name}_connection_client' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change the suffix here to _client_connection
src/connectors/meraki_devices.py
Outdated
|
||
def connect(connection_name, options): | ||
landing_table_client = f'data.meraki_devices_{connection_name}_connection_client' | ||
landing_table_device = f'data.meraki_devices_{connection_name}_connection_device' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change the suffix here to _device_connection
src/connectors/meraki_devices.py
Outdated
('DHCP_HOSTNAME', 'VARCHAR(256)'), | ||
('IP', 'VARCHAR(256)'), | ||
('SWITCHPORT', 'VARCHAR(256)'), | ||
('VLAN', 'INT'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here and in a couple of other places, there are stray spaces before the end of the line, which violates pep8 standards. could you set your text editor to remove extra spaces at the end of each line on save? to do it in Sublime, you can add this to preferences —
"trim_trailing_white_space_on_save": true,
happy to look up in other editors if you're not sure how
src/connectors/meraki_devices.py
Outdated
|
||
from datetime import datetime | ||
|
||
import snowflake |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please remove unused import -- pyflakes will help you spot these, as well. let me know if you'd like help setting it up with your text editor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you know if flake8 works for this purpose?
src/connectors/meraki_devices.py
Outdated
whitelist = options['network_id_whitelist'] | ||
|
||
for network in whitelist: | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove extra blank line
src/connectors/meraki_devices.py
Outdated
def connect(connection_name, options): | ||
landing_table_client = f'data.meraki_devices_{connection_name}_connection_client' | ||
landing_table_device = f'data.meraki_devices_{connection_name}_connection_device' | ||
options['network_id_whitelist'] = options.get('network_id_whitelist', '').split(',') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will return a Union[str, list]
when I think you'd be better off with just a list
, i.e. put []
in the second param of get
.
also, up to you if you're comfortable, but I would have made this a change to connectors_runner.py
and added the list
type to be parsed there next to json
. later on, we can also add a special UI element.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I have a few questions. This function "returns" a dictionary with a message but it creates a table and grants roles. What are you referring to when you say "This will return a Union[str, list]
"? Are you referring to options
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would adding the list type in connectors_runner.py parse the user input as a list? When we tested this particular input, we got it as a string, like 'id1, id2, id3'
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this look? @andrey-snowflake
@greg-snowflake #306
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perfect, thanks.
Err, had a bit of a think-o. The "return" type I was thinking of was of the expression
options.get('network_id_whitelist', '').split(',')
but since it splits the empty string, it's always a list.
src/connectors/meraki_devices.py
Outdated
db.execute(f'GRANT INSERT, SELECT ON {landing_table_client} TO ROLE {SA_ROLE}') | ||
|
||
db.create_table(name=landing_table_device, | ||
cols=LANDING_TABLE_COLUMNS_DEVICE, comment=comment) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto -- it's easier to read if each param gets its own line
@greg-snowflake @andrey-snowflake: I think I'm done making changes. How does this look now? Given my pull request #306 |
No description provided.