Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DC] Meraki Device Connector #290

Merged

Conversation

edulop91
Copy link
Contributor

No description provided.

kuannie1 and others added 30 commits August 27, 2019 09:25
Replacing underscore with actual description

Co-Authored-By: Eduardo Lopez <elopez@chanzuckerberg.com>
Replacing underscore with actual description

Co-Authored-By: Eduardo Lopez <elopez@chanzuckerberg.com>


def connect(connection_name, options):
landing_table_client = f'data.meraki_{connection_name}_connection_client'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The table names have to end in _CONNECTION or they're not considered a connection table (i.e. the connection runner won't pick them up and call the ingest method on the module for the table). So you'd want these to be f'data.meraki_{connection_name}_client_connection' or '_device_connection'.

return None


def ingest(table_name, options):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So your connect method creates two tables, both of which I think you want to be connection tables. But the ingest method populates both of them every time it gets called, which means that you're going to be duplicating data in your tables; you'll populate both tables when the connection runner finds the client_connection table, and then you'll populate both tables again a second later when the connection runner finds the device_connection table.

For each call to the ingest method, it should populate one table: the table which ingest is called for.

except HTTPError as http_err:
log.error(f"Error GET: url={url}")
log.error(f"HTTP error occurred: {http_err}")
raise http_err
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-raise with just raise, that way the stack trace is maintained rather than being reset to this point

client.get('mdnsName','None'),
client.get('dhcpHostname','None'),
client.get('ip','None'),
parse_number(client.get('vlan','None')),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure what the client value is, but agreed this looks buggy. right now it does:

client.get('vlan', 'None') or None

are you sure this is what you want?

client.get('ip','None'),
parse_number(client.get('vlan','None')),
client.get('switchport','None'),
parse_number(client.get('usage','None').get('sent','None')),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't .get on a string (i.e. 'None'), so this will raise an exception if 'usage' not in client

@@ -0,0 +1,181 @@
"""Meraki
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make this name a little more descriptive -- there are a lot of things we could potentially collect from Meraki, and the titles should be distinct.

kuannie1 and others added 6 commits September 6, 2019 13:26
Co-Authored-By: Eduardo Lopez <elopez@chanzuckerberg.com>
Co-Authored-By: Eduardo Lopez <elopez@chanzuckerberg.com>
Co-Authored-By: Eduardo Lopez <elopez@chanzuckerberg.com>
log.error(e)
continue

db.insert(
Copy link
Collaborator

@sfc-gh-afedorov sfc-gh-afedorov Sep 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry I wasn't more clear on this. could you please replace this db.insert expression with --

            db.insert(
                landing_table,
                values=[(
                    timestamp,
                    client,
                    client.get('id'),
                    client.get('mac'),
                    client.get('description'),
                    client.get('mdnsName'),
                    client.get('dhcpHostname'),
                    client.get('ip'),
                    client.get('switchport'),
                    client.get('vlan') or None,
                    client.get('usage', {}).get('sent') or None,
                    client.get('usage', {}).get('recv') or None,
                    serial_number,
                ) for client in clients],
                select=db.derive_insert_select(LANDING_TABLE_COLUMNS_CLIENT),
                columns=db.derive_insert_columns(LANDING_TABLE_COLUMNS_CLIENT)
            )

I think it will do the same thing but more succinctly.

to clarify x or y will return y if x if "falsy", i.e. if bool(x) is False, and dict's .get returns None as the default value if the key is not present

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, are you sure vlan needs to be treated differently than ip or switchport? this seems off.

Copy link
Contributor

@kuannie1 kuannie1 Sep 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sometimes client.get('vlan') will return '' instead of None, which cannot be inserted into the database since the VLAN column requires a number. I like your suggested change though! I just can't treat vlan the same way as ip because that '' value is invalid in our case.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, interesting. If you're sure that the others never return '' then lgtm. Adding a comment like --

                client.get('vlan') or None,  # vlan sometimes set to ''

might help to future folks, as well

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've never seen that occur with usage:recv or usage:sent, but I think it'll be best if we used the same approach for those instances too. Thank you for this feedback!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm

values=[(
timestamp,
device,
device.get('serial',None),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove all of these ,None, as they're vestigial (None is the default value when key is not present), and per pep8, the comma in param lists needs to be followed by a space or newline.

comment = yaml_dump(module='meraki_devices', **options)

db.create_table(name=landing_table_client,
cols=LANDING_TABLE_COLUMNS_CLIENT, comment=comment)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a newline before "comment" please to adhere to PEP8

@sfc-gh-afedorov
Copy link
Collaborator

@edulop91 could you give me push access to this repo so I can help push minor cleanup a little faster than explaining it in comments? I'll still comment on major things that have value to see but whitespace stuff seems it might become tedious.

Copy link
Collaborator

@sfc-gh-afedorov sfc-gh-afedorov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking close. seeing a couple of minor bugs and a few more fixes. looks like I can edit in GitHub so don't hesitate to ask questions if you have any.


def ingest(table_name, options):
ingest_type = ''
if (table_name.endswith('CONNECTION_CLIENT')) or (table_name.endswith('connection_client')):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what Greg was meaning to explain here earlier is that no table that ends with anything other than "_CONNECTION" will ever be passed into the ingest function. the runner that runs these ingest function is pretty bare-bones, check it out here.

it's not super apparent, but it also guarantees that the connections will be upper case, so you here you can do something like

ingest_type = 'client' if table_name.endswith('_CLIENT_CONNECTION') else 'device'

but you do also need to make sure that the tables are created ending in _CONNECTION above so that the runner knows they are "connection tables", i.e. should be populated by the connector runners.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I know the answer already, but does the table name have to be all-caps? (I'm guessing yes)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends. If you're working with the string in python, then yes; python is case-sensitive and so the strings '_connection' and '_CONNECTION' are not identical.

If you're working with the table in sql, then no; sql is generally case insensitive (more precisely, it will uppercase everything you say unless you explicitly tell it not to) and it will view select * from data.meraki_connection the same as SELECT * FROM DATA.MERAKI_CONNECTION.

Note that you can get around this by putting something in double quotes inside sql; that is,

select * from "DATA"."MERAKI_CONNECTION" is not the same as select * from "data"."meraki_connection"; one of those will look for a table with an uppercase name and one will look for a table with a lowercase name.

Does that answer your question? Happy to clarify if the above is confusing!

Copy link
Contributor

@kuannie1 kuannie1 Sep 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be annoying if I put:
ingest_type = 'client' if (table_name.endswith('_CLIENT_CONNECTION')or table_name.endswith('_client_connection')) else 'device' ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What situation are you trying to protect against? Snowflake defaults to uppercase names, so you have to explicitly specify that you want an object to have a lowercase name by putting it in double-quotes. We don't do that in the connectors - do you have reason to think that somebody might create a connection table manually and explicitly name it "meraki_client_connection", and still want the connector to work with it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh, I think I understand. Thank you for clarifying! I'm not worried about lower-case confusion anymore.



def connect(connection_name, options):
landing_table_client = f'data.meraki_devices_{connection_name}_connection_client'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change the suffix here to _client_connection


def connect(connection_name, options):
landing_table_client = f'data.meraki_devices_{connection_name}_connection_client'
landing_table_device = f'data.meraki_devices_{connection_name}_connection_device'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change the suffix here to _device_connection

('DHCP_HOSTNAME', 'VARCHAR(256)'),
('IP', 'VARCHAR(256)'),
('SWITCHPORT', 'VARCHAR(256)'),
('VLAN', 'INT'),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here and in a couple of other places, there are stray spaces before the end of the line, which violates pep8 standards. could you set your text editor to remove extra spaces at the end of each line on save? to do it in Sublime, you can add this to preferences —

"trim_trailing_white_space_on_save": true,

happy to look up in other editors if you're not sure how


from datetime import datetime

import snowflake
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove unused import -- pyflakes will help you spot these, as well. let me know if you'd like help setting it up with your text editor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know if flake8 works for this purpose?

whitelist = options['network_id_whitelist']

for network in whitelist:

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove extra blank line

def connect(connection_name, options):
landing_table_client = f'data.meraki_devices_{connection_name}_connection_client'
landing_table_device = f'data.meraki_devices_{connection_name}_connection_device'
options['network_id_whitelist'] = options.get('network_id_whitelist', '').split(',')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will return a Union[str, list] when I think you'd be better off with just a list, i.e. put [] in the second param of get.

also, up to you if you're comfortable, but I would have made this a change to connectors_runner.py and added the list type to be parsed there next to json. later on, we can also add a special UI element.

Copy link
Contributor

@kuannie1 kuannie1 Sep 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I have a few questions. This function "returns" a dictionary with a message but it creates a table and grants roles. What are you referring to when you say "This will return a Union[str, list]"? Are you referring to options?

Copy link
Contributor

@kuannie1 kuannie1 Sep 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would adding the list type in connectors_runner.py parse the user input as a list? When we tested this particular input, we got it as a string, like 'id1, id2, id3'.

Copy link
Contributor

@kuannie1 kuannie1 Sep 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this look? @andrey-snowflake
@greg-snowflake #306

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect, thanks.

Err, had a bit of a think-o. The "return" type I was thinking of was of the expression

options.get('network_id_whitelist', '').split(',')

but since it splits the empty string, it's always a list.

db.execute(f'GRANT INSERT, SELECT ON {landing_table_client} TO ROLE {SA_ROLE}')

db.create_table(name=landing_table_device,
cols=LANDING_TABLE_COLUMNS_DEVICE, comment=comment)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto -- it's easier to read if each param gets its own line

@kuannie1
Copy link
Contributor

kuannie1 commented Sep 9, 2019

@greg-snowflake @andrey-snowflake: I think I'm done making changes. How does this look now? Given my pull request #306

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants