Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Arthur/ssh check #1117

Merged
merged 15 commits into from
Sep 23, 2014
Merged
Show file tree
Hide file tree
Changes from 9 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
95 changes: 95 additions & 0 deletions checks.d/ssh_check.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,95 @@
# stdlib
import time
import socket
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concerning the imports we usually try to put them in 3 separate categories:

# stdlib
# 3p
# project

# 3p
import paramiko
from collections import namedtuple
# project
from checks import AgentCheck

class CheckSSH(AgentCheck):

OPTIONS = [
('host', True, None, str),
('port', False, 22, int),
('username', True, None, str),
('password', False, None, str),
('private_key_file', False, None, str),
('sftp_check', False, True, bool),
('add_missing_keys', False, False, bool),
]

Config = namedtuple('Config', [
'host',
'port',
'username',
'password',
'private_key_file',
'sftp_check',
'add_missing_keys',
]
)
def _load_conf(self, instance):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can refactor and make them clearer by using namedtuples:

from collections import namedtuple
OPTIONS = [
    ('host', True, None, str),
    ('port', False, 22, int),
    ('username', True, None, str),
    ('password', False, None, str),
    ('private_key', False, None, str),
    ('sftp_check', False, True, bool),
]

Config = namedtuple('Config', [
            'host',
            'port',
            'username',
            'password',
            'private_key',
            'sftp_check',
        ]
        )
    def _load_conf(self, instance):
        params = []
        for option, required, default, expected_type in REQUIRED:
            value = instance.get(option)
            if required and not value or type(value) != expected_type :
                raise Exception("Please specify a valid {0}".format(option))


            if value is None or type(value) != expected_type:
                self.log.debug("Bad or missing value for {0} parameter. Using default".format(option))

            params.append(value)
        return Config._make(params)

params = []
for option, required, default, expected_type in self.OPTIONS:
value = instance.get(option)
if required and (not value or type(value)) != expected_type :
raise Exception("Please specify a valid {0}".format(option))

if value is None or type(value) != expected_type:
self.log.debug("Bad or missing value for {0} parameter. Using default".format(option))
value = default

params.append(value)
return self.Config._make(params)

def check(self, instance):
host, port, username, password, private_key_file, sftp_check, add_missing_keys = self._load_conf(instance)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You shouldn't unpack the conf like that. It defeats the purpose of using a namedtuple.

You should do:

conf = self._load_conf(instance)

and then if you want to get the host you can just get it using:

conf.host


try:
private_key = paramiko.RSAKey.from_private_key_file (private_key_file)
except Exception:
self.log.debug("Private Key not found")
private_key = None
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, if the private_key parameter is set, and getting the private_key from the path is failing, we should raise an exception as it means that there is something weird in the configuration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it is possible that the connection can still work with a bad private key if the password and username is set

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is but it would be weird to configure the agent like that. We should at least display a warning using self.warning.

Also if no password is set we can then raise an exception as we wouldn't have neither a password nor a private key set.


client = paramiko.SSHClient()
if add_missing_keys:
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.load_system_host_keys()

exception_message = None
#Service Availability to check status of SSH
try:
client.connect(host, port=port, username=username, password=password, pkey=private_key)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the line for which you want to report a service check ssh.can_connect I guess

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to perform a service check for ssh as well? (client did not require it)

self.service_check('ssh.can_connect', AgentCheck.OK, message=exception_message)

except Exception as e:
exception_message = str(e)
self.service_check('ssh.can_connect', AgentCheck.CRITICAL, message=exception_message)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should stop the code execution here by raising an exception and also send the sftp service check message here too.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to raise an exception here to stop the execution and also send the service check status for the sftp_check if it's enabled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why we would want to stop the execution, shouldn't that only occur on something critical that would prevent the check from occurring? We have feedback in the exception message to allow the user know if they did something wrong like a wrong password.

The sftp_check does will still do the service check if this situation occurs

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can't connect to SSH, all the rest will fail. You shouldn't rely on the rest of the code to do nothing, that's why you should explicitly raise the exception.

In addition, raising an exception will make the exception message to be displayed in the agent info page.


#Service Availability to check status of SFTP
if sftp_check:
if exception_message is None:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That won't be needed if you raise an exception on ssh connection failure.

try:
sftp = client.open_sftp()
#Check response time of SFTP
start_time = time.time()

result = sftp.listdir('.')
status = AgentCheck.OK
end_time = time.time()
time_taken = end_time - start_time
self.gauge('sftp.response_time', time_taken)

except Exception as e:
exception_message = str(e)
status = AgentCheck.CRITICAL

else:
status = AgentCheck.CRITICAL
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this state mean? You will report a message 'No errors' but the status as critical.

exception_message = "Failed because of SSH: " + exception_message

if exception_message is None:
exception_message = "No errors occured"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you remove these trailing newlines?

self.service_check('sftp.can_connect', status, message=exception_message)
11 changes: 11 additions & 0 deletions conf.d/ssh_check.yaml.example
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
init_config:

instances:

- host: localhost #required, must be filled
port: 22 #optional, leaving blank defaults to port 22
username: test #required, must be filled
password: abcd #optional
sftp_check: True #optional, leaving blank defaults to True
private_key_file: #optional, file path to private key
add_missing_keys: True #optional, leaving blank defaults to False
1 change: 1 addition & 0 deletions requirements.txt
Original file line number Diff line number Diff line change
Expand Up @@ -18,3 +18,4 @@ ntplib
httplib2
kafka-python==0.9.0-9bed11db98387c0d9e456528130b330631dc50af
requests
paramiko
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice.
You'll also have to add that dependency to the omnibus build:
https://github.com/DataDog/dd-agent-omnibus/blob/master/config/projects/datadog-agent.rb#L72

and create a software definition for it similar to that:
https://github.com/DataDog/omnibus-software/blob/master/config/software/pymongo.rb