Agent Developer Mode
The Agent Developer Mode allows the user to collect a wide array of metrics regarding the performance of the agent itself. It can be enabled by adding developer_mode: yes to your datadog.conf. When in developer mode, the following functionality is added to the agent:
- Metrics for collection time, emit time and CPU used are sent to Datadog on every collector run.
- The collector loop is profiled using cProfile. At an interval specified by agentConfig.collector_profile_interval, the pstats output for the collector loop is dumped to log.debug.
- An additional check, agent_metrics, is run at the end of every collector loop. This check collects a variety of metrics about the collector's performance, and can be configured with the same interface used to configure regular AgentChecks. Source code for this check can be found under checks.d/agent_metrics.py.
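The periodic profiling described in the list above can be sketched roughly as follows. This is a minimal illustration, not the agent's actual implementation: run_collector_loop, run_once and profile_interval are hypothetical names, and the interval is assumed to be counted in collector runs.

```python
import cProfile
import io
import logging
import pstats

log = logging.getLogger("collector")


def run_collector_loop(run_once, iterations, profile_interval):
    """Profile `run_once` and dump pstats output every `profile_interval` runs."""
    profiler = cProfile.Profile()
    dumps = []
    for i in range(1, iterations + 1):
        profiler.enable()
        try:
            run_once()
        finally:
            profiler.disable()
        if i % profile_interval == 0:
            # Render the accumulated stats to a string and send it to log.debug.
            stream = io.StringIO()
            stats = pstats.Stats(profiler, stream=stream)
            stats.sort_stats("cumulative").print_stats()
            log.debug(stream.getvalue())
            dumps.append(stream.getvalue())
            # Start a fresh profile for the next window (a design assumption).
            profiler = cProfile.Profile()
    return dumps
```

Resetting the profiler after each dump keeps every report limited to the most recent window. Accumulating across the whole process lifetime would be an equally plausible choice.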
Here is an example configuration for the agent_metrics check:
    init_config:
      process_metrics:
        - name: get_memory_info
          type: gauge
          active: yes
        - name: get_io_counters
          type: rate
          active: yes
        - name: get_connections
          type: gauge
          active: no

    instances:
      [{}]
Each element in the process_metrics list represents a single psutil.Process method that will be executed against the running collector process. The name field specifies the name of the method, the type field specifies the metric type (currently only gauge and rate are supported), and the active field is a utility flag to activate/deactivate certain method calls during the check. Note that the method specified in name is executed only when:
- The method is available on the psutil.Process class as of psutil==2.1.1.
- The underlying OS supports the execution of that method (e.g., get_io_counters is not available for OS X processes).
If the agent_metrics check cannot execute a particular method, it logs a warning and continues.
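The guard logic described above might look something like the following sketch. It is not the code from checks.d/agent_metrics.py: call_process_methods is a hypothetical helper, and FakeProcess is a stand-in for psutil.Process so that the example runs without psutil installed.

```python
import logging

log = logging.getLogger("agent_metrics")


def call_process_methods(process, process_metrics):
    """Call each active, available method on `process`; warn and skip otherwise."""
    results = {}
    for metric in process_metrics:
        name = metric["name"]
        if not metric.get("active", False):
            continue  # deactivated in the configuration
        method = getattr(process, name, None)
        if not callable(method):
            log.warning("Method %s is not available; skipping", name)
            continue
        try:
            results[name] = (metric["type"], method())
        except Exception as exc:  # e.g. the OS does not support this call
            log.warning("Could not execute %s: %s", name, exc)
    return results


class FakeProcess:
    """Hypothetical stand-in for psutil.Process."""

    def get_memory_info(self):
        return {"rss": 1024, "vms": 4096}

    def get_io_counters(self):
        raise NotImplementedError("not supported on this platform")


config = [
    {"name": "get_memory_info", "type": "gauge", "active": True},
    {"name": "get_io_counters", "type": "rate", "active": True},
    {"name": "get_connections", "type": "gauge", "active": False},
    {"name": "get_nonexistent", "type": "gauge", "active": True},
]
results = call_process_methods(FakeProcess(), config)
```

Here only get_memory_info produces a result: get_io_counters raises, get_connections is inactive, and get_nonexistent is missing from the process object.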
Metrics collected via these methods are parsed and aggregated in a namespace derived from the method name. For example, get_memory_info yields datadog.agent.collector.memory_info.rss and datadog.agent.collector.memory_info.vms. The logic for this lives here and here. These metrics are then aggregated and forwarded to Datadog as with any other AgentCheck.
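As a rough illustration of the naming scheme, the sketch below derives metric names from a method name and a namedtuple-style result. The metric_names helper, the get_ prefix stripping, and the handling of scalar results are assumptions inferred from the get_memory_info example, not the agent's actual code.

```python
from collections import namedtuple

PREFIX = "datadog.agent.collector"


def metric_names(method_name, result):
    """Flatten one method result into (metric_name, value) pairs."""
    # Assumption: a leading "get_" is dropped to form the namespace.
    if method_name.startswith("get_"):
        base = method_name[len("get_"):]
    else:
        base = method_name
    if hasattr(result, "_asdict"):
        # Namedtuple-style result: one metric per field.
        return [
            ("%s.%s.%s" % (PREFIX, base, field), value)
            for field, value in result._asdict().items()
        ]
    # Assumption: a scalar result maps to a single metric.
    return [("%s.%s" % (PREFIX, base), result)]


# Hypothetical result shaped like a memory-info namedtuple with rss and vms fields.
MemInfo = namedtuple("MemInfo", ["rss", "vms"])
pairs = metric_names("get_memory_info", MemInfo(rss=1024, vms=4096))
```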
Individual checks can be profiled by adding the --profile flag to the standard agent.py check command line call. E.g.: python agent.py check network --profile.
Profiling information consists of the following:
- Check runtime
- Memory use and Disk I/O if available
- Pstats output restricted to 20 calls.
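A single profiled run that reports the runtime and a pstats summary capped at 20 entries could be sketched as below. profile_check and run_check are hypothetical names, memory and disk I/O collection is omitted, and this is not the code behind the --profile flag.

```python
import cProfile
import io
import pstats
import time


def profile_check(run_check, max_calls=20):
    """Run `run_check` once; return (runtime_seconds, pstats_report)."""
    profiler = cProfile.Profile()
    start = time.perf_counter()
    profiler.enable()
    try:
        run_check()
    finally:
        profiler.disable()
    runtime = time.perf_counter() - start

    # Restrict the printed report to the top `max_calls` entries.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(max_calls)
    return runtime, stream.getvalue()


runtime, report = profile_check(lambda: sorted(range(1000)))
```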
Here is an example run for profiling the network check.