Deploy the Prometheus monitoring system using Ansible.
- Ansible >= 2.7 (It might work on previous versions, but we cannot guarantee it)
- jmespath on deployer machine. If you are using Ansible from a Python virtualenv, install jmespath to the same virtualenv via pip.
- gnu-tar on Mac deployer host (`brew install gnu-tar`)
All variables that can be overridden are stored in the defaults/main.yml file and are also listed in the table below.
Name | Default Value | Description |
---|---|---|
`prometheus_version` | 2.27.0 | Prometheus package version. Also accepts `latest` as a parameter. Only Prometheus 2.x is supported |
`prometheus_skip_install` | false | Prometheus installation tasks get skipped when set to true. |
`prometheus_binary_local_dir` | "" | Allows using local packages instead of the ones distributed on GitHub. As a parameter, it takes a directory where the prometheus AND promtool binaries are stored on the host on which Ansible is run. This overrides the `prometheus_version` parameter |
`prometheus_config_dir` | /etc/prometheus | Path to the directory with the Prometheus configuration |
`prometheus_db_dir` | /var/lib/prometheus | Path to the directory with the Prometheus database |
`prometheus_read_only_dirs` | [] | Additional paths that Prometheus is allowed to read (useful for SSL certs outside of the config directory) |
`prometheus_web_listen_address` | "0.0.0.0:9090" | Address on which Prometheus will be listening |
`prometheus_web_config` | {} | A Prometheus web config YAML for configuring TLS and auth. |
`prometheus_web_external_url` | "" | External address on which Prometheus is available. Useful when behind a reverse proxy. Ex. http://example.org/prometheus |
`prometheus_storage_retention` | "30d" | Data retention period |
`prometheus_storage_retention_size` | "0" | Data retention limit by size |
`prometheus_config_flags_extra` | {} | Additional configuration flags passed to the prometheus binary at startup |
`prometheus_alertmanager_config` | [] | Configuration responsible for pointing to where the alertmanagers are. This should be specified as a list in YAML format. It is compatible with the official <alertmanager_config> |
`prometheus_alert_relabel_configs` | [] | Alert relabeling rules. This should be specified as a list in YAML format. It is compatible with the official <alert_relabel_configs> |
`prometheus_global` | { scrape_interval: 60s, scrape_timeout: 15s, evaluation_interval: 15s } | Prometheus global config. Compatible with the official configuration |
`prometheus_remote_write` | [] | Remote write. Compatible with the official configuration |
`prometheus_remote_read` | [] | Remote read. Compatible with the official configuration |
`prometheus_external_labels` | `environment: "{{ ansible_fqdn \| default(ansible_host) \| default(inventory_hostname) }}"` | Map of additional labels that will be added to any time series or alerts when communicating with external systems |
`prometheus_targets` | {} | Targets which will be scraped. A better example is provided in our demo site |
`prometheus_scrape_configs` | defaults/main.yml#L58 | Prometheus scrape jobs provided in the same format as in the official docs |
`prometheus_config_file` | "prometheus.yml.j2" | Variable used to provide a custom Prometheus configuration file in the form of an Ansible template |
`prometheus_alert_rules` | defaults/main.yml#L81 | Full list of alerting rules which will be copied to `{{ prometheus_config_dir }}/rules/ansible_managed.rules`. Alerting rules can also be provided by other files located in `{{ prometheus_config_dir }}/rules/` which have the `*.rules` extension |
`prometheus_alert_rules_files` | defaults/main.yml#L78 | List of folders where Ansible will look for files containing alerting rules which will be copied to `{{ prometheus_config_dir }}/rules/`. Files must have the `*.rules` extension |
`prometheus_static_targets_files` | defaults/main.yml#L78 | List of folders where Ansible will look for files containing custom static target configuration files which will be copied to `{{ prometheus_config_dir }}/file_sd/`. |
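
As a rough illustration of how variables from the table might be overridden, the sketch below sets a few of them in a vars file. The file location `group_vars/all.yml`, the chosen values, and the Alertmanager address are assumptions made for the example; only the variable names and their value formats come from the table.

```yaml
# group_vars/all.yml (hypothetical location; any Ansible vars source should work)
prometheus_version: 2.27.0
prometheus_web_listen_address: "127.0.0.1:9090"
prometheus_storage_retention: "15d"

# Global section, using the same keys as the role default
prometheus_global:
  scrape_interval: 30s
  scrape_timeout: 10s
  evaluation_interval: 30s

# Illustrative Alertmanager endpoint, written in the upstream alertmanager_config format
prometheus_alertmanager_config:
  - scheme: http
    static_configs:
      - targets:
          - "alertmanager.example.org:9093"
```

Note that Ansible replaces dictionaries rather than merging them by default, so overriding `prometheus_global` means listing every key you want to keep.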
`prometheus_targets` is just a map used to create multiple files located in the "{{ prometheus_config_dir }}/file_sd" directory, where file names are composed from the top-level keys of that map with a `.yml` suffix. Those files store file_sd scrape target data, and they need to be read in `prometheus_scrape_configs`.
The part of the prometheus.yml configuration file that describes what Prometheus scrapes is stored in `prometheus_scrape_configs`. This variable uses the same configuration options as described in the Prometheus docs.
Meanwhile, `prometheus_targets` is our way of adopting the Prometheus `file_sd` scrape type. It defines a map of files with their content. The top-level keys are the base names of files, each of which needs its own scrape job in `prometheus_scrape_configs`, and the values are the content of those files.
All this means that you CAN use custom `prometheus_scrape_configs` with `prometheus_targets` set to `{}`. However, when you set anything in `prometheus_targets`, it needs to be mapped to `prometheus_scrape_configs`. If it isn't, you'll get an error in preflight checks.
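
The first case, custom scrape jobs with no generated file_sd files, could look like the sketch below. The job name `node` and its target are hypothetical; both jobs use plain `static_configs`, so nothing has to be mapped from `prometheus_targets`.

```yaml
prometheus_targets: {}          # no file_sd files are generated

prometheus_scrape_configs:
  - job_name: "prometheus"
    metrics_path: "/metrics"
    static_configs:
      - targets:
          - "localhost:9090"
  - job_name: "node"            # hypothetical job, defined inline instead of via file_sd
    static_configs:
      - targets:
          - "localhost:9100"
```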
Let's look at our default configuration, which shows all features. By default we have this `prometheus_targets`:

    prometheus_targets:
      node:                      # This is a base file name. File is located in "{{ prometheus_config_dir }}/file_sd/<<BASENAME>>.yml"
        - targets:               #
            - localhost:9100     # All this is a targets section in file_sd format
          labels:                #
            env: test            #

Such a config will result in creating one file named `node.yml` in the `{{ prometheus_config_dir }}/file_sd` directory.
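
Assuming the role writes each value of `prometheus_targets` into its file unchanged (the exact formatting of the generated file is an assumption, not something confirmed here), the resulting `node.yml` would look roughly like this:

```yaml
# {{ prometheus_config_dir }}/file_sd/node.yml (sketch of the generated file)
- targets:
    - localhost:9100
  labels:
    env: test
```

This is the standard file_sd layout: a list of target groups, each with a `targets` list and optional `labels`.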
Next, this file needs to be loaded into the scrape config. Here is a modified version of our default `prometheus_scrape_configs`:

    prometheus_scrape_configs:
      - job_name: "prometheus"    # Custom scrape job, here using `static_configs`
        metrics_path: "/metrics"
        static_configs:
          - targets:
              - "localhost:9090"
      - job_name: "example-node-file-servicediscovery"
        file_sd_configs:
          - files:
              - "{{ prometheus_config_dir }}/file_sd/node.yml" # This line loads the file created from `prometheus_targets`

Example playbook:

    ---
    - hosts: all
      roles:
        - sysvale.prometheus
      vars:
        prometheus_targets:
          node:
            - targets:
                - localhost:9100
                - demo.cloudalchemy.org:9100
              labels:
                env: demosite

The Prometheus organization provides a demo site for a full monitoring solution based on Prometheus and Grafana. The repository with code and links to running instances is available on GitHub.
Alerting rules are defined in the `prometheus_alert_rules` variable. The format is almost identical to the one defined in the Prometheus 2.0 documentation. Due to similarities in templating engines, every template should be wrapped in `{% raw %}` and `{% endraw %}` statements. An example is provided in the defaults/main.yml file.
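
A minimal sketch of such a rule is shown below. The rule name, expression, and label values are illustrative rather than taken from the role defaults; the field names follow the upstream Prometheus 2.x alerting rule format. Without the `{% raw %}` wrapping, Ansible's Jinja2 engine would try to evaluate `{{ $labels.instance }}` itself instead of passing it through to Prometheus.

```yaml
prometheus_alert_rules:
  - alert: InstanceDown              # illustrative rule name
    expr: "up == 0"
    for: 5m
    labels:
      severity: critical
    annotations:
      # raw/endraw keeps the Prometheus template out of Ansible's hands
      summary: "{% raw %}Instance {{ $labels.instance }} down{% endraw %}"
      description: "{% raw %}{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.{% endraw %}"
```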
The preferred way of locally testing the role is to use Docker and molecule (v2.x). You will have to install Docker on your system. See "Get started" for a Docker package suitable for your system. We are using tox to simplify the process of testing on multiple Ansible versions. To install tox, execute:

    pip3 install tox

To run tests on all Ansible versions (WARNING: this can take some time):

    tox

To run a custom molecule command on a custom environment with only the default test scenario:

    tox -e py35-ansible28 -- molecule test -s default

For more information about molecule, see its docs.
If you would like to run tests on a remote Docker host, just specify the `DOCKER_HOST` variable before running the tox tests.
See troubleshooting.
This project is licensed under the MIT License. See LICENSE for more details.