Collects HAProxy statistics through the built-in Prometheus exporter (available since HAProxy 2.0).
First, configure HAProxy to expose a stats frontend like this:

    frontend stats
        bind *:8880
        http-request use-service prometheus-exporter if { path /metrics }
        stats enable
        stats uri /stats
        stats refresh 10s

Then check that the URL of the haproxy.prometheus.allmetrics item points to the /metrics path exposed by that frontend.
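
To make clearer what the haproxy.prometheus.allmetrics item receives, here is a minimal sketch of parsing the Prometheus text format that the /metrics endpoint returns. The `parse_metrics` helper and the `SAMPLE` payload are illustrative assumptions, not part of the template; Zabbix performs the equivalent extraction in the dependent items.

```python
import re

# One sample line: metric name, optional {labels}, then the value.
_SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = _SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, raw_value = match.groups()
        labels = dict(_LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(raw_value)))
    return samples


# Hypothetical payload, shortened for illustration.
SAMPLE = """\
# HELP haproxy_frontend_current_sessions Current number of active sessions.
# TYPE haproxy_frontend_current_sessions gauge
haproxy_frontend_current_sessions{proxy="www"} 12
haproxy_backend_status{proxy="app"} 1
haproxy_server_status{proxy="app",server="web1"} 1
"""

if __name__ == "__main__":
    for sample in parse_metrics(SAMPLE):
        print(sample)
```

The exact metric names and the encoding of status values can differ between HAProxy versions, so check the output of your own /metrics endpoint before relying on them.
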
Discovers frontends, backends, and servers.
- Checks whether fewer than 50% of a backend's servers are available (servers that are not UP, such as DRAIN or NOLB, count as unavailable).
- Checks whether current sessions on a frontend exceed 90% or 95% of its session limit.
- Checks the status of each frontend and backend.
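
The threshold checks listed above can be sketched in Python as follows. This is only an illustration of the arithmetic; the function names are hypothetical, and the zero-limit handling is an assumption that the template itself does not state.

```python
def backend_is_degraded(available_servers, total_servers):
    """True when fewer than 50% of a backend's servers are available."""
    if total_servers <= 0:
        # Assumption: treat an empty backend as not degraded
        # rather than dividing by zero.
        return False
    return available_servers / total_servers < 0.5


def frontend_severity(current_sessions, session_limit):
    """Return 'high' above 95% of the limit, 'average' above 90%, else None."""
    if session_limit <= 0:
        return None  # assumption: no limit, nothing to compare against
    usage_percent = current_sessions * 100 / session_limit
    if usage_percent > 95:
        return "high"
    if usage_percent > 90:
        return "average"
    return None


if __name__ == "__main__":
    print(backend_is_degraded(1, 4))
    print(frontend_severity(96, 100))
```

In the template the 90% and 95% checks are separate triggers, so both fire above 95%; the sketch reports only the highest severity to keep the example short.
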
This template defines no macros.
This template is not linked to any other template.
Name | Description | Type | Key and additional info |
---|---|---|---|
HAProxy Backend Discover | - | Dependent item | haproxy.backend.discovery Update: 0 |
HAProxy Server discovery | - | Dependent item | haproxy.server.discovery Update: 0 |
HAProxy Frontend Discover | - | Dependent item | haproxy.frontend.discovery Update: 0 |
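
As an illustration of what these three discovery rules produce, the sketch below builds low-level discovery rows from already parsed metric samples. The `build_discovery` function and the reliance on `proxy` and `server` labels are assumptions for this example; the template does its own extraction inside Zabbix, and only the macro names {#BACKEND_NAME}, {#SERVER_NAME} and {#FRONTEND_NAME} are taken from it.

```python
import json


def build_discovery(samples):
    """Build LLD rows for backends, servers and frontends.

    `samples` is an iterable of (metric_name, labels) pairs.
    """
    backends, servers, frontends = set(), set(), set()
    for name, labels in samples:
        proxy = labels.get("proxy")
        if proxy is None:
            continue  # process-level metrics carry no proxy label
        if name.startswith("haproxy_backend_"):
            backends.add(proxy)
        elif name.startswith("haproxy_frontend_"):
            frontends.add(proxy)
        elif name.startswith("haproxy_server_") and "server" in labels:
            servers.add((proxy, labels["server"]))
    return {
        "backend": [{"{#BACKEND_NAME}": b} for b in sorted(backends)],
        "server": [
            {"{#BACKEND_NAME}": b, "{#SERVER_NAME}": s}
            for b, s in sorted(servers)
        ],
        "frontend": [{"{#FRONTEND_NAME}": f} for f in sorted(frontends)],
    }


if __name__ == "__main__":
    # Hypothetical samples for illustration only.
    example = [
        ("haproxy_backend_status", {"proxy": "app"}),
        ("haproxy_server_status", {"proxy": "app", "server": "web1"}),
        ("haproxy_frontend_status", {"proxy": "www"}),
    ]
    print(json.dumps(build_discovery(example), indent=2))
```

Each list corresponds to one discovery rule, and every row supplies the macros that the item and trigger prototypes below expand.
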
Name | Description | Type | Key and additional info |
---|---|---|---|
Haproxy Prometheus Metrics | - | HTTP agent | haproxy.prometheus.allmetrics Update: 1m |
backend [{#BACKEND_NAME}] active | - | Dependent item | haproxy.stats[{#BACKEND_NAME},BACKEND,act] Update: 0 LLD |
backend [{#BACKEND_NAME}] backup | - | Dependent item | haproxy.stats[{#BACKEND_NAME},BACKEND,bck] Update: 0 LLD |
backend [{#BACKEND_NAME}] backend up-down transitions | - | Dependent item | haproxy.stats[{#BACKEND_NAME},BACKEND,chkdown] Update: 0 LLD |
backend [{#BACKEND_NAME}] downtime | - | Dependent item | haproxy.stats[{#BACKEND_NAME},BACKEND,downtime] Update: 0 LLD |
backend [{#BACKEND_NAME}] Number of available servers | - | Dependent item | haproxy.stats[{#BACKEND_NAME},BACKEND,nb_available_servers_by_backend] Update: 0 LLD |
backend [{#BACKEND_NAME}] Number of servers | - | Dependent item | haproxy.stats[{#BACKEND_NAME},BACKEND,nb_servers_by_backend] Update: 0 LLD |
backend [{#BACKEND_NAME}] current sessions | - | Dependent item | haproxy.stats[{#BACKEND_NAME},BACKEND,scur] Update: 0 LLD |
backend [{#BACKEND_NAME}] max sessions | - | Dependent item | haproxy.stats[{#BACKEND_NAME},BACKEND,smax] Update: 0 LLD |
backend [{#BACKEND_NAME}] status | - | Dependent item | haproxy.stats[{#BACKEND_NAME},BACKEND,status] Update: 0 LLD |
Server [{#BACKEND_NAME}/{#SERVER_NAME}] backend server up-down transitions | - | Dependent item | haproxy.stats[{#BACKEND_NAME},{#SERVER_NAME},chkdown] Update: 0 LLD |
Server [{#BACKEND_NAME}/{#SERVER_NAME}] backend server downtime | - | Dependent item | haproxy.stats[{#BACKEND_NAME},{#SERVER_NAME},downtime] Update: 0 LLD |
Server [{#BACKEND_NAME}/{#SERVER_NAME}] backend server status | - | Dependent item | haproxy.stats[{#BACKEND_NAME},{#SERVER_NAME},status] Update: 0 LLD |
frontend [{#FRONTEND_NAME}] current sessions | - | Dependent item | haproxy.stats[{#FRONTEND_NAME},FRONTEND,scur] Update: 0 LLD |
frontend [{#FRONTEND_NAME}] session limit | - | Dependent item | haproxy.stats[{#FRONTEND_NAME},FRONTEND,slim] Update: 0 LLD |
frontend [{#FRONTEND_NAME}] max sessions | - | Dependent item | haproxy.stats[{#FRONTEND_NAME},FRONTEND,smax] Update: 0 LLD |
frontend [{#FRONTEND_NAME}] status | - | Dependent item | haproxy.stats[{#FRONTEND_NAME},FRONTEND,status] Update: 0 LLD |
Name | Description | Expression | Priority |
---|---|---|---|
Backend {#BACKEND_NAME} is degraded | - | Expression: (last(/_T_Zbx_Lin_HAPROXY2_stats_Prometheus/haproxy.stats[{#BACKEND_NAME},BACKEND,nb_available_servers_by_backend]) / last(/_T_Zbx_Lin_HAPROXY2_stats_Prometheus/haproxy.stats[{#BACKEND_NAME},BACKEND,nb_servers_by_backend]) ) <0.5 Recovery expression: | average |
Backend {#BACKEND_NAME} number of down state change | - | Expression: last(/_T_Zbx_Lin_HAPROXY2_stats_Prometheus/haproxy.stats[{#BACKEND_NAME},BACKEND,chkdown])>last(/_T_Zbx_Lin_HAPROXY2_stats_Prometheus/haproxy.stats[{#BACKEND_NAME},BACKEND,chkdown],#1) Recovery expression: | high |
Backend {#BACKEND_NAME} state is not UP | - | Expression: last(/_T_Zbx_Lin_HAPROXY2_stats_Prometheus/haproxy.stats[{#BACKEND_NAME},BACKEND,status],#1)<>1 Recovery expression: | high |
Frontend {#FRONTEND_NAME} current connexion > 90% of limit | - | Expression: last(/_T_Zbx_Lin_HAPROXY2_stats_Prometheus/haproxy.stats[{#FRONTEND_NAME},FRONTEND,scur]) * 100 / last(/_T_Zbx_Lin_HAPROXY2_stats_Prometheus/haproxy.stats[{#FRONTEND_NAME},FRONTEND,slim]) > 90 Recovery expression: | average |
Frontend {#FRONTEND_NAME} current connexion > 95% of limit | - | Expression: last(/_T_Zbx_Lin_HAPROXY2_stats_Prometheus/haproxy.stats[{#FRONTEND_NAME},FRONTEND,scur]) * 100 / last(/_T_Zbx_Lin_HAPROXY2_stats_Prometheus/haproxy.stats[{#FRONTEND_NAME},FRONTEND,slim]) > 95 Recovery expression: | high |
Frontend {#FRONTEND_NAME} state is not OPEN | - | Expression: last(/_T_Zbx_Lin_HAPROXY2_stats_Prometheus/haproxy.stats[{#FRONTEND_NAME},FRONTEND,status],#1)<>1 Recovery expression: | high |
Backend {#BACKEND_NAME} is degraded (LLD) | - | Expression: (last(/_T_Zbx_Lin_HAPROXY2_stats_Prometheus/haproxy.stats[{#BACKEND_NAME},BACKEND,nb_available_servers_by_backend]) / last(/_T_Zbx_Lin_HAPROXY2_stats_Prometheus/haproxy.stats[{#BACKEND_NAME},BACKEND,nb_servers_by_backend]) ) <0.5 Recovery expression: | average |
Backend {#BACKEND_NAME} number of down state change (LLD) | - | Expression: last(/_T_Zbx_Lin_HAPROXY2_stats_Prometheus/haproxy.stats[{#BACKEND_NAME},BACKEND,chkdown])>last(/_T_Zbx_Lin_HAPROXY2_stats_Prometheus/haproxy.stats[{#BACKEND_NAME},BACKEND,chkdown],#1) Recovery expression: | high |
Backend {#BACKEND_NAME} state is not UP (LLD) | - | Expression: last(/_T_Zbx_Lin_HAPROXY2_stats_Prometheus/haproxy.stats[{#BACKEND_NAME},BACKEND,status],#1)<>1 Recovery expression: | high |
Frontend {#FRONTEND_NAME} current connexion > 90% of limit (LLD) | - | Expression: last(/_T_Zbx_Lin_HAPROXY2_stats_Prometheus/haproxy.stats[{#FRONTEND_NAME},FRONTEND,scur]) * 100 / last(/_T_Zbx_Lin_HAPROXY2_stats_Prometheus/haproxy.stats[{#FRONTEND_NAME},FRONTEND,slim]) > 90 Recovery expression: | average |
Frontend {#FRONTEND_NAME} current connexion > 95% of limit (LLD) | - | Expression: last(/_T_Zbx_Lin_HAPROXY2_stats_Prometheus/haproxy.stats[{#FRONTEND_NAME},FRONTEND,scur]) * 100 / last(/_T_Zbx_Lin_HAPROXY2_stats_Prometheus/haproxy.stats[{#FRONTEND_NAME},FRONTEND,slim]) > 95 Recovery expression: | high |
Frontend {#FRONTEND_NAME} state is not OPEN (LLD) | - | Expression: last(/_T_Zbx_Lin_HAPROXY2_stats_Prometheus/haproxy.stats[{#FRONTEND_NAME},FRONTEND,status],#1)<>1 Recovery expression: | high |