Rules in case of server=<UUID> #12

Closed
filippog opened this issue Dec 16, 2016 · 1 comment

Comments

@filippog

Following up from #8

In Varnish 4, varnishstat -j returns counters for every VCL that is not in the cold state (e.g. busy or warm, as listed by varnishadm vcl.list), and the exporter in turn parses each VCL's UUID as the value of the server Prometheus label.
Each time a VCL is compiled and loaded in Varnish 4 it is assigned a new UUID. Where VCLs are reloaded frequently this can cause metric churn in the Prometheus server, and the server label itself isn't very meaningful in most cases.

varnish_backend_req{backend="be_ununpentium_wikimedia_org",server="02610319-7c2e-4154-8298-b991fc0b1042"} 0
varnish_backend_req{backend="be_ununpentium_wikimedia_org",server="9ffc72df-697e-42d1-8598-a06497f56ace"} 0
varnish_backend_req{backend="be_ununpentium_wikimedia_org",server="c558b103-d1ca-4a27-b529-5492e21de38e"} 0

In these cases my suggestion would be to recommend a set of recording rules that give users meaningful per-backend aggregated metrics instead, e.g.

backend:varnish_backend_bereq_bodybytes:sum = sum(varnish_backend_bereq_bodybytes) without (server)
backend:varnish_backend_bereq_hdrbytes:sum = sum(varnish_backend_bereq_hdrbytes) without (server)
backend:varnish_backend_beresp_bodybytes:sum = sum(varnish_backend_beresp_bodybytes) without (server)
backend:varnish_backend_beresp_hdrbytes:sum = sum(varnish_backend_beresp_hdrbytes) without (server)
backend:varnish_backend_conn:sum = sum(varnish_backend_conn) without (server)
backend:varnish_backend_happy:sum = sum(varnish_backend_happy) without (server)
backend:varnish_backend_pipe_hdrbytes:sum = sum(varnish_backend_pipe_hdrbytes) without (server)
backend:varnish_backend_pipe_in:sum = sum(varnish_backend_pipe_in) without (server)
backend:varnish_backend_pipe_out:sum = sum(varnish_backend_pipe_out) without (server)
backend:varnish_backend_req:sum = sum(varnish_backend_req) without (server)
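
Since every rule above follows the same pattern, the list could also be generated rather than typed by hand, which keeps the rule name and the summed metric in sync. The following is only a minimal sketch: `BACKEND_METRICS` and `recording_rule` are hypothetical names made up for illustration, and the output assumes the same `name = expression` rule syntax used in the rules above.

```python
# Sketch: generate the per-backend recording rules from a list of metric names.
# BACKEND_METRICS and recording_rule are hypothetical names used for illustration;
# the metric names are the ones that appear in the rules in this issue.

BACKEND_METRICS = [
    "varnish_backend_bereq_bodybytes",
    "varnish_backend_bereq_hdrbytes",
    "varnish_backend_beresp_bodybytes",
    "varnish_backend_beresp_hdrbytes",
    "varnish_backend_conn",
    "varnish_backend_happy",
    "varnish_backend_pipe_hdrbytes",
    "varnish_backend_pipe_in",
    "varnish_backend_pipe_out",
    "varnish_backend_req",
]


def recording_rule(metric, drop_label="server"):
    """Return one rule line that sums `metric` across all values of `drop_label`."""
    return f"backend:{metric}:sum = sum({metric}) without ({drop_label})"


if __name__ == "__main__":
    # Print one rule per metric, ready to be saved into a rules file.
    for metric in BACKEND_METRICS:
        print(recording_rule(metric))
```

The printed lines can be redirected to a file, which Prometheus would then load if that file is listed under `rule_files` in its configuration, assuming a Prometheus version that accepts this rule syntax.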
@jonnenauha

jonnenauha commented Dec 19, 2016

Where do these recording rules go? I presume they go in the Prometheus config?

Would you like to make a pull request to the README.md that explains this and details where to put those recording rules to avoid metrics churn?

There could be a new title:

P.S. I now think I made the Grafana queries do this merge for me, or at least I dropped the server label so it did not show up in the dashboards. But your way is probably better.

Configuration considerations

Varnish reports inactive VCLs for some time after VCL reloads. This exporter identifies VCL instances with the server label. For setups that reload VCLs frequently, this can cause metrics churn and hard-to-read dashboards. Here is how to sum up all VCLs by backend name, ignoring the server label. This is optional if you rarely reload VCLs, since Varnish removes the inactive VCLs after some time.

the rules and info where they should be put
