A collection of scripts designed to aid in data gathering and analysis of OpenShift 4 cluster traffic and routing.
It is always best to open a support case to discuss the findings from your own cluster with an engineer, and these scripts should be considered supplemental to an analysis in partnership with Red Hat. However, they are available for an initial look at your system's performance, and may be recommended as part of data-gathering efforts to streamline debugging and deepen understanding of HAProxy handling for your OpenShift cluster.
This repository and all included scripts are debugging tools provided without warranty. They come with no support from Red Hat or any other official source. Please use at your own risk.
haproxy-gather.sh can be used to gather a summary output of all router pod statistics from your cluster at once, along with haproxy.config, to help you understand more about how your cluster is routing traffic to its backend pods.
Note that the router pod hit statistics being gathered are not persistent, even within the lifespan of a container: they are reset to 0 every time haproxy.config reloads. In a cluster with a lot of churn (new pods and deployments), this can happen as often as every 5s. If deployments are static and routes aren't changing, the interval between reloads is longer and you may get a much larger sample size. Depending on your cluster configuration and the time between your test calls and the gather run, you may need to run the haproxy-gather.sh script several times to get a solid, comprehensive sample of activity.
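Because the counters reset on every reload, it can help to take several gathers in a row. The following is a minimal sketch of one way to do that, not part of this repository: repeat_gather is a hypothetical helper, and it assumes that haproxy-gather.sh takes no arguments and writes its output to a haproxy-gather/ folder in the current directory, which is renamed after each run so later runs do not overwrite earlier ones.

```bash
#!/bin/sh
# repeat_gather COUNT INTERVAL_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND COUNT times, pausing INTERVAL_SECONDS
# between runs, and keeps each run's output folder under a numbered name.
repeat_gather() {
  rg_count=$1
  rg_interval=$2
  shift 2
  rg_i=1
  while [ "$rg_i" -le "$rg_count" ]; do
    echo "gather run ${rg_i}/${rg_count} at $(date -u +%Y-%m-%dT%H:%M:%SZ)"
    "$@" || echo "run ${rg_i} exited non-zero" >&2
    # Assumption: the gather output lands in ./haproxy-gather
    if [ -d haproxy-gather ]; then
      mv haproxy-gather "haproxy-gather-run-${rg_i}"
    fi
    if [ "$rg_i" -lt "$rg_count" ]; then
      sleep "$rg_interval"
    fi
    rg_i=$((rg_i + 1))
  done
}

# Example (not executed here): five gathers, 30 seconds apart
# repeat_gather 5 30 ./haproxy-gather.sh
```

Pick an interval that matches how often you expect haproxy.config to reload in your cluster; the 30 seconds shown is only an illustration.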
Usage on 3.11 versus 4.x: set the following values in the script to run on 3.11 or 4.x clusters (4.x is the default).
namespace="openshift-ingress" #If running on 3.11='default', 4.x='openshift-ingress'
selector="default" #If running on 3.11='router', 4.x='default'
See https://access.redhat.com/solutions/6987555 for more information on how to analyze contents pulled by haproxy-gather, and refer to the documentation at https://docs.openshift.com/container-platform/4.11/networking/routes/route-configuration.html#nw-route-specific-annotations_route-configuration for more on how to streamline and optimize your routes. This script is designed primarily to aggregate data for easier troubleshooting efforts.
This is a fast-analysis script that grabs the lbtot hit counter for your desired route query, the HTTP response codes for each backend, and the annotations recorded in haproxy.config (for confirming the load-balance type and similar settings). It also compiles a summary table of all pods associated with the route for analysis. Run this script inside the haproxy-gather/ folder generated by haproxy-gather.sh, specifying the haproxy.config file and the route you wish to examine (for example, default_haproxy.config and mytestroute).
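To illustrate what reading the lbtot counter involves, here is a minimal sketch that is independent of the repository's script. It assumes the gathered statistics are available as CSV in the format of HAProxy's "show stat" output, with a header line beginning with "# pxname" as the first line. The function name sum_lbtot, the file name, and the sample rows are hypothetical; the column is located by its header name rather than by a fixed position.

```bash
#!/bin/sh
# sum_lbtot STATS_CSV PATTERN
# Prints "svname lbtot" for each server row whose proxy name matches PATTERN,
# followed by a total. FRONTEND and BACKEND summary rows are skipped.
sum_lbtot() {
  awk -F',' -v pat="$2" '
    NR == 1 {
      sub(/^# */, "", $1)                 # header starts with "# pxname"
      for (i = 1; i <= NF; i++) col[$i] = i
      next
    }
    $col["pxname"] ~ pat && $col["svname"] != "BACKEND" && $col["svname"] != "FRONTEND" {
      printf "%s %s\n", $col["svname"], $col["lbtot"]
      total += $col["lbtot"]
    }
    END { printf "total %d\n", total }
  ' "$1"
}

# Example (not executed here; stats.csv is a hypothetical file name):
# sum_lbtot stats.csv mytestroute
```

A heavily uneven spread of lbtot values across the pods of one route is usually the first thing worth checking against the route's load-balance annotation.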
A simple count check on how many backends are in use per haproxy.config (and therefore per sharded instance of your router deployments). It is good for validating the total expected load on the pods, and may also provide insight into how much work these pods are doing if you observe latency during haproxy.config refresh or otherwise.
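The idea behind such a count can be sketched in a few lines. This is an illustration rather than the repository's script: count_backends is a hypothetical name, and it assumes each backend section in haproxy.config starts with the keyword backend at the beginning of a line.

```bash
#!/bin/sh
# count_backends FILE [FILE...]
# Prints "<file> <number of backend sections>" for each config file given.
count_backends() {
  for cb_cfg in "$@"; do
    # grep -c counts the lines that begin with "backend "
    printf '%s %s\n' "$cb_cfg" "$(grep -c '^backend ' "$cb_cfg")"
  done
}

# Example (not executed here; the glob is an assumption about file naming):
# count_backends haproxy-gather/*_haproxy.config
```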
A basic script to run a curl against a target address with logging parameter options. Designed to catch an unexpected response like a 502 error.
usage: curl_loop_until_error.sh <url>
You can also modify the script directly to include additional parameters, such as a payload or a POST request instead of GET. This is helpful when you need to catch a rare occurrence and can't watch for explicit timeframes.
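For readers who want to see the general shape of such a loop, here is a minimal sketch. It is not the contents of curl_loop_until_error.sh: the function name loop_until_error, the 10-second timeout, and the choice to treat anything other than a 2xx or 3xx status as an error are all assumptions made for illustration.

```bash
#!/bin/sh
# loop_until_error URL [INTERVAL_SECONDS]
# Requests URL repeatedly, logging a timestamp and the HTTP status code,
# and stops at the first response that is not 2xx or 3xx.
loop_until_error() {
  lue_url=$1
  lue_interval=${2:-1}
  while :; do
    # -s: silent, -o /dev/null: discard the body, -w: print only the status code
    lue_code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$lue_url")
    echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) $lue_url -> $lue_code"
    case $lue_code in
      2??|3??) sleep "$lue_interval" ;;
      *) echo "unexpected response $lue_code, stopping" >&2
         return 1 ;;
    esac
  done
}

# Example (not executed here):
# loop_until_error https://myroute.apps.example.com/ 1
```

A failed connection or timeout is reported by curl as status 000, so the loop stops on those as well.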
A simple UBI Dockerfile and companion pod.yaml that deploy a test suite pod to enable curl tests and other network config requests, so you can replicate and troubleshoot problems on OpenShift.