| copyright | lastupdated |
|---|---|
| | 2018-11-26 |
{:shortdesc: .shortdesc} {:new_window: target="_blank"} {:codeblock: .codeblock} {:pre: .pre} {:screen: .screen} {:tip: .tip}
{: #monitoring}
An {{site.data.keyword.cfee_full}} instance and its supporting infrastructure are monitored with an open-source toolset consisting of Prometheus and Grafana. The solution lets you analyze and visualize metrics in the Cloud Foundry environment and manage alerts on them. Monitoring takes place in three web consoles: a Grafana console, a Prometheus console, and a Prometheus Alertmanager console.
Note: Access to the monitoring capability in an {{site.data.keyword.cfee_full}} instance requires an Administrator or Editor role on the Kubernetes cluster supporting the CFEE instance. The default name of the Kubernetes cluster supporting a CFEE instance is `<CFEEname>-cluster`.
{: #prometheus}
Prometheus is an open-source systems monitoring and alerting toolkit. Since its inception in 2012, many companies and organizations have adopted Prometheus, and the project has a very active developer and user community. It is a standalone open-source project that is maintained independently of any company. To emphasize this, and to clarify the project's governance structure, Prometheus joined the Cloud Native Computing Foundation in 2016 as the second hosted project, after Kubernetes. See the Prometheus documentation for more information.
The Prometheus ecosystem consists of multiple components, many of which are optional:
- The main Prometheus server which scrapes and stores time series data.
- Alertmanager to handle alerts.
- Various special-purpose exporters like node exporter, blackbox exporter, etc.
- A push gateway for supporting short-lived jobs.
Prometheus gathers metrics from instrumented jobs, either directly or via an intermediary push gateway for short-lived jobs. It stores all gathered samples locally and runs rules over this data to either aggregate and record new time series from existing data, or to generate alerts.
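As a rough illustration of reading the stored samples back, the sketch below sends an instant query to the Prometheus HTTP API with `curl`. It assumes the Prometheus server is reachable at `http://localhost:9090`, the local port used by the port-forwarding step later in this topic. The `PROM_URL` variable and the `prom_query` function are hypothetical names introduced only for this example.

```shell
# Base URL of the Prometheus server (assumption: port 9090 forwarded to localhost).
PROM_URL="${PROM_URL:-http://localhost:9090}"

# prom_query: hypothetical helper that runs an instant PromQL query.
# $1 is the PromQL expression; curl URL-encodes it and sends it as a GET parameter.
prom_query() {
  curl -sG "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}
```

For example, `prom_query 'up'` returns a JSON document with one sample per scrape target. `up` is a metric that Prometheus records for every target it scrapes, with the value 1 when the last scrape succeeded and 0 when it failed.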
{: #grafana}
Grafana is an open-source analytics platform for all metrics collected by Prometheus. The Grafana instance deployed on your cluster is preconfigured to use the underlying Prometheus database and includes a set of default dashboards. See the Grafana documentation for more information.
{: #gettingStarted_monitor}
The Prometheus and Grafana components that make up the monitoring solution are pre-installed in the Kubernetes infrastructure supporting the CFEE instance. To access the monitoring tools, you need to forward the ports of the Prometheus, Prometheus Alertmanager, and Grafana servers. You do this through the Kubernetes CLI. The following steps guide you through installing the required CLIs, forwarding the server ports, and launching the consoles.
Note: The following instructions are also available in the {{site.data.keyword.cfee_full}} user interface. Open the CFEE instance user interface and click Monitoring in the left navigation pane to see the instructions displayed.
- Check your Access Policies to ensure that you have at least a Viewer role on the Kubernetes cluster supporting the environment.
- Install the IBM Cloud CLI.
- Install the Kubernetes CLI. If you have an existing Kubernetes CLI, we recommend that you install the latest version.
- Install the container service plug-in:
ibmcloud plugin install container-service -r Bluemix
{: pre}
- Log in to your IBM Cloud account:
ibmcloud login -a https://api.ng.bluemix.net
{: pre}
If you have a federated ID, use `ibmcloud login --sso` to log in to the IBM Cloud CLI.
- Target the IBM Cloud Container Service region in which you want to work (e.g., us-south):
ibmcloud cs region-set us-south
{: pre}
- Set the context of the cluster in your CLI:
a. Get the command to set the environment variable and download the Kubernetes configuration files:
ibmcloud cs cluster-config <CFEE_instance_name>-cluster
{: pre}
b. Set the KUBECONFIG environment variable. Copy the output from the previous command and paste it in your terminal. The command output should look similar to the following:
export KUBECONFIG=/Users/$USER/.bluemix/plugins/container-service/clusters/cf-admin-0703-cluster/kube-config-dal10-cf-admin-0703-cluster.yml
{: screen}
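Before running any `kubectl` commands, it can help to confirm that the variable really is set in the current terminal session. The following is a minimal sketch; `check_kubeconfig` is a hypothetical helper name, and it assumes that KUBECONFIG holds a single file path, as in the sample output above, rather than a colon-separated list.

```shell
# check_kubeconfig: hypothetical helper that verifies KUBECONFIG is set
# and points to an existing file (assumes a single path, not a list).
check_kubeconfig() {
  if [ -z "${KUBECONFIG:-}" ]; then
    echo "KUBECONFIG is not set in this shell" >&2
    return 1
  fi
  if [ ! -f "$KUBECONFIG" ]; then
    echo "KUBECONFIG file not found: $KUBECONFIG" >&2
    return 1
  fi
  echo "Using kubeconfig: $KUBECONFIG"
}
```

One possible use is `check_kubeconfig && kubectl get pods -n monitoring`, which lists the monitoring pods only when the configuration file is in place. The `monitoring` namespace is the one used by the port-forwarding command in this topic.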
- Set up port forwarding in the Kubernetes cluster for the pods running Prometheus, Alertmanager, and Grafana. This lets you access the monitoring metrics by proxy on your local machine (localhost):
sh -c 'kubectl -n monitoring port-forward deployment/prometheus-server 9090 & kubectl -n monitoring port-forward deployment/prometheus-alertmanager 9093 & kubectl -n monitoring port-forward deployment/grafana 3000'
{: pre}
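With the port-forward command still running, a second terminal can be used to check that the three local ports respond. The sketch below takes the ports 9090, 9093, and 3000 from the command above. The health-check paths (`/-/healthy` for Prometheus and Alertmanager, `/api/health` for Grafana) are assumptions about the versions deployed in your cluster, and `health_url` and `check_consoles` are hypothetical helper names.

```shell
# health_url: hypothetical helper that maps a service name to a local health URL.
# Ports come from the port-forward command; the paths are assumed, not confirmed.
health_url() {
  case "$1" in
    prometheus)   echo "http://localhost:9090/-/healthy" ;;
    alertmanager) echo "http://localhost:9093/-/healthy" ;;
    grafana)      echo "http://localhost:3000/api/health" ;;
    *) echo "unknown service: $1" >&2; return 1 ;;
  esac
}

# check_consoles: probes each forwarded port and reports whether it answered
# with a successful HTTP status within 5 seconds.
check_consoles() {
  for svc in prometheus alertmanager grafana; do
    if curl -sf -o /dev/null --max-time 5 "$(health_url "$svc")"; then
      echo "$svc: reachable"
    else
      echo "$svc: not reachable"
    fi
  done
}
```

Running `check_consoles` prints one line per service. A "not reachable" result usually means that the port-forward process has exited or that the local port is already in use.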
- Launch the Grafana console to see analytics on selected metrics. The CFEE instance includes default Grafana dashboards. These dashboards are interactive and give you a view of the infrastructure (the Kubernetes cluster) that hosts your CFEE instance. After you launch the Grafana console, click the Home button at the top of the console to select one of the pre-deployed dashboards (listed below), which graphs the corresponding metrics.

There is a default `admin` user in Grafana, with the default password set to `admin`. We recommend that you log in with the user ID and password `admin`/`admin`, and then change them to new credentials.

The following default dashboards are provided with the CFEE instance and are available from the Home dropdown.
Cloud Foundry dashboards:
- CF: Cells Capacity
  - Shows the general status of the Cloud Foundry cells where the Cloud Foundry applications are deployed.
- CF: Diego Cell dashboard
  - Shows the status of the Cloud Foundry cells and Diego components.
- CF: Router
  - Shows the status of the Cloud Foundry router running in your CFEE environment.
Dashboards for the Kubernetes infrastructure supporting your CFEE environment:
- Deployment
  - Shows the status of your Kubernetes deployments.
- Kubernetes Cluster Health
  - Shows the health of the Kubernetes cluster.
- Kubernetes Cluster Status
  - Shows the status of the Kubernetes cluster.
- Kubernetes Resource Requests
  - Shows the used CPU, memory and other parameters of the Kubernetes cluster.
- Pods
  - Shows details for each pod running on the Kubernetes cluster.
- Replica Set
  - Shows the status of the Kubernetes replica sets.
- Worker Nodes
  - Shows details for each worker node of the Kubernetes cluster.
- Worker Nodes Overview
  - Shows the CPU and memory usage of the Kubernetes infrastructure, along with its network traffic.
- Optionally, you can also launch the Prometheus console to see the raw data collected by the Prometheus server, and the Prometheus Alertmanager console to manage the alerts sent by the Prometheus server: