Commit
Adding prometheus, otel monitoring setup and docs (#168)
LGTM
1 parent
566714b
commit b06d739
Showing 7 changed files with 159 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
version: "3" | ||
services: | ||
  otel-collector: | ||
    container_name: "otel-collector" | ||
    image: otel/opentelemetry-collector-contrib | ||
    restart: always | ||
    command: | ||
      - --config=/etc/otelcol-contrib/otel-config.yaml | ||
    volumes: | ||
      - ./otel-collector-config.yaml:/etc/otelcol-contrib/otel-config.yaml | ||
    ports: | ||
      - "1888:1888" # pprof extension | ||
      - "8888:8888" # Prometheus metrics exposed by the Collector | ||
      - "8889:8889" # Prometheus exporter metrics | ||
      - "13133:13133" # health_check extension | ||
      - "4317:4317" # OTLP gRPC receiver | ||
      - "4318:4318" # OTLP http receiver | ||
      - "55679:55679" # zpages extension | ||
|
||
  prometheus: | ||
    image: prom/prometheus | ||
    container_name: prometheus | ||
    restart: always | ||
    command: | ||
      - --config.file=/etc/prometheus/prometheus.yml | ||
    ports: | ||
      - "9090:9090" | ||
    volumes: | ||
      - ./prometheus.yml:/etc/prometheus/prometheus.yml | ||
      - prometheus-data:/prometheus | ||
|
||
  grafana: | ||
    container_name: grafana | ||
    image: grafana/grafana | ||
    volumes: | ||
      - ./grafana-datasources.yml:/etc/grafana/provisioning/datasources/datasources.yml | ||
    ports: | ||
      - "3000:3000" | ||
|
||
volumes: | ||
  prometheus-data: |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
apiVersion: 1 | ||
|
||
datasources: | ||
  - name: Prometheus | ||
    type: prometheus | ||
    uid: prometheus | ||
    access: proxy | ||
    orgId: 1 | ||
    url: http://prometheus:9090 | ||
    basicAuth: false | ||
    isDefault: false | ||
    version: 1 | ||
    editable: false | ||
    jsonData: | ||
      httpMethod: GET |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
receivers: | ||
  otlp: | ||
    protocols: | ||
      http: | ||
|
||
processors: | ||
  # batch metrics before sending to reduce API usage | ||
  batch: | ||
|
||
exporters: | ||
  prometheus: | ||
    endpoint: "0.0.0.0:8889" | ||
    const_labels: | ||
      label: juno | ||
|
||
|
||
# https://github.com/open-telemetry/opentelemetry-collector/blob/main/extension/README.md | ||
extensions: | ||
  # responsible for responding to health check calls on behalf of the collector. | ||
  health_check: | ||
  # fetches the collector’s performance data | ||
  pprof: | ||
  # serves as an http endpoint that provides live debugging data about instrumented components. | ||
  zpages: | ||
|
||
service: | ||
  extensions: [health_check, pprof, zpages] | ||
  pipelines: | ||
    metrics: | ||
      receivers: [otlp] | ||
      processors: [batch] | ||
      exporters: [prometheus] |
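
The `const_labels` entry in the exporter config above should attach `label="juno"` to every series the collector exposes on port 8889. As a rough way to check that, the sketch below scans Prometheus text-exposition output and returns the sample lines that lack the label. The sample text and metric names are hypothetical, and the parsing is deliberately simplistic (it splits labels on commas, so it is not a full exposition-format parser).

```python
import re

# Matches a sample line with a label block, e.g. metric_name{a="b",c="d"} 1.0
_LABELS = re.compile(r'^[a-zA-Z_:][a-zA-Z0-9_:]*\{(?P<labels>.*)\}\s')


def samples_missing_label(exposition_text, name="label", value="juno"):
    """Return sample lines that do not carry name="value" in their label set."""
    wanted = f'{name}="{value}"'
    missing = []
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = _LABELS.match(line)
        # Naive split: assumes label values contain no commas.
        labels = match.group("labels").split(",") if match else []
        if wanted not in labels:
            missing.append(line)
    return missing


# Hypothetical exporter output; real Juno metric names will differ.
SAMPLE = '''# TYPE example_requests_total counter
example_requests_total{label="juno",job="proxy"} 42
example_latency_ms{job="storage"} 3.5
'''

print(samples_missing_label(SAMPLE))
```

In practice the input text could be fetched from `http://localhost:8889/metrics`, assuming the port mapping from docker-compose.yaml is in effect.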
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
global: | ||
  scrape_interval: 10s | ||
  evaluation_interval: 10s | ||
|
||
scrape_configs: | ||
  - job_name: 'otel-collector' | ||
    static_configs: | ||
      - targets: ['otel-collector:8889'] # Otlp | ||
  # uncomment to enable prometheus metrics | ||
  #- job_name: 'prometheus' | ||
  #  static_configs: | ||
  #    - targets: ['localhost:9090'] # Prometheus itself |
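
To see whether the `otel-collector` scrape job defined above is actually being scraped, the Prometheus HTTP API exposes a target listing at `/api/v1/targets`. The following is a minimal sketch, assuming Prometheus is reachable at `http://localhost:9090` (the published port in docker-compose.yaml); the helper names are made up for illustration.

```python
import json
from urllib.request import urlopen


def fetch_targets(base_url="http://localhost:9090"):
    """Fetch the scrape-target listing from the Prometheus HTTP API."""
    with urlopen(f"{base_url}/api/v1/targets", timeout=5) as resp:
        return json.load(resp)


def target_health(payload):
    """Map each active target's scrape URL to a (job, health) pair."""
    return {
        target["scrapeUrl"]: (target["labels"]["job"], target["health"])
        for target in payload["data"]["activeTargets"]
    }
```

Calling `target_health(fetch_targets())` against a running stack should list the `otel-collector` job; a health value other than `up` suggests the collector's exporter port is not reachable from the Prometheus container.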
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
## Monitor Juno Metrics using Prometheus | ||
|
||
#### A simple setup that pushes metrics to Prometheus through otel-collector is shown below. Grafana can then be used to build visualizations from the available metrics. | ||
|
||
<img | ||
src="otel_mon.png" | ||
style="display: block; margin: 0 auto;"> | ||
|
||
#### Setup | ||
|
||
##### Configure proxy and storage to push metrics to otel endpoint | ||
|
||
- The Juno proxy and storage services push metrics to the OpenTelemetry Collector endpoint http://localhost:4318/v1/metrics. Add or update the [OTEL] section in the respective config.toml files: | ||
|
||
```toml | ||
[OTEL] | ||
Enabled = true | ||
Environment = "qa" | ||
Host = "0.0.0.0" | ||
Poolname = "junoserv-ai" | ||
Port = 4318 | ||
Resolution = 10 | ||
UrlPath = "/v1/metrics" | ||
UseTls = false | ||
|
||
``` | ||
|
||
- The proxy and storage services now upload metrics to the otel endpoint. | ||
|
||
##### Set up otel-collector, prometheus and grafana | ||
- The OpenTelemetry Collector, Prometheus and Grafana run as Docker containers. | ||
- The otel-collector, Prometheus and Grafana configuration files must be mounted as volumes in the containers. | ||
- docker-compose.yaml and the configuration files for each service are available in junodb/docker/monitoring. | ||
|
||
|
||
```bash | ||
cd junodb/docker/monitoring | ||
|
||
docker compose up -d | ||
``` | ||
|
||
- Check the running containers (for example with `docker ps`). prometheus, otel-collector and grafana should all be running: | ||
|
||
```bash | ||
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES | ||
bcb1e7ece6b7 prom/prometheus "/bin/prometheus --c…" 3 hours ago Up 3 hours 0.0.0.0:9090->9090/tcp prometheus | ||
c3816c006f85 otel/opentelemetry-collector-contrib "/otelcol-contrib --…" 3 hours ago Up 3 hours 0.0.0.0:1888->1888/tcp, 0.0.0.0:4317-4318->4317-4318/tcp, 0.0.0.0:8888-8889->8888-8889/tcp, 0.0.0.0:13133->13133/tcp, 0.0.0.0:55679->55679/tcp, 55678/tcp otel-collector | ||
e41e33696606 grafana/grafana "/run.sh" 3 hours ago Up 3 hours 0.0.0.0:3000->3000/tcp grafana | ||
|
||
``` | ||
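
Beyond checking that the containers are up, each service can be probed over HTTP. The sketch below is one way to do that, assuming the stack runs on the local host with the published ports from docker-compose.yaml. The endpoint map is an assumption: it relies on the collector's health_check extension answering on port 13133 at `/`, on Prometheus's `/-/healthy` endpoint, and on Grafana's `/api/health` endpoint.

```python
from urllib.error import URLError
from urllib.request import urlopen

# Assumed endpoints, derived from the published ports in docker-compose.yaml.
ENDPOINTS = {
    "otel-collector": "http://localhost:13133/",      # health_check extension
    "prometheus": "http://localhost:9090/-/healthy",  # Prometheus health endpoint
    "grafana": "http://localhost:3000/api/health",    # Grafana health endpoint
}


def is_healthy(url, timeout=3.0):
    """Return True if the URL answers with HTTP 200, False otherwise."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (URLError, OSError):
        # Covers refused connections, timeouts and non-2xx HTTP errors.
        return False


if __name__ == "__main__":
    for name, url in ENDPOINTS.items():
        print(f"{name}: {'up' if is_healthy(url) else 'unreachable'}")
```

If the collector probe fails while the container is running, the health_check extension may be bound to an address that is not reachable from outside the container, which depends on the collector version and its defaults.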
|
||
- Open the Prometheus server at <host_ip>:9090, as shown below, and search for Juno metrics. | ||
|
||
<img | ||
src="prometheus.png" | ||
style="display: block; margin: 0 auto;"> | ||
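
The same search can be done from a script through the Prometheus HTTP API. Since the collector's exporter config sets the constant label `label: juno`, one option is an instant query that selects every series carrying that label. This is a sketch under that assumption; the base URL and helper names are illustrative, and actual Juno metric names are not shown here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def query_by_const_label(base_url="http://localhost:9090", label="label", value="juno"):
    """Run an instant query selecting every series that carries label="value"."""
    params = urlencode({"query": f'{{{label}="{value}"}}'})
    with urlopen(f"{base_url}/api/v1/query?{params}", timeout=5) as resp:
        return json.load(resp)


def metric_names(payload):
    """Return the sorted, de-duplicated metric names in an instant-query result."""
    return sorted(
        {series["metric"].get("__name__", "") for series in payload["data"]["result"]}
    )
```

`metric_names(query_by_const_label())` should return a non-empty list once the proxy and storage services have pushed at least one batch and Prometheus has scraped the collector.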
|
||
|
||
|