
Commit 1eba7d8

Add example for monitoring postgres slow queries (#64)
* Add example of docker monitoring
* Add postgres monitoring example
* Add postgres monitoring example
* Update the instructions on permissions
1 parent 75a5d8d commit 1eba7d8

4 files changed: +308 -0 lines changed

postgres/README.md

+186
@@ -0,0 +1,186 @@
# Monitoring Postgres with OpenTelemetry and Last9

A guide to setting up Postgres monitoring with the OpenTelemetry Collector and Last9. The stack collects Postgres metrics, including details of slow queries, through the Postgres Exporter and sends them to Last9.

## Installation

### 1. Prerequisites

Ensure Docker and Docker Compose are installed on your system:

```bash
# Check Docker installation
docker --version

# Check Docker Compose installation
docker compose version
```

### 2. Configure OpenTelemetry Collector

The setup uses the `otel-collector-config.yaml` file, which defines:

- A Prometheus receiver for scraping Postgres metrics
- Processors for batch processing and resource detection
- The Last9 exporter configuration

Before proceeding, update the Last9 authorization token:

```bash
# Edit the config file
nano otel-collector-config.yaml
```

In the `exporters` section, replace `<LAST9_OTLP_AUTH_HEADER>` with your Last9 authorization header and `<LAST9_OTLP_ENDPOINT>` with the endpoint URL. You can get the auth header from Last9 Integrations.

### 3. Configure Postgres Exporter

Update the environment variables in `docker-compose.yaml`. Replace the following placeholders with your actual Postgres database information:

- `<DB_HOST>`: Your Postgres database host
- `<DB_NAME>`: Your Postgres database name
- `<DB_USER>`: Your Postgres database username
- `<DB_PASSWORD>`: Your Postgres database password

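
To make the placeholder substitution concrete, here is a minimal sketch of the `postgres-exporter` service with the values filled in. The host, port, database name, user, and password are all hypothetical. The `sslmode=disable` parameter is an assumption that only fits a server without TLS; drop it if your server uses TLS.

```yaml
# Sketch of the postgres-exporter service in docker-compose.yaml.
# Every value below is a hypothetical stand-in for your own settings.
services:
  postgres-exporter:
    image: quay.io/prometheuscommunity/postgres-exporter
    environment:
      # host[:port]/database, with no user or password component
      - DATA_SOURCE_URI=db.example.internal:5432/appdb?sslmode=disable
      # Credentials are supplied separately from the URI
      - DATA_SOURCE_USER=postgres_exporter
      - DATA_SOURCE_PASS=your_secure_password
```

Keeping the credentials in `DATA_SOURCE_USER` and `DATA_SOURCE_PASS` rather than in the URI means the URI can be shared or logged without exposing the password.
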
### 4. Start the Monitoring Stack

```bash
docker compose -f docker-compose.yaml up -d
```

This starts:

- Postgres Exporter that collects metrics from your Postgres database
- OpenTelemetry Collector that scrapes metrics from Postgres Exporter and forwards them to Last9

### Understanding the Setup

#### Postgres Exporter

The Postgres Exporter connects to your Postgres database and exposes metrics in Prometheus format. It's configured to:

- Connect to your database using the provided credentials
- Use custom queries defined in `queries.yaml`
- Expose metrics on port 9187

#### Custom Queries

The `queries.yaml` file defines custom metrics to collect from Postgres. The example includes a `slow_queries` metric that:

- Identifies non-idle queries that have been running for longer than 1 minute
- Collects detailed information about these queries, including:
  - Process ID
  - Database name
  - Username
  - Query text
  - Execution time
  - Wait events
  - Blocking processes

#### OpenTelemetry Collector

The OpenTelemetry Collector is configured to:

- Scrape metrics from Postgres Exporter every 60 seconds
- Add resource attributes like database name and environment
- Process metrics in batches
- Export metrics to Last9 using OTLP

### Verification

Verify the containers are running:

```bash
docker ps
```

Check Postgres Exporter metrics:

```bash
curl http://localhost:9187/metrics
```

Check OpenTelemetry Collector logs. The compose file does not set `container_name`, so address the container by its service name through `docker compose`:

```bash
docker compose logs otel-collector
```

### Troubleshooting

1. Container issues:

```bash
# Check container status
docker ps -a

# View container logs
docker compose logs postgres-exporter
docker compose logs otel-collector
```

2. Connection issues:

```bash
docker compose logs postgres-exporter
```

3. OpenTelemetry Collector issues:

```bash
# Validate the configuration mounted into the container
# (the collector image ships without a shell or cat)
docker compose run --rm --no-deps otel-collector validate --config=/etc/otel/collector/config.yaml

# Restart collector
docker compose restart otel-collector
```

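
When metrics reach the collector but do not show up in Last9, it can help to print what the collector is exporting. The `otel-collector-config.yaml` in this commit defines a `debug` exporter but does not reference it in the metrics pipeline. A minimal sketch of enabling it, assuming the rest of the file stays unchanged:

```yaml
# Fragment of otel-collector-config.yaml: only the service section changes.
service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [resourcedetection, resource, batch]
      # "debug" is already defined under exporters with verbosity: detailed
      exporters: [otlp/last9, debug]
```

After restarting the collector, the scraped metrics should appear in its logs. Detailed verbosity is noisy, so it is probably best to remove `debug` from the pipeline once the problem is found.
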
### Extending the Configuration

#### Adding More Custom Queries

You can extend `queries.yaml` to monitor additional aspects of your Postgres database:

- Connection metrics
- Table statistics
- Index usage
- Buffer cache hit ratio
- Replication lag

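
As an illustration of the first item, a connection metric could be added with an entry that follows the same structure as `slow_queries`. This is a sketch rather than part of the commit: the `pg_connections` name and its column aliases are hypothetical choices.

```yaml
# Hypothetical additional entry for queries.yaml:
# number of connections per database and connection state.
pg_connections:
  query: |
    SELECT
      datname AS database,
      COALESCE(state, 'unknown') AS state,
      COUNT(*) AS count
    FROM pg_stat_activity
    WHERE datname IS NOT NULL
    GROUP BY 1, 2
  metrics:
    - database:
        usage: "LABEL"
        description: "Database name"
    - state:
        usage: "LABEL"
        description: "Connection state"
    - count:
        usage: "GAUGE"
        description: "Number of connections in this state"
```

Because the labels here are limited to database and state, this metric has far fewer series than one labelled by process ID or query text.
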
#### Monitoring Multiple Databases

To monitor multiple Postgres databases:

- Create separate instances of Postgres Exporter in your `docker-compose.yaml`
- Configure each with different database credentials
- Update the OpenTelemetry Collector configuration to scrape metrics from all exporters

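
The steps above can be sketched as follows for two databases. The service names, hosts, database names, and password placeholders are hypothetical. Host port mappings are left out on the assumption that only the collector, which shares the compose network, needs to reach the exporters.

```yaml
# Sketch of docker-compose.yaml with one exporter per database.
# Service names, hosts, and database names are hypothetical.
services:
  postgres-exporter-orders:
    image: quay.io/prometheuscommunity/postgres-exporter
    environment:
      - DATA_SOURCE_URI=orders-db.example.internal:5432/orders
      - DATA_SOURCE_USER=postgres_exporter
      - DATA_SOURCE_PASS=<ORDERS_DB_PASSWORD>
    volumes:
      - ./queries.yaml:/queries.yaml
    command: --extend.query-path="/queries.yaml"
    restart: unless-stopped
  postgres-exporter-billing:
    image: quay.io/prometheuscommunity/postgres-exporter
    environment:
      - DATA_SOURCE_URI=billing-db.example.internal:5432/billing
      - DATA_SOURCE_USER=postgres_exporter
      - DATA_SOURCE_PASS=<BILLING_DB_PASSWORD>
    volumes:
      - ./queries.yaml:/queries.yaml
    command: --extend.query-path="/queries.yaml"
    restart: unless-stopped
  otel-collector:
    image: otel/opentelemetry-collector-contrib:0.118.0
    volumes:
      - ./otel-collector-config.yaml:/etc/otel/collector/config.yaml
    command: --config=/etc/otel/collector/config.yaml
    depends_on:
      - postgres-exporter-orders
      - postgres-exporter-billing
```

In `otel-collector-config.yaml`, the `targets` list would then name both services, for example `["postgres-exporter-orders:9187", "postgres-exporter-billing:9187"]`. The static `db_name` resource attribute in that file would apply to metrics from both exporters, so a per-target label or attribute would be needed to tell the databases apart.
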
## Required Postgres Permissions

The Postgres Exporter needs specific permissions to access system catalog tables and views, especially for the custom queries defined in `queries.yaml`. The `slow_queries` query reads `pg_stat_activity`, so create a dedicated monitoring user with appropriate permissions:

```sql
-- Create a dedicated user for monitoring
CREATE USER postgres_exporter WITH PASSWORD 'your_secure_password';

-- Grant permissions required for monitoring
GRANT pg_monitor TO postgres_exporter;

-- If using a PostgreSQL version earlier than 10, you'll need these specific grants instead:
-- GRANT SELECT ON pg_stat_activity TO postgres_exporter;
-- GRANT SELECT ON pg_stat_replication TO postgres_exporter;
-- GRANT SELECT ON pg_stat_database TO postgres_exporter;
```

When configuring the Postgres Exporter in your `docker-compose.yaml`, make sure to use this dedicated monitoring user: set `<DB_USER>` to `postgres_exporter` and `<DB_PASSWORD>` to the password chosen above.

```yaml
environment:
  - DATA_SOURCE_URI=<DB_HOST>/<DB_NAME>
  - DATA_SOURCE_USER=<DB_USER>
  - DATA_SOURCE_PASS=<DB_PASSWORD>
```

### Additional Permissions for Custom Metrics

If you add more custom queries to `queries.yaml` that access other system tables or views, you may need to grant additional permissions. For example:

- For table statistics: `GRANT SELECT ON pg_statio_user_tables TO postgres_exporter;`
- For index usage: `GRANT SELECT ON pg_stat_user_indexes TO postgres_exporter;`
- For replication monitoring: `GRANT SELECT ON pg_stat_replication TO postgres_exporter;`

Always follow the principle of least privilege by granting only the permissions necessary for monitoring purposes.

postgres/docker-compose.yaml

+20
@@ -0,0 +1,20 @@
services:
  postgres-exporter:
    image: quay.io/prometheuscommunity/postgres-exporter
    environment:
      - DATA_SOURCE_URI=<DB_HOST>/<DB_NAME>
      - DATA_SOURCE_USER=<DB_USER>
      - DATA_SOURCE_PASS=<DB_PASSWORD>
    volumes:
      - ./queries.yaml:/queries.yaml
    command: --extend.query-path="/queries.yaml"
    restart: unless-stopped
    ports:
      - "9187:9187"
  otel-collector:
    image: otel/opentelemetry-collector-contrib:0.118.0
    volumes:
      - ./otel-collector-config.yaml:/etc/otel/collector/config.yaml
    command: --config=/etc/otel/collector/config.yaml
    depends_on:
      - postgres-exporter

postgres/otel-collector-config.yaml

+44
@@ -0,0 +1,44 @@
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: postgres-exporter
          scrape_interval: 60s
          static_configs:
            - targets: ["postgres-exporter:9187"]

processors:
  batch:
    timeout: 10s
    send_batch_size: 10000
  resourcedetection:
    detectors: [env, system, docker, ec2, azure, gcp]
    timeout: 2s
  resource:
    attributes:
      - key: db_name
        value: postgres-db
        action: upsert
      - key: deployment.environment
        value: dev
        action: upsert

exporters:
  otlp/last9:
    endpoint: "<LAST9_OTLP_ENDPOINT>" # Replace with actual Last9 endpoint if different
    headers:
      "Authorization": "Basic <LAST9_OTLP_AUTH_HEADER>"

  debug:
    verbosity: detailed

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [resourcedetection, resource, batch]
      exporters: [otlp/last9]

  telemetry:
    logs:
      level: info

postgres/queries.yaml

+58
@@ -0,0 +1,58 @@
slow_queries:
  query: |
    SELECT
      pid,
      datname AS database,
      usename AS user,
      application_name,
      client_addr,
      EXTRACT(EPOCH FROM (now() - query_start)) AS query_time_seconds,
      REGEXP_REPLACE(SUBSTRING(query, 1, 500), E'[\n\r]+', ' ', 'g') AS query,
      state,
      wait_event_type,
      wait_event,
      backend_type,
      pg_blocking_pids(pid) AS blocked_by
    FROM pg_stat_activity
    WHERE (now() - query_start) > interval '1 minute'
      AND state <> 'idle'
      AND query NOT ILIKE '%pg_stat%'
      AND query NOT ILIKE '%pg_catalog%'
    ORDER BY query_time_seconds DESC
  metrics:
    - pid:
        usage: "LABEL"
        description: "Process ID"
    - database:
        usage: "LABEL"
        description: "Database name"
    - user:
        usage: "LABEL"
        description: "Username"
    - application_name:
        usage: "LABEL"
        description: "Application name"
    - client_addr:
        usage: "LABEL"
        description: "Client address"
    - query_time_seconds:
        usage: "GAUGE"
        description: "Query execution time in seconds"
    - query:
        usage: "LABEL"
        description: "Query text (first 500 chars)"
    - state:
        usage: "LABEL"
        description: "Query state"
    - wait_event_type:
        usage: "LABEL"
        description: "Type of event the process is waiting for"
    - wait_event:
        usage: "LABEL"
        description: "Name of the event the process is waiting for"
    - backend_type:
        usage: "LABEL"
        description: "Type of backend"
    - blocked_by:
        usage: "LABEL"
        description: "PIDs of sessions blocking this query"
