This repository has been archived by the owner on Dec 29, 2020. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 2
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
4 changed files
with
246 additions
and
82 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,80 @@ | ||
# Sizes | ||
|
||
## Row Size | ||
|
||
Samples were taken from an 8 node, 90 pod cluster being monitored by a single Prometheus replica, for roughly an | ||
8-hour period of low overnight usage. Approximately 2.13 million samples were collected per hourly chunk, or 18.13 | ||
million total. | ||
|
||
Sample rows weighed an average of `262.55` bytes after chunking and indexing, but without compression. The `lid` | ||
column is a consistent 29 bytes, while the `name` varies between 38 and 69 bytes. | ||
|
||
These samples had 32793 unique label sets (`lid`s), of which 54.71% were still active by the end. Label rows weighed | ||
an average of `2686.95` bytes after indexing (BTREE for the `lid`, GIN for the `labels`), or 84MB total. | ||
|
||
Removing labels from the samples table and deduplicating them yielded a 59.75% reduction in total disk space. | ||
|
||
Enabling compression on the `metric_samples` table for chunks older than 6 hours, using `orderby = 'time'` and | ||
`segmentby = 'lid'`, brought each chunk from 530MB to below 10MB: | ||
|
||
```sql | ||
SELECT | ||
hypertable_name, | ||
chunk_name, | ||
compressed_total_bytes, | ||
uncompressed_total_bytes | ||
FROM timescaledb_information.compressed_chunk_stats | ||
WHERE compression_status = 'Compressed'; | ||
|
||
hypertable_name | chunk_name | compressed_total_bytes | uncompressed_total_bytes | ||
-----------------+-----------------------------------------+------------------------+-------------------------- | ||
metric_samples | _timescaledb_internal._hyper_1_54_chunk | 9680 kB | 530 MB | ||
metric_samples | _timescaledb_internal._hyper_1_57_chunk | 9712 kB | 531 MB | ||
metric_samples | _timescaledb_internal._hyper_1_58_chunk | 9632 kB | 530 MB | ||
metric_samples | _timescaledb_internal._hyper_1_59_chunk | 9744 kB | 529 MB | ||
metric_samples | _timescaledb_internal._hyper_1_60_chunk | 9616 kB | 530 MB | ||
(5 rows) | ||
|
||
SELECT | ||
hypertable_name, | ||
chunk_name, | ||
pg_size_bytes(uncompressed_total_bytes)::float / pg_size_bytes(compressed_total_bytes) AS compression_ratio | ||
FROM timescaledb_information.compressed_chunk_stats | ||
WHERE compression_status = 'Compressed'; | ||
|
||
hypertable_name | chunk_name | compression_ratio | ||
-----------------+-----------------------------------------+------------------- | ||
metric_samples | _timescaledb_internal._hyper_1_54_chunk | 0.98216391509434 | ||
metric_samples | _timescaledb_internal._hyper_1_57_chunk | 0.982138653483992 | ||
metric_samples | _timescaledb_internal._hyper_1_58_chunk | 0.982252358490566 | ||
metric_samples | _timescaledb_internal._hyper_1_59_chunk | 0.982012051039698 | ||
metric_samples | _timescaledb_internal._hyper_1_60_chunk | 0.982281839622641 | ||
(5 rows) | ||
``` | ||
|
||
Compression yielded a 98.21% reduction in total disk space. The `GROUP BY lid ORDER BY time DESC` causes each | ||
timeseries, or unique set of labels, to be grouped together and time order often allows delta compression. | ||
For short enough chunks, this results in a small number of samples: less than 2000 per timeseries | ||
in the test cluster. The compressed chunk usually had one row per active timeseries, indicating that most or | ||
all samples could be compressed together (the `lid` and thus the `name` are constant and can be stored once). | ||
|
||
When the last 6 hours are considered live for a 24 hour retention period for full resolution samples, the final | ||
reduction in disk space was 84.14% compared to the previous schema. This becomes more effective as the ratio of | ||
active to compressed chunks decreases, with a 94.00% reduction for a 72 hour retention period and upwards of 97% | ||
for a 10 day retention period. | ||
|
||
## Further Research | ||
|
||
- Removing the `name` from `metric_samples` would reduce row size further and make each row an identical 45 bytes | ||
(before indexing), with `lid` replacing it in the `(time, name, time)` index. This would, however, require the | ||
query planner to hit `metric_labels` first, then `metric_samples` using the `(lid, time)` within each chunk | ||
(Timescale itself plans the first `time` to a set of chunks). | ||
|
||
- Putting `metric_labels` into a hypertable would allow Timescale to drop or compress old labels automatically, | ||
but each insert could bump a row from an old chunk into a new one, which seems like a performance problem and | ||
would make it difficult to compress chunks. Most `lid`s remain inactive when they have not been used in a few | ||
minutes, when their pod or job completes, but there are no guarantees around that. | ||
|
||
- Using a TTL cache rather than LRU (or 2Q) would allow the adapter to drop old label sets after some time. The | ||
maximum size provided to the LRU cache prevents labels from accumulating without limit over time, but does | ||
nothing to evict labels that have not been seen recently. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,20 @@ | ||
module github.com/ssube/prometheus-sql-adapter | ||
|
||
go 1.13 | ||
|
||
require ( | ||
github.com/go-kit/kit v0.9.0 | ||
github.com/gogo/protobuf v1.3.1 | ||
github.com/golang/snappy v0.0.1 | ||
github.com/grpc-ecosystem/grpc-gateway v1.12.1 // indirect | ||
github.com/hashicorp/golang-lru v0.5.3 | ||
github.com/lib/pq v1.2.0 | ||
github.com/pkg/errors v0.8.1 | ||
github.com/prometheus/client_golang v1.2.1 | ||
github.com/prometheus/common v0.7.0 | ||
github.com/prometheus/prometheus v2.5.0+incompatible | ||
github.com/robfig/cron/v3 v3.0.0 | ||
google.golang.org/genproto v0.0.0-20191115221424-83cc0476cb11 // indirect | ||
google.golang.org/grpc v1.25.1 // indirect | ||
gopkg.in/alecthomas/kingpin.v2 v2.2.6 | ||
) |
Oops, something went wrong.