Restore tenant configuration for Loki forwarder.
This configuration will not be required for our default Loki instance,
but may be required for custom Loki instances.
alanconway committed Jul 30, 2021
1 parent a5f4ca8 commit 7f74862
Showing 1 changed file with 42 additions and 33 deletions.
75 changes: 42 additions & 33 deletions enhancements/cluster-logging/forward-to-loki.md
@@ -21,8 +21,8 @@ superseded-by:
- [X] Enhancement is `implementable`
- [X] Design details are appropriately documented from clear requirements
- [X] Test plan is defined
- [ ] Operational readiness criteria is defined
- [ ] Graduation criteria for dev preview, tech preview, GA
- [X] Operational readiness criteria is defined
- [X] Graduation criteria for dev preview, tech preview, GA
- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/)

## Summary
@@ -78,9 +78,10 @@ To summarize the key points:
- Minimize the _total number_ of labels per stream
- There is a limit of 30 max, fewer labels == better performance.
- Log streams must be _ordered by collection timestamp_.
- Must not allow distinct nodes with separate clocks to contribute to a single stream.
- Must not combine independent log streams that may have out-of-order clocks.\
**Note**: Grafana has said it will remove this restriction in future; this design works with the restriction in place.

All logging meta-data will still be included in log records.
All log record data will still be available as a JSON object in the Loki log payload.
Labels do not need to _identify the source of logs_, only to _partition the search space_.
At query time labels reduce the search space, then Loki uses log content to complete the search.

@@ -105,26 +106,18 @@ The user can configure a different set for specific needs.

The default label set is:

`log_type`: One of `application`, `infrastructure`, `audit`. Basic log categorization.
* `log_type`: Category of log: `application`, `infrastructure` or `audit`.
* `kubernetes.namespace_name`: namespace where the log originated.
* `kubernetes.pod_name`: name of the pod where the log originated.
* `kubernetes.pod_id`: globally unique pod identifier.
* `kubernetes.container_name`: name of the container within the pod.

`cluster`: The openshift cluster name as printed by: \
`oc get infrastructure/cluster -o template="{{.status.infrastructureName}}"` \
NOTE: If the cluster name is not available, no cluster label is applied. This is not an error.
**Note:** The labels `kubernetes.pod_id` and `kubernetes.container_name` are required
to separate log streams that may have inconsistent time-stamps. These labels
are always added to a user-defined label set.

`kubernetes.namespace_name`: namespace where the log originated.

`kubernetes.pod_name`: name of the pod where the log originated.

`kubernetes.host`: Host name of the cluster node where the log record originated.\
This is *always included* to guarantee ordered streams, even if the user configures a label set without it.

**Note:** `container_name` and `image_name` are *not* included in the defaults. They are not high-cardinality by themselves, but they are multipliers of `namespace/pod`, which is the highest cardinality we want to allow by default.

**Note:** We use names rather than UID for cluster, namespace and pod.
Names are more likely to be known and easier to use in queries.
Using both names *and* UIDs is redundant, they partition the data in approximately the same way.

UIDs (and all other meta-data) are still available in the log payload for filtering in queries.
**Note:** Labels are used to partition the search space.
The full log meta-data is available in the JSON payload for filtering.

### Proposed API

@@ -133,6 +126,7 @@ Add a new `loki` output type to the `ClusterLogForwarder` API:
``` yaml
- name: myLokiOutput
type: loki

url: ...
secret: ...
```
@@ -141,13 +135,20 @@ Add a new `loki` output type to the `ClusterLogForwarder` API:
The `secret` may also contain `username` and `password` fields for Loki.

The following optional output fields are Loki-specific:

- `labelKeys`: ([]string, default=_see [Default Loki Labels](#default-loki-labels))_ \
* `tenantKey`: (string, optional) \
Tenant name (also known as org-id) to add to Loki requests. See [Loki Multi-Tenancy](https://grafana.com/docs/loki/latest/operations/multi-tenancy/)
* `labelKeys`: ([]string, default=_see [Default Loki Labels](#default-loki-labels))_ \
A list of meta-data keys to replace the default labels.\
Keys are translated to [label names][] as described in [Summary of Loki Labels](#summary-of-loki-labels).\
Example: `kubernetes.labels.foo` => `kubernetes_labels_foo`.\
**Note**: `kubernetes.host` is *always* included, even if not requested.
It is required to ensure ordered label streams.
At least these keys are supported:
- `kubernetes.namespace_name`: Use the namespace name as the tenant ID.
- `kubernetes.labels.<key>`: Use the string value of the Kubernetes label with key `<key>`.
- `openshift.labels.<key>`: Use the value of a label attached by the forwarder.

**Note:** The labels `kubernetes.pod_id` and `kubernetes.container_name` are required
to separate log streams that may have inconsistent time-stamps. These labels
are always added to a user-defined label set.

The full set of meta-data keys is listed in [data model][].
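
The key-to-label translation and the always-added labels described above can be illustrated with a minimal Python sketch. This is not the forwarder's implementation: the names `to_label_name`, `effective_label_keys` and `REQUIRED_KEYS` are hypothetical, and the rule of replacing every character outside `[a-zA-Z0-9_]` with `_` is an assumption based on the Prometheus label-name format and the `kubernetes.labels.foo` example.

```python
import re

# Keys the note above says are always added to a user-defined label set.
REQUIRED_KEYS = ["kubernetes.pod_id", "kubernetes.container_name"]


def to_label_name(key: str) -> str:
    """Translate a meta-data key to a label name.

    Assumed rule: any character that is not a letter, digit or
    underscore is replaced by an underscore.
    """
    return re.sub(r"[^a-zA-Z0-9_]", "_", key)


def effective_label_keys(label_keys):
    """Return the user's keys followed by any required keys that are missing."""
    keys = list(label_keys)
    for required in REQUIRED_KEYS:
        if required not in keys:
            keys.append(required)
    return keys


if __name__ == "__main__":
    for key in effective_label_keys(["log_type", "kubernetes.labels.foo"]):
        print(key, "=>", to_label_name(key))
```

Appending the required keys after the user's keys keeps the user's ordering intact; the order of labels is not significant to the proposal, so this is only a presentational choice.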

@@ -160,11 +161,24 @@ Notes on plug-in configuration:
- Security: configured by the `output.secret` as usual.
- K8s labels as Loki labels: supported by the plug-in.
- Always include `kubernetes.host` to avoid out-of-order streams.
- Tenant set from `output.tenantKey`
- Output format is `json`, serializes the fluentd record like other outputs.
- Static labels set as `extra_labels` to avoid extracting from each record.

### User Stories

#### Treat each namespace as a separate Loki tenant

I want logs from each namespace to be directed to separate Loki tenants.

``` yaml
- name: myLokiOutput
type: loki
url: ...
secret: ...
tenantKey: kubernetes.namespace_name
```
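
As a rough illustration of what `tenantKey` implies, the Python sketch below looks up a dotted key in a log record to obtain the tenant ID. The helper `tenant_for_record`, the sample record, and the choice to return `None` when the key is absent are assumptions for illustration; the proposal does not specify how the forwarder resolves the key, and this sketch does not handle Kubernetes label keys that themselves contain dots.

```python
def tenant_for_record(record: dict, tenant_key: str):
    """Walk a dotted key through nested dicts; return the value as a string.

    Returns None if any part of the key is missing (assumed behaviour).
    """
    value = record
    for part in tenant_key.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return str(value)


# Hypothetical log record using field names from the default label set.
record = {
    "log_type": "application",
    "kubernetes": {"namespace_name": "my-app", "pod_name": "web-0"},
    "message": "hello",
}

if __name__ == "__main__":
    print(tenant_for_record(record, "kubernetes.namespace_name"))  # my-app
```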

#### Query all logs from a namespace

``` logql
@@ -243,16 +257,11 @@ For example a CI test cluster that constantly creates and destroys randomly-name

### Upgrade / Downgrade Strategy

Need to provide migration assistance for users coming from Elasticsearch.
Out of scope for this proposal.

### Version Skew Strategy
None.

## Implementation History
## Drawbacks

## Alternatives

None.

[label names]: https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels
[data model]: https://github.com/openshift/origin-aggregated-logging/blob/master/docs/com.redhat.viaq-openshift-project.asciidoc
