Restore tenant configuration for Loki forwarder.
This configuration will not be required for our default Loki instance,
but may be required for custom Loki instances.
alanconway committed Jul 22, 2021
1 parent a5f4ca8 commit 63d9314
Showing 1 changed file with 43 additions and 23 deletions.
66 changes: 43 additions & 23 deletions enhancements/cluster-logging/forward-to-loki.md
@@ -21,8 +21,8 @@ superseded-by:
- [X] Enhancement is `implementable`
- [X] Design details are appropriately documented from clear requirements
- [X] Test plan is defined
- [ ] Operational readiness criteria is defined
- [ ] Graduation criteria for dev preview, tech preview, GA
- [X] Operational readiness criteria is defined
- [X] Graduation criteria for dev preview, tech preview, GA
- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/)

## Summary
@@ -78,9 +78,10 @@ To summarize the key points:
- Minimize the _total number_ of labels per stream
- There is a limit of 30 max, fewer labels == better performance.
- Log streams must be _ordered by collection timestamp_.
- Must not allow distinct nodes with separate clocks to contribute to a single stream.
Logs collected from separate sources (log files, containers, nodes) must not go to the same Loki stream.\
**Note**: Grafana claim they will remove this restriction, but for now we need to live with it.

All logging meta-data will still be included in log records.
All log record data will still be available as a JSON object in the Loki log payload.
Labels do not need to _identify the source of logs_, only to _partition the search space_.
At query time labels reduce the search space, then Loki uses log content to complete the search.
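
The ordering constraint can be illustrated with a small Python sketch. It is not part of the proposal: the record layout (a `labels` dict plus a numeric `ts`) and the function names are assumptions made for illustration. It groups records into streams by their full label set and reports any stream whose timestamps go backwards, which is what happens when two nodes with separate clocks share one label set.

```python
def stream_key(labels):
    """A stream is identified by its complete, order-independent label set."""
    return tuple(sorted(labels.items()))


def out_of_order_streams(records):
    """Return the keys of streams whose records arrive out of timestamp order."""
    last_ts = {}
    bad = set()
    for rec in records:
        key = stream_key(rec["labels"])
        if key in last_ts and rec["ts"] < last_ts[key]:
            bad.add(key)
        # Remember the newest timestamp seen so far for this stream.
        last_ts[key] = max(last_ts.get(key, rec["ts"]), rec["ts"])
    return bad


def label(raw, include_host):
    """Attach hypothetical labels to raw records, optionally including the host."""
    out = []
    for r in raw:
        labels = {"log_type": "application"}
        if include_host:
            labels["kubernetes_host"] = r["host"]
        out.append({"ts": r["ts"], "labels": labels})
    return out


# Two nodes with skewed clocks whose records arrive interleaved.
raw = [
    {"ts": 100, "host": "node-a"},
    {"ts": 95, "host": "node-b"},
    {"ts": 101, "host": "node-a"},
    {"ts": 96, "host": "node-b"},
]

shared = out_of_order_streams(label(raw, include_host=False))
split = out_of_order_streams(label(raw, include_host=True))
```

Without the host label both nodes fall into a single stream and the check flags it; with the host label each node gets its own stream and each stream is ordered.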

@@ -105,26 +106,30 @@ The user can configure a different set for specific needs.

The default label set is:

`log_type`: One of `application`, `infrastructure`, `audit`. Basic log categorization.

`cluster`: The openshift cluster name as printed by: \
`oc get infrastructure/cluster -o template="{{.status.infrastructureName}}"` \
NOTE: If the cluster name is not available, no cluster label is applied. This is not an error.
`log_type`: One of `application`, `infrastructure` or `audit`.

`kubernetes.namespace_name`: namespace where the log originated.

`kubernetes.pod_name`: name of the pod where the log originated.

`kubernetes.host`: Host name of the cluster node where the log record originated.\
This is *always included* to guarantee ordered streams, even if the user configures a label set without it.
`kubernetes.container_name`: name of the container where the log originated.

`kubernetes.host`: Host name of the cluster node where the log record originated.

`_tag_`: added by the logging system to ensure sequential time-stamps per Loki stream.
Users should ignore this label: its contents may change or it may be removed.

**Note:** Labels are used to partition the search space for Loki query selectors.
The complete log meta-data is available in the JSON payload and can be used
in Loki filters.

**Note:** `container_name` and `image_name` are *not* included in the defaults. They are not high-cardinality by themselves, but they are multipliers of `namespace/pod`, which is the highest cardinality we want to allow by default.
### User selected labels and timestamp ordering

**Note:** We use names rather than UID for cluster, namespace and pod.
Names are more likely to be known and easier to use in queries.
Using both names *and* UIDs is redundant, they partition the data in approximately the same way.
The user can select an alternative label set.

UIDs (and all other meta-data) are still available in the log payload for filtering in queries.
The forwarder may add additional labels to ensure sequential time-stamps.
Users should rely only on the labels they have selected explicitly; the
labels added for sequencing may change in the future.

### Proposed API

@@ -133,6 +138,7 @@ Add a new `loki` output type to the `ClusterLogForwarder` API:
``` yaml
- name: myLokiOutput
type: loki

url: ...
secret: ...
```
@@ -148,6 +154,12 @@ The following optional output fields are Loki-specific:
Example: `kubernetes.labels.foo` => `kubernetes_labels_foo`.\
**Note**: `kubernetes.host` is *always* included, even if not requested.
It is required to ensure ordered label streams.
- `tenantKey`: (string, optional) \
Use the value of this meta-data key as the Loki tenant ID.\
At least these keys are supported:
- `kubernetes.namespace_name`: Use the namespace name as the tenant ID.
- `kubernetes.labels.<key>`: Use the string value of the Kubernetes label with key `<key>`.
- `openshift.labels.<key>`: Use the value of a label attached by the forwarder.

The full set of meta-data keys is listed in [data model][].
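
The conversion in the `kubernetes.labels.foo` example above can be sketched in Python. This is a minimal illustration, not the forwarder's implementation: the function name `loki_label_name` is hypothetical, and the rule (replace every character that is not valid in a Prometheus-style label name with `_`) is an assumption inferred from the single example given.

```python
import re

# Characters outside [a-zA-Z0-9_] are not valid in Prometheus-style label names.
_INVALID = re.compile(r"[^a-zA-Z0-9_]")


def loki_label_name(meta_key: str) -> str:
    """Map a meta-data key to a label name (assumed rule, see lead-in)."""
    name = _INVALID.sub("_", meta_key)
    # Label names must not start with a digit; prefix an underscore if so.
    if name and name[0].isdigit():
        name = "_" + name
    return name


converted = loki_label_name("kubernetes.labels.foo")
```

Note that under this assumed rule distinct keys can collide, for example `a.b` and `a_b` map to the same label name.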

@@ -160,11 +172,24 @@ Notes on plug-in configuration:
- Security: configured by the `output.secret` as usual.
- K8s labels as Loki labels: supported by the plug-in.
- Always include `kubernetes.host` to avoid out-of-order streams.
- Tenant set from `output.tenantKey`
- Output format is `json`, serializes the fluentd record like other outputs.
- Static labels set as `extra_labels` to avoid extracting from each record.

### User Stories

#### Treat each namespace as a separate Loki tenant

I want logs from each namespace to be directed to separate Loki tenants.

``` yaml
- name: myLokiOutput
type: loki
url: ...
secret: ...
tenantKey: kubernetes.namespace_name
```
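
How a `tenantKey` such as the one in this user story might be resolved against a log record can be sketched in Python. This is an illustration only: `tenant_for` and the nested record shape are hypothetical, and returning `None` when the key is missing is an assumption, since the proposal does not say what happens in that case.

```python
from typing import Any, Optional


def tenant_for(record: dict, tenant_key: str) -> Optional[str]:
    """Look up a dotted meta-data key in a nested record.

    Returns None if any path segment is missing or the value is not a string.
    Label keys that themselves contain dots would need different handling;
    this sketch splits naively on every dot.
    """
    value: Any = record
    for part in tenant_key.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value if isinstance(value, str) else None


# Hypothetical record with only the fields needed for the example.
record = {
    "kubernetes": {
        "namespace_name": "payments",
        "labels": {"team": "billing"},
    }
}

by_namespace = tenant_for(record, "kubernetes.namespace_name")
by_label = tenant_for(record, "kubernetes.labels.team")
missing = tenant_for(record, "openshift.labels.env")
```

With `tenantKey: kubernetes.namespace_name`, every record from the `payments` namespace would be sent to the `payments` tenant under this sketch.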

#### Query all logs from a namespace

``` logql
@@ -243,16 +268,11 @@ For example a CI test cluster that constantly creates and destroys randomly-name

### Upgrade / Downgrade Strategy

Need to provide migration assistance for users coming from Elasticsearch.
Out of scope for this proposal.

### Version Skew Strategy
None.

## Implementation History
## Drawbacks

## Alternatives

None.

[label names]: https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels
[data model]: https://github.com/openshift/origin-aggregated-logging/blob/master/docs/com.redhat.viaq-openshift-project.asciidoc
