
updated logcli
Jayclifford345 committed Dec 11, 2024
1 parent 2135b0a commit aefe771
Showing 5 changed files with 140 additions and 52 deletions.
96 changes: 70 additions & 26 deletions loki/logcli-tutorial/preprocessed.md
@@ -60,25 +60,25 @@ The logs are processed from Grafana Alloy to extract labels and structured metad

## Setup

To get started, we need to clone the [Alloy Scenario](https://github.com/grafana/alloy-scenarios) repository and spin up the mail-house example:
To get started, we need to clone the [Alloy Scenario](https://github.com/grafana/alloy-scenarios) repository and start the mail-house example:

1. Clone the repository:
```bash
git clone https://github.com/grafana/alloy-scenarios.git
```
1. Spin up the mail-house example:
1. Start the mail-house example:
```bash
docker compose -f alloy-scenarios/mail-house/docker-compose.yml up -d
```

This will start the mail-house example and expose the Loki instance on [`http://localhost:3100`](http://localhost:3100). We have also included a Grafana instance to verify the LogCLI results which can be accessed on [`http://localhost:3000`](http://localhost:3000).
This will start the mail-house example and expose the Loki instance at [`http://localhost:3100`](http://localhost:3100). We have also included a Grafana instance, accessible at [`http://localhost:3000`](http://localhost:3000), for verifying the LogCLI results.

### Connecting LogCLI to Loki

To connect LogCLI to the Loki instance, you need to set the `LOKI_ADDR` environment variable:

> **Tip:**
> If you are running this example against your own Loki instance and have configured authentication, you will need to set the `LOKI_USERNAME` and `LOKI_PASSWORD` environment variables as well.
> If you are running this example against your own Loki instance and have configured authentication, you will also need to set the `LOKI_USERNAME` and `LOKI_PASSWORD` environment variables.

```bash
export LOKI_ADDR=http://localhost:3100
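# If your Loki instance requires authentication (see the tip above), also
# set these environment variables (placeholder values shown):
# export LOKI_USERNAME=<your-username>
# export LOKI_PASSWORD=<your-password>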
@@ -111,7 +111,7 @@ As part of our role within the logistics company, we need to build a report on t

### Find all critical packages

To find all critical packages in the last hour, we can run the following query:
To find all critical packages in the last hour (default lookback time), we can run the following query:

```bash
logcli query '{service_name="Delivery World"} | package_status="critical"'
@@ -140,18 +140,60 @@ logcli query --since 24h --limit 100 '{service_name="Delivery World"} | package_

### Metric Queries

We can also use LogCLI to query logs based on metrics. For instance as part of the site report we want to count how many packages are being sent from California in the last 5 minutes. We can use the following query:
We can also use LogCLI to run metric queries over our logs. For instance, as part of the site report, we want to count the total number of packages sent from California over the last 24 hours, in 1 hour intervals. We can use the following query:

```bash
logcli query 'sum(count_over_time({state="California"}[5m]))'
logcli query --since 24h 'sum(count_over_time({state="California"}[1h]))'
```

Lets suppose we only want to know the number of packages of type `document` being sent from California in the last 5 minutes. We can use the following query:
This will return a JSON object containing a list of Unix timestamps and the number of packages sent from California in each 1 hour interval. Since we are summing the count of logs over time, we will see the total number of logs steadily increase. The output will look similar to the following:

```console
[
{
"metric": {},
"values": [
[
1733913765,
"46"
],
[
1733914110,
"114"
],
[
1733914455,
"179"
],
[
1733914800,
"250"
],
[
1733915145,
"318"
],
[
1733915490,
"392"
],
[
1733915835,
"396"
]
]
}
]
```

We can take this a step further and filter the logs based on the `package_type` field extracted from the JSON body. For instance, we can count the number of documents sent from California in the last 24 hours in 1 hour intervals:

```bash
logcli query 'count_over_time({state="California"}| json | package_type= "Documents" [5m])'
logcli query --since 24h 'sum(count_over_time({state="California"} | json | package_type="Documents" [1h]))'
```

This will return a JSON object similar to the one above, but showing only the number of documents sent from California in 1 hour intervals.

### Instant Metric Queries

Instant metric queries are a subset of metric queries that return the value of the metric at a specific point in time. This can be useful for quickly understanding an aggregate state of the stored logs.
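As a sketch (assuming the same California package stream used above; the tutorial's own command may differ), an instant metric query can be run with LogCLI's `instant-query` command:

```bash
logcli instant-query 'sum(count_over_time({state="California"}[24h]))'
```

Unlike `query`, `instant-query` evaluates the expression at a single point in time (now, by default), so it returns one value per series rather than a list of timestamped samples.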
@@ -173,12 +215,12 @@ This will return a result similar to the following:
"58"
]
}
]%
]
```

### Writing query results to a file

Another useful feature of LogCLI is the ability to write the query results to a file. This can be useful for offloading the results of our inventory report:
Another useful feature of LogCLI is the ability to write the query results to a file. This can be useful for downloading the results of our inventory report:

First we need to create a directory to store the logs:
```bash
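# The directory name below is an assumption, matching the ./inventory path
# referenced later in the tutorial:
mkdir -p ./inventory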
@@ -254,7 +296,7 @@ service_name 1 15
### Detected Fields
Another useful feature of LogCLI is the ability to detect fields in your logs. This can be useful for understanding the structure of your logs and the keys that are present. This will allow us to detect keys which could be promoted to labels and others to structured metadata.
Another useful feature of LogCLI is the ability to detect fields in your logs. This can be useful for understanding the structure of your logs and the keys that are present. This will let us detect keys which could be promoted to labels or to structured metadata.
```bash
logcli detected-fields --since 24h '{service_name="Delivery World"}'
@@ -263,17 +305,19 @@ logcli detected-fields --since 24h '{service_name="Delivery World"}'
This will return a list of all the keys detected in our logs. The output will look similar to the following:
```console
label: city type: string cardinality: 15
label: detected_level type: string cardinality: 3
label: note type: string cardinality: 7
label: package_id type: string cardinality: 7136
label: package_status type: string cardinality: 4
label: package_type type: string cardinality: 5
label: receiver_address type: string cardinality: 6962
label: receiver_name type: string cardinality: 100
label: sender_address type: string cardinality: 6981
label: sender_name type: string cardinality: 100
label: timestamp type: string cardinality: 7438
label: city type: string cardinality: 10
label: detected_level type: string cardinality: 3
label: note type: string cardinality: 7
label: package_id type: string cardinality: 20
label: package_size_extracted type: string cardinality: 3
label: package_status type: string cardinality: 4
label: package_type type: string cardinality: 5
label: receiver_address type: string cardinality: 20
label: receiver_name type: string cardinality: 19
label: sender_address type: string cardinality: 20
label: sender_name type: string cardinality: 19
label: state_extracted type: string cardinality: 5
label: timestamp type: string cardinality: 20
```
You can now see why we opted to keep `package_id` in structured metadata and `package_size` as a label. Package ID has a high cardinality and is unique to each log entry, making it a good candidate for structured metadata since we may need to query for it directly. Package size, on the other hand, has a low cardinality and is a good candidate for a label.
@@ -371,7 +415,7 @@ This will provide a similar JSON object but will aggregate the log volume into b
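A hedged sketch of such a bucketed volume query, assuming this step uses LogCLI's `volume_range` command with hourly steps:

```bash
logcli volume_range --since 24h --step 1h '{service_name="Delivery World"}'
```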
## Queries against static log files
In addition to querying logs from Loki, LogCLI also supports querying static log files. This can be useful for querying logs that are not stored in Loki using LogQL. Earlier in the tutorial we stored the logs in the `./inventory` directory. Lets run a similar query but pipe it into a log file:
In addition to querying logs from Loki, LogCLI also supports querying static log files. This can be useful for querying logs that are not stored in Loki. Earlier in the tutorial we stored the logs in the `./inventory` directory. Let's run a similar query but pipe it into a log file:
```bash
logcli query \
@@ -391,9 +435,9 @@ Next let's run a query against the static log file:
cat ./inventory/complete.log | logcli --stdin query '{service_name="Delivery World"} | json | package_status="critical"'
```
Note that since we are querying a static log file labels are not automatically detected:
Note that since we are querying a static log file, labels are not automatically detected:
* `{service_name="Delivery World"}` is optional in this case but is recommended for clarity.
* `json` is required to parse the log file as JSON. This allows us to extract the `package_status` field.
* `json` is required to parse the log file as JSON. This lets us extract the `package_status` field.
For example, suppose we try to query the log file without the `json` filter:
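A sketch of that command, based on the previous one with the `json` stage removed:

```bash
cat ./inventory/complete.log | logcli --stdin query '{service_name="Delivery World"} | package_status="critical"'
```

Since `package_status` is never extracted from the JSON log line, the label filter has nothing to match and no results are returned.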
8 changes: 4 additions & 4 deletions loki/logcli-tutorial/step1.md
@@ -1,25 +1,25 @@
# Setup

To get started, we need to clone the [Alloy Scenario](https://github.com/grafana/alloy-scenarios) repository and spin up the mail-house example:
To get started, we need to clone the [Alloy Scenario](https://github.com/grafana/alloy-scenarios) repository and start the mail-house example:

1. Clone the repository:
```bash
git clone https://github.com/grafana/alloy-scenarios.git
```{{exec}}
1. Spin up the mail-house example:
1. Start the mail-house example:
```bash
docker compose -f alloy-scenarios/mail-house/docker-compose.yml up -d
```{{exec}}
This will start the mail-house example and expose the Loki instance on [`http://localhost:3100`{{copy}}]({{TRAFFIC_HOST1_3100}}). We have also included a Grafana instance to verify the LogCLI results which can be accessed on [`http://localhost:3000`{{copy}}]({{TRAFFIC_HOST1_3000}}).
This will start the mail-house example and expose the Loki instance at [`http://localhost:3100`{{copy}}]({{TRAFFIC_HOST1_3100}}). We have also included a Grafana instance, accessible at [`http://localhost:3000`{{copy}}]({{TRAFFIC_HOST1_3000}}), for verifying the LogCLI results.
## Connecting LogCLI to Loki
To connect LogCLI to the Loki instance, you need to set the `LOKI_ADDR`{{copy}} environment variable:
> **Tip:**
> If you are running this example against your own Loki instance and have configured authentication, you will need to set the `LOKI_USERNAME`{{copy}} and `LOKI_PASSWORD`{{copy}} environment variables as well.
> If you are running this example against your own Loki instance and have configured authentication, you will also need to set the `LOKI_USERNAME`{{copy}} and `LOKI_PASSWORD`{{copy}} environment variables.
```bash
export LOKI_ADDR=http://localhost:3100
56 changes: 49 additions & 7 deletions loki/logcli-tutorial/step2.md
@@ -4,7 +4,7 @@ As part of our role within the logistics company, we need to build a report on t

## Find all critical packages

To find all critical packages in the last hour, we can run the following query:
To find all critical packages in the last hour (default lookback time), we can run the following query:

```bash
logcli query '{service_name="Delivery World"} | package_status="critical"'
@@ -33,18 +33,60 @@ logcli query --since 24h --limit 100 '{service_name="Delivery World"} | package_
## Metric Queries
We can also use LogCLI to query logs based on metrics. For instance as part of the site report we want to count how many packages are being sent from California in the last 5 minutes. We can use the following query:
We can also use LogCLI to run metric queries over our logs. For instance, as part of the site report, we want to count the total number of packages sent from California over the last 24 hours, in 1 hour intervals. We can use the following query:
```bash
logcli query 'sum(count_over_time({state="California"}[5m]))'
logcli query --since 24h 'sum(count_over_time({state="California"}[1h]))'
```{{exec}}
Lets suppose we only want to know the number of packages of type `document`{{copy}} being sent from California in the last 5 minutes. We can use the following query:
This will return a JSON object containing a list of Unix timestamps and the number of packages sent from California in each 1 hour interval. Since we are summing the count of logs over time, we will see the total number of logs steadily increase. The output will look similar to the following:
```console
[
{
"metric": {},
"values": [
[
1733913765,
"46"
],
[
1733914110,
"114"
],
[
1733914455,
"179"
],
[
1733914800,
"250"
],
[
1733915145,
"318"
],
[
1733915490,
"392"
],
[
1733915835,
"396"
]
]
}
]
```{{copy}}
We can take this a step further and filter the logs based on the `package_type`{{copy}} field extracted from the JSON body. For instance, we can count the number of documents sent from California in the last 24 hours in 1 hour intervals:
```bash
logcli query 'count_over_time({state="California"}| json | package_type= "Documents" [5m])'
logcli query --since 24h 'sum(count_over_time({state="California"} | json | package_type="Documents" [1h]))'
```{{exec}}
This will return a JSON object similar to the one above, but showing only the number of documents sent from California in 1 hour intervals.
## Instant Metric Queries
Instant metric queries are a subset of metric queries that return the value of the metric at a specific point in time. This can be useful for quickly understanding an aggregate state of the stored logs.
@@ -66,12 +108,12 @@ This will return a result similar to the following:
"58"
]
}
]%
]
```{{copy}}
## Writing query results to a file
Another useful feature of LogCLI is the ability to write the query results to a file. This can be useful for offloading the results of our inventory report:
Another useful feature of LogCLI is the ability to write the query results to a file. This can be useful for downloading the results of our inventory report:
First we need to create a directory to store the logs:
26 changes: 14 additions & 12 deletions loki/logcli-tutorial/step3.md
@@ -49,7 +49,7 @@ service_name 1 15
## Detected Fields
Another useful feature of LogCLI is the ability to detect fields in your logs. This can be useful for understanding the structure of your logs and the keys that are present. This will allow us to detect keys which could be promoted to labels and others to structured metadata.
Another useful feature of LogCLI is the ability to detect fields in your logs. This can be useful for understanding the structure of your logs and the keys that are present. This will let us detect keys which could be promoted to labels or to structured metadata.
```bash
logcli detected-fields --since 24h '{service_name="Delivery World"}'
@@ -58,17 +58,19 @@ logcli detected-fields --since 24h '{service_name="Delivery World"}'
This will return a list of all the keys detected in our logs. The output will look similar to the following:
```console
label: city type: string cardinality: 15
label: detected_level type: string cardinality: 3
label: note type: string cardinality: 7
label: package_id type: string cardinality: 7136
label: package_status type: string cardinality: 4
label: package_type type: string cardinality: 5
label: receiver_address type: string cardinality: 6962
label: receiver_name type: string cardinality: 100
label: sender_address type: string cardinality: 6981
label: sender_name type: string cardinality: 100
label: timestamp type: string cardinality: 7438
label: city type: string cardinality: 10
label: detected_level type: string cardinality: 3
label: note type: string cardinality: 7
label: package_id type: string cardinality: 20
label: package_size_extracted type: string cardinality: 3
label: package_status type: string cardinality: 4
label: package_type type: string cardinality: 5
label: receiver_address type: string cardinality: 20
label: receiver_name type: string cardinality: 19
label: sender_address type: string cardinality: 20
label: sender_name type: string cardinality: 19
label: state_extracted type: string cardinality: 5
label: timestamp type: string cardinality: 20
```{{copy}}
You can now see why we opted to keep `package_id`{{copy}} in structured metadata and `package_size`{{copy}} as a label. Package ID has a high cardinality and is unique to each log entry, making it a good candidate for structured metadata since we may need to query for it directly. Package size, on the other hand, has a low cardinality and is a good candidate for a label.
6 changes: 3 additions & 3 deletions loki/logcli-tutorial/step4.md
@@ -1,6 +1,6 @@
# Queries against static log files

In addition to querying logs from Loki, LogCLI also supports querying static log files. This can be useful for querying logs that are not stored in Loki using LogQL. Earlier in the tutorial we stored the logs in the `./inventory`{{copy}} directory. Lets run a similar query but pipe it into a log file:
In addition to querying logs from Loki, LogCLI also supports querying static log files. This can be useful for querying logs that are not stored in Loki. Earlier in the tutorial we stored the logs in the `./inventory`{{copy}} directory. Let's run a similar query but pipe it into a log file:

```bash
logcli query \
@@ -20,11 +20,11 @@ Next let's run a query against the static log file:
cat ./inventory/complete.log | logcli --stdin query '{service_name="Delivery World"} | json | package_status="critical"'
```{{exec}}
Note that since we are querying a static log file labels are not automatically detected:
Note that since we are querying a static log file, labels are not automatically detected:
- `{service_name="Delivery World"}`{{copy}} is optional in this case but is recommended for clarity.
- `json`{{copy}} is required to parse the log file as JSON. This allows us to extract the `package_status`{{copy}} field.
- `json`{{copy}} is required to parse the log file as JSON. This lets us extract the `package_status`{{copy}} field.
For example, suppose we try to query the log file without the `json`{{copy}} filter:
