Skip to content

Commit

Permalink
docs: APM updates for 7.14 (#104232)
Browse files Browse the repository at this point in the history
Co-authored-by: Nathan L Smith <nathan.smith@elastic.co>
  • Loading branch information
bmorelli25 and smith committed Jul 12, 2021
1 parent e88910a commit 9ab26cf
Show file tree
Hide file tree
Showing 21 changed files with 61 additions and 72 deletions.
1 change: 1 addition & 0 deletions docs/apm/agent-configuration.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,7 @@ Supported configurations are also tagged with the image:./images/dynamic-config.

[horizontal]
Go Agent:: {apm-go-ref}/configuration.html[Configuration reference]
iOS agent:: _Not yet supported_
Java Agent:: {apm-java-ref}/configuration.html[Configuration reference]
.NET Agent:: {apm-dotnet-ref}/configuration.html[Configuration reference]
Node.js Agent:: {apm-node-ref}/configuration.html[Configuration reference]
Expand Down
115 changes: 50 additions & 65 deletions docs/apm/apm-alerts.asciidoc
Original file line number Diff line number Diff line change
@@ -1,69 +1,57 @@
[role="xpack"]
[[apm-alerts]]
=== Alerts
=== Alerts and rules

++++
<titleabbrev>Create an alert</titleabbrev>
++++

The APM app allows you to define **rules** to detect complex conditions within your APM data
and trigger built-in **actions** when those conditions are met.

The APM app integrates with Kibana's {kibana-ref}/alerting-getting-started.html[alerting and actions] feature.
It provides a set of built-in **actions** and APM specific threshold **alerts** for you to use
and enables central management of all alerts from <<management,Kibana Management>>.
The following **rules** are supported:

* Latency anomaly rule:
Alert when latency of a service is abnormal
* Transaction error rate threshold rule:
Alert when the service's transaction error rate is above the defined threshold
* Error count threshold rule:
Alert when the number of errors in a service exceeds a defined threshold

[role="screenshot"]
image::apm/images/apm-alert.png[Create an alert in the APM app]

For a walkthrough of the alert flyout panel, including detailed information on each configurable property,
see Kibana's <<create-edit-rules,defining alerts>>.

The APM app supports four different types of alerts:

* Transaction duration anomaly:
alerts when the service's transaction duration reaches a certain anomaly score
* Transaction duration threshold:
alerts when the service's transaction duration exceeds a given time limit over a given time frame
* Transaction error rate threshold:
alerts when the service's transaction error rate is above the selected rate over a given time frame
* Error count threshold:
alerts when service exceeds a selected number of errors over a given time frame
For a complete walkthrough of the **Create rule** flyout panel, including detailed information on each configurable property,
see Kibana's <<create-edit-rules,create and edit rules>>.

Below, we'll walk through the creation of two of these alerts.
Below, we'll walk through the creation of two APM rules.

[float]
[[apm-create-transaction-alert]]
=== Example: create a transaction duration alert
=== Example: create a latency anomaly rule

Transaction duration alerts trigger when the duration of a specific transaction type in a service exceeds a defined threshold.
This guide will create an alert for the `opbeans-java` service based on the following criteria:
Latency anomaly rules trigger when the latency of a service is abnormal.
This guide will create an alert for all services based on the following criteria:

* Environment: Production
* Transaction type: `transaction.type:request`
* Average request is above `1500ms` for the last 5 minutes
* Check every 10 minutes, and repeat the alert every 30 minutes
* Send the alert via Slack
* Environment: production
* Severity level: critical
* Run every five minutes
* Send an alert to a Slack channel only when the rule status changes

From the APM app, navigate to the `opbeans-java` service and select
**Alerts** > **Create threshold alert** > **Transaction duration**.
From any page in the APM app, select **Alerts and rules** > **Latency** > **Create anomaly rule**.
Change the name of the alert, but do not edit the tags.

`Transaction duration | opbeans-java` is automatically set as the name of the alert,
and `apm` and `service.name:opbeans-java` are added as tags.
It's fine to change the name of the alert, but do not edit the tags.
Based on the criteria above, define the following rule details:

Based on the alert criteria, define the following alert details:
* **Check every** - `5 minutes`
* **Notify** - "Only on status change"
* **Environment** - `all`
* **Has anomaly with severity** - `critical`

* **Check every** - `10 minutes`
* **Notify every** - `30 minutes`
* **TYPE** - `request`
* **WHEN** - `avg`
* **IS ABOVE** - `1500ms`
* **FOR THE LAST** - `5 minutes`

Select an action type.
Multiple action types can be selected, but in this example, we want to post to a Slack channel.
Next, add a connector. Multiple connectors can be selected, but in this example we're interested in Slack.
Select **Slack** > **Create a connector**.
Enter a name for the connector,
and paste the webhook URL.
and paste your Slack webhook URL.
See Slack's webhook documentation if you need to create one.

A default message is provided as a starting point for your alert.
Expand All @@ -72,35 +60,32 @@ to pass additional alert values at the time a condition is detected to an action
A list of available variables can be accessed by selecting the
**add variable** button image:apm/images/add-variable.png[add variable button].

Select **Save**. The alert has been created and is now active!
Click **Save**. The rule has been created and is now active!

[float]
[[apm-create-error-alert]]
=== Example: create an error rate alert
=== Example: create an error count threshold alert

Error rate alerts trigger when the number of errors in a service exceeds a defined threshold.
This guide creates an alert for the `opbeans-python` service based on the following criteria:
The error count threshold alert triggers when the number of errors in a service exceeds a defined threshold.
This guide will create an alert for all services based on the following criteria:

* Environment: Production
* All environments
* Error rate is above 25 for the last minute
* Check every 1 minute, and repeat the alert every 10 minutes
* Send the alert via email to the `opbeans-python` team

From the APM app, navigate to the `opbeans-python` service and select
**Alerts** > **Create threshold alert** > **Error rate**.
* Check every 1 minute, and alert every time the rule is active
* Send the alert via email to the site reliability team

`Error rate | opbeans-python` is automatically set as the name of the alert,
and `apm` and `service.name:opbeans-python` are added as tags.
It's fine to change the name of the alert, but do not edit the tags.
From any page in the APM app, select **Alerts and rules** > **Error count** > **Create threshold rule**.
Change the name of the alert, but do not edit the tags.

Based on the alert criteria, define the following alert details:
Based on the criteria above, define the following rule details:

* **Check every** - `1 minute`
* **Notify every** - `10 minutes`
* **IS ABOVE** - `25 errors`
* **FOR THE LAST** - `1 minute`
* **Notify** - "Every time alert is active"
* **Environment** - `all`
* **Is above** - `25 errors`
* **For the last** - `1 minute`

Select the **Email** action type and click **Create a connector**.
Select the **Email** connector and click **Create a connector**.
Fill out the required details: sender, host, port, etc., and click **save**.

A default message is provided as a starting point for your alert.
Expand All @@ -109,14 +94,14 @@ to pass additional alert values at the time a condition is detected to an action
A list of available variables can be accessed by selecting the
**add variable** button image:apm/images/add-variable.png[add variable button].

Select **Save**. The alert has been created and is now active!
Click **Save**. The alert has been created and is now active!

[float]
[[apm-alert-manage]]
=== Manage alerts and actions
=== Manage alerts and rules

From the APM app, select **Alerts** > **View active alerts** to be taken to the Kibana alerts and actions management page.
From this page, you can create, edit, disable, mute, and delete alerts, and create, edit, and disable connectors.
From the APM app, select **Alerts and rules** > **Manage rules** to be taken to the Kibana **Rules and Connectors** page.
From this page, you can disable, mute, and delete APM alerts.

[float]
[[apm-alert-more-info]]
Expand All @@ -126,4 +111,4 @@ See {kibana-ref}/alerting-getting-started.html[alerting and actions] for more in

NOTE: If you are using an **on-premise** Elastic Stack deployment with security,
communication between Elasticsearch and Kibana must have TLS configured.
More information is in the alerting {kibana-ref}/alerting-setup.html#alerting-prerequisites[prerequisites].
More information is in the alerting {kibana-ref}/alerting-setup.html#alerting-prerequisites[prerequisites].
1 change: 1 addition & 0 deletions docs/apm/filters.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,7 @@ It's vital to be consistent when naming environments in your agents.
To learn how to configure service environments, see the specific agent documentation:

* *Go:* {apm-go-ref}/configuration.html#config-environment[`ELASTIC_APM_ENVIRONMENT`]
* *iOS agent:* _Not yet supported_
* *Java:* {apm-java-ref}/config-core.html#config-environment[`environment`]
* *.NET:* {apm-dotnet-ref}/config-core.html#config-environment[`Environment`]
* *Node.js:* {apm-node-ref}/configuration.html#environment[`environment`]
Expand Down
Binary file modified docs/apm/images/apm-agent-configuration.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file modified docs/apm/images/apm-alert.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file modified docs/apm/images/apm-error-group.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file modified docs/apm/images/apm-logs-tab.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file modified docs/apm/images/apm-services-overview.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file modified docs/apm/images/apm-settings.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file modified docs/apm/images/apm-span-detail.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file modified docs/apm/images/apm-traces.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file modified docs/apm/images/apm-transaction-duration-dist.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file modified docs/apm/images/apm-transaction-response-dist.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file modified docs/apm/images/apm-transaction-sample.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file modified docs/apm/images/apm-transactions-overview.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file modified docs/apm/images/service-maps-java.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file modified docs/apm/images/service-maps.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
1 change: 1 addition & 0 deletions docs/apm/service-maps.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -108,6 +108,7 @@ Service maps are supported for the following Agent versions:

[horizontal]
Go agent:: ≥ v1.7.0
iOS agent:: _Not yet supported_
Java agent:: ≥ v1.13.0
.NET agent:: ≥ v1.3.0
Node.js agent:: ≥ v3.6.0
Expand Down
12 changes: 6 additions & 6 deletions docs/apm/transactions.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -100,22 +100,22 @@ the selected transaction group.
image::apm/images/apm-transaction-response-dist.png[Example view of response time distribution]

[[transaction-duration-distribution]]
==== Transactions duration distribution
==== Latency distribution

This chart plots all transaction durations for the given time period.
A plot of all transaction durations for the given time period.
The screenshot below shows a typical distribution,
and indicates most of our requests were served quickly -- awesome!
It's the requests on the right, the ones taking longer than average, that we probably want to focus on.
It's the requests on the right, the ones taking longer than average, that we probably need to focus on.

[role="screenshot"]
image::apm/images/apm-transaction-duration-dist.png[Example view of transactions duration distribution graph]
image::apm/images/apm-transaction-duration-dist.png[Example view of latency distribution graph]

Select a transaction duration _bucket_ to display up to ten trace samples.
Select a latency duration _bucket_ to display up to ten trace samples.

[[transaction-trace-sample]]
==== Trace sample

Trace samples are based on the _bucket_ selection in the *Transactions duration distribution* chart;
Trace samples are based on the _bucket_ selection in the *Latency distribution* chart;
update the samples by selecting a new _bucket_.
The number of requests per bucket is displayed when hovering over the graph,
and the selected bucket is highlighted to stand out.
Expand Down
1 change: 1 addition & 0 deletions docs/apm/troubleshooting.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@ don't forget to check our other troubleshooting guides or discussion forum:
* {apm-server-ref}/troubleshooting.html[APM Server troubleshooting]
* {apm-dotnet-ref}/troubleshooting.html[.NET agent troubleshooting]
* {apm-go-ref}/troubleshooting.html[Go agent troubleshooting]
* {apm-ios-ref}/troubleshooting.html[iOS agent troubleshooting]
* {apm-java-ref}/trouble-shooting.html[Java agent troubleshooting]
* {apm-node-ref}/troubleshooting.html[Node.js agent troubleshooting]
* {apm-php-ref}/troubleshooting.html[PHP agent troubleshooting]
Expand Down
2 changes: 1 addition & 1 deletion docs/settings/apm-settings.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@ It is enabled by default.
// Any changes made in this file will be seen there as well.
// tag::apm-indices-settings[]

Index defaults can be changed in Kibana. Open the main menu, then click *APM > Settings > Indices*.
Index defaults can be changed in the APM app. Select **Settings** > **Indices**.
Index settings in the APM app take precedence over those set in `kibana.yml`.

[role="screenshot"]
Expand Down

0 comments on commit 9ab26cf

Please sign in to comment.