Lesson 6: Introduction to Filters

Goals

In this lesson we will discuss using filters in the observability pipeline. In the hands-on exercises you will use the built-in filters, then create and apply a custom filter. This lesson is intended for operators of Sensu, and assumes you have set up a local workshop environment.

Filters and Handlers

Sensu filters provide control over which events get processed by downstream handlers. A filter applies conditionals to an event stream in real time using Sensu Query Expressions (SQEs).

By default, a Sensu handler will process all events sent to it. This is rarely the desired behavior, so most handlers have event filters applied to limit which events they process.

Sensu Query Expressions (SQEs)

Filters are written using simple JavaScript, known as Sensu Query Expressions (SQEs). SQEs are ECMAScript 5 expressions that return either true or false.

SQEs can be as simple as basic comparison operations, such as "less than" (<), "greater than" (>), "equal to" (==), or "not equal" (!=), or as complex as small JavaScript programs. You can even package filter logic as JavaScript libraries and import them into the sandbox environment using Dynamic Runtime Assets!

Common uses for filters include:

  • Eliminating alert fatigue by deduplicating incoming events and limiting repeat processing to predefined conditions (e.g. only alert once per hour per incident)
  • Optimizing metrics processing by dropping events that do not contain metric data, or sampling metrics to reduce storage costs
  • Orchestrating event processing via occurrence filtering (e.g. trigger a lightweight remediation action after 3 occurrences, and a more aggressive remediation action after 10+ occurrences)
  • Configuring conditional triggers by evaluating incoming events to determine which event handler to use (e.g. notify developers via Mattermost, but send all incidents assigned to operations via PagerDuty using a handler set and corresponding filters)
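To make the idea of an SQE concrete, here is a minimal sketch in plain ES5 JavaScript. The `event` object is a hand-written, heavily trimmed stand-in for a real Sensu event (an assumption for illustration only), and each function body is a single boolean expression of the kind a filter could hold.

```javascript
// Hypothetical, trimmed-down event for illustration only; real Sensu events
// carry many more fields.
var event = {
  check: { status: 2, occurrences: 3, interval: 30 },
  metrics: null
};

// Each function wraps one boolean expression, written in ES5 style.
function isCritical(event) {
  return event.check.status == 2;
}

function isThirdOccurrenceOrLater(event) {
  return event.check.occurrences >= 3;
}

function hasMetricData(event) {
  // True only when a metrics object is attached to the event.
  return event.metrics !== null && event.metrics !== undefined;
}

console.log(isCritical(event));               // true
console.log(isThirdOccurrenceOrLater(event)); // true
console.log(hasMetricData(event));            // false
```

In a real filter, only the expression itself (for example `event.check.status == 2`) would appear in the filter definition; the wrapper functions exist here so the sketch can run outside Sensu.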

Built-in Filters and Helper Functions

Sensu includes built-in event filters and helper functions to customize event pipelines for metrics and alerts.

Built-In Filters

  • is_incident: only processes warning ("status": 1), critical ("status": 2), other (unknown or custom status), and resolution events.
  • not_silenced: prevents processing of events that include the silenced attribute.
  • has_metrics: only processes events containing Sensu Metrics.
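As a rough mental model, the three built-in filters behave like the predicates below. This is a sketch, not Sensu's implementation: the `isResolution` flag, the `check.silenced` array, and the trimmed event shapes are assumptions made for illustration (the backend works out whether an event is a resolution on its own).

```javascript
// Approximate models of the built-in filters (NOT Sensu's actual source).
// `isResolution` is a hypothetical flag standing in for "this healthy event
// follows an incident".
function isIncident(event, isResolution) {
  return event.check.status !== 0 || isResolution === true;
}

// Assumes silencing shows up as a non-empty `check.silenced` array.
function notSilenced(event) {
  var silenced = event.check.silenced;
  return !(silenced && silenced.length > 0);
}

function hasMetrics(event) {
  return event.metrics !== undefined && event.metrics !== null;
}

// Illustrative sample events; all values are made up.
var warning = { check: { status: 1 } };
var healthy = { check: { status: 0 } };
var silencedCritical = { check: { status: 2, silenced: ["entity:i-424242:my-app"] } };
var metricEvent = { check: { status: 0 }, metrics: { points: [] } };

console.log(isIncident(warning, false));    // true
console.log(isIncident(healthy, false));    // false
console.log(isIncident(healthy, true));     // true (resolution)
console.log(notSilenced(silencedCritical)); // false
console.log(hasMetrics(metricEvent));       // true
```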

Helper Functions

  • hour(): a custom SQE function that returns the hour of a UNIX epoch timestamp in UTC and 24-hour time notation (e.g. hour(event.timestamp) >= 17)
  • weekday(): a custom SQE function that returns a number that represents the day of the week of a UNIX epoch timestamp (Sunday is 0; e.g. weekday(event.timestamp) == 0)
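The semantics of the two helpers can be approximated with the standard JavaScript `Date` object. The sketch below is not Sensu's implementation; in particular, computing `weekday()` in UTC is an assumption made here to match the documented behavior of `hour()`.

```javascript
// Sketch of the helper semantics using the standard Date object.
// Input is a UNIX epoch timestamp in seconds, as in event.timestamp.
function hour(timestamp) {
  return new Date(timestamp * 1000).getUTCHours();
}

function weekday(timestamp) {
  // Assumes UTC, to match hour(); Sunday is 0.
  return new Date(timestamp * 1000).getUTCDay();
}

var ts = 1700000000; // 2023-11-14 22:13:20 UTC, a Tuesday

console.log(hour(ts));    // 22
console.log(weekday(ts)); // 2

// An "evenings or Sundays" style expression built from the helpers:
console.log(hour(ts) >= 17 || weekday(ts) == 0); // true
```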

EXERCISE 1: Use a Built-in Filter to Only Alert on Problems

Scenario

You want to reduce the number of alerts going to your chat-ops channel. You'd like to receive only the ones that indicate some kind of problem or possible incident.

Solution

To accomplish this, we'll put a filter in front of the Mattermost handler by adding the built-in is_incident filter to the mattermost handler. This filter only lets the handler process events that have a non-zero status, plus the resolution event that follows an incident.

Steps

Let's use a built-in filter with a handler we configured in Lesson 4.

  1. Modify a handler configuration template to use a built-in filter.

    Let's modify the handler template we created in Lesson 4. Replace the contents of mattermost.yaml with the following:

    ---
    type: Handler
    api_version: core/v2
    metadata:
      name: mattermost
    spec:
      type: pipe
      command: >-
        sensu-slack-handler
        --channel "#alerts"
        --username SensuGo
        --description-template "{{ .Check.Output }}\n\n[namespace:{{.Entity.Namespace}}]"
        --webhook-url ${MATTERMOST_WEBHOOK_URL}
      runtime_assets:
      - sensu/sensu-slack-handler:1.4.0
      timeout: 10
      filters:
      - is_incident
      secrets:
      - name: MATTERMOST_WEBHOOK_URL
        secret: mattermost_webhook_url

    Understanding the YAML:

    • We replaced the filters: [] line with the following:
      filters:
      - is_incident
  2. Update the handler using sensuctl create -f.

    sensuctl create -f mattermost.yaml

    Now verify that the handler configuration was updated by viewing the handler info.

    sensuctl handler info mattermost --format yaml
  3. Configure environment variables.

    Set up the necessary environment variables by running one of the following commands:

    Mac and Linux users (.envrc):

    source .envrc
    env | grep SENSU

    Windows users (.envrc.ps1):

    . .\.envrc.ps1
    Get-ChildItem env: | Out-String -Stream | Select-String -Pattern SENSU

    The output should include the expected values for SENSU_API_URL, SENSU_NAMESPACE, and SENSU_API_KEY.

    NOTE: if you need help creating an API Key, please refer to the Lesson 3 EXERCISE 6: create an API Key for personal use.

  4. Test the filter.

    The is_incident filter will prevent processing of healthy ("status": 0) events, unless they are resolving an incident. Let's send some events to see this behavior in action.

    The following event will be filtered:

    Mac and Linux:

    curl -i -X POST -H "Authorization: Key ${SENSU_API_KEY}" \
         -H "Content-Type: application/json" \
         -d '{"entity":{"metadata":{"name":"i-424242"}},"check":{"metadata":{"name":"my-app"},"interval":30,"status":0,"output":"200 OK","handlers":["mattermost"]}}' \
         "${SENSU_API_URL:-http://127.0.0.1:8080}/api/core/v2/namespaces/${SENSU_NAMESPACE:-default}/events"

    Windows (PowerShell):

    Invoke-RestMethod `
      -Method POST `
      -Headers @{"Authorization" = "Key ${Env:SENSU_API_KEY}";} `
      -ContentType "application/json" `
      -Body '{"entity":{"metadata":{"name":"i-424242"}},"check":{"metadata":{"name":"my-app"},"interval":30,"status":0,"output":"200 OK","handlers":["mattermost"]}}' `
      -Uri "${Env:SENSU_API_URL}/api/core/v2/namespaces/${Env:SENSU_NAMESPACE}/events"

    The following event will be processed:

    Mac and Linux:

    curl -i -X POST -H "Authorization: Key ${SENSU_API_KEY}" \
         -H "Content-Type: application/json" \
         -d '{"entity":{"metadata":{"name":"i-424242"}},"check":{"metadata":{"name":"my-app"},"interval":30,"status":2,"output":"ERROR: failed to connect to database.","handlers":["mattermost"]}}' \
         "${SENSU_API_URL:-http://127.0.0.1:8080}/api/core/v2/namespaces/${SENSU_NAMESPACE:-default}/events"

    Windows (PowerShell):

    Invoke-RestMethod `
      -Method POST `
      -Headers @{"Authorization" = "Key ${Env:SENSU_API_KEY}";} `
      -ContentType "application/json" `
      -Body '{"entity":{"metadata":{"name":"i-424242"}},"check":{"metadata":{"name":"my-app"},"interval":30,"status":2,"output":"ERROR: failed to connect to database.","handlers":["mattermost"]}}' `
      -Uri "${Env:SENSU_API_URL}/api/core/v2/namespaces/${Env:SENSU_NAMESPACE}/events"

    Try running these commands multiple times in different combinations and observing the behavior in your local Mattermost instance.

    The first occurrence of a "status": 0 event following an active incident is treated as a "resolution" event, and will be processed; but subsequent occurrences of the "status": 0 event will be filtered.

    Every occurrence of the "status": 2 event will be processed, but we wouldn't typically want that to happen because it leads to alert fatigue. Let's move on to the next exercise to learn how to modify that behavior.
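The behavior observed in this exercise can be reproduced with a small simulation. This is a simplified model rather than Sensu's implementation: it assumes the check starts out healthy and treats an event as a resolution when it is healthy and the previous event was not.

```javascript
// Simplified model of is_incident over a stream of check statuses.
function processedByIsIncident(statuses) {
  var results = [];
  var previous = 0; // assumption: the check starts out healthy
  for (var i = 0; i < statuses.length; i++) {
    var status = statuses[i];
    // A healthy event right after an unhealthy one counts as a resolution.
    var isResolution = status === 0 && previous !== 0;
    results.push(status !== 0 || isResolution);
    previous = status;
  }
  return results;
}

// healthy, critical, critical, healthy (resolution), healthy
console.log(processedByIsIncident([0, 2, 2, 0, 0]).join(", "));
// false, true, true, true, false
```

Note that both critical events pass the filter, which is the repeated-alert behavior the next exercise addresses.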

NEXT: If you have applied the built-in is_incident filter and observed it working as described above, then you're ready to move on to the next exercise.

EXERCISE 2: Create a Custom Filter to Prevent Repeated Alerts

Scenario

After applying the built-in is_incident filter, you now notice that during incidents you get repeated error messages in chat. You want to reduce the alert fatigue so that you only get one error message when the incident starts, then another when it's over.

Solution

To accomplish this we will write a custom filter using JavaScript. Internally, Sensu maintains a counter on events which tracks how many times the event has been triggered. We can use that in our filter to let only the first instance of the event through to the handler.

Steps

  1. Configure a filter to reduce alert fatigue.

    The backend maintains a series of event counters that are effective for managing alert frequency. These counters include the occurrences counter, and the occurrences_watermark counter. The occurrences property is visible in the event detail output from a sensuctl event info command:

    Mac and Linux

    sensuctl event info i-424242 my-app --format json | grep occurrences

    Windows (PowerShell)

    sensuctl event info i-424242 my-app --format json | Select-String "occurrences"

    Example Output:

    "occurrences": 3,
    "occurrences_watermark": 3,

    Let's create a filter that only processes the first occurrence of an incident, and then again only once every hour.

    Copy the following contents to a file named filter-repeated.yaml:

    ---
    type: EventFilter
    api_version: core/v2
    metadata:
      name: filter-repeated
    spec:
      action: allow
      expressions:
      - event.check.occurrences == 1 || event.check.occurrences % (3600 / event.check.interval) == 0

    NOTE: for more information on this filter expression – specifically including the modulo operator (%) or "remainder" calculation – please visit the sensu/catalog project on GitHub.

  2. Create the filter-repeated filter using sensuctl.

    sensuctl create -f filter-repeated.yaml

    Then verify that the filter was successfully created:

    sensuctl filter list

    Example Output:

              Name         Action                                            Expressions
     ───────────────── ──────── ────────────────────────────────────────────────────────────────────────────────────────────────
      filter-repeated   allow    (event.check.occurrences == 1 || event.check.occurrences % (3600 / event.check.interval) == 0)
    

    Our custom filter-repeated filter is now available to use with handlers!
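To see what the filter-repeated expression does, it helps to work through the arithmetic. With a 30-second interval, `3600 / event.check.interval` is 120 check executions per hour, so the expression is true for occurrence 1 and then for every 120th occurrence. The sketch below wraps the same expression in a function and feeds it hand-made events (the event objects are illustrative only).

```javascript
// The same expression as in filter-repeated.yaml, wrapped in a function.
function filterRepeated(event) {
  return event.check.occurrences == 1 ||
    event.check.occurrences % (3600 / event.check.interval) == 0;
}

// Collect which of the first 250 occurrences would be allowed through
// for a check that runs every 30 seconds.
var allowed = [];
for (var n = 1; n <= 250; n++) {
  if (filterRepeated({ check: { occurrences: n, interval: 30 } })) {
    allowed.push(n);
  }
}

console.log(allowed.join(", ")); // 1, 120, 240
```

One consequence of this arithmetic is that an interval that does not divide 3600 evenly produces a fractional divisor, so the "once per hour" half of the expression will rarely match for such checks.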

NEXT: If you see your filter-repeated filter, you're ready to move on to the next exercise.

EXERCISE 3: Using a Custom Filter in a Handler

Scenario

You just created a custom filter and now you want to update your chat handler to use it.

Solution

Handlers can have multiple filters stacked in order. Combining the built-in is_incident filter with the custom filter-repeated filter we just made results in only the first failure event (and then at most one repeat per hour) showing up in chat. To set this up, we will edit our handler configuration and add filter-repeated to the filters property.

Steps

  1. Modify the Mattermost handler configuration to use a custom filter.

    Let's modify the handler template we created in Lesson 4. Replace the contents of mattermost.yaml with the following:

    ---
    type: Handler
    api_version: core/v2
    metadata:
      name: mattermost
    spec:
      type: pipe
      command: >-
        sensu-slack-handler
        --channel "#alerts"
        --username SensuGo
        --description-template "{{ .Check.Output }}\n\n[namespace:{{.Entity.Namespace}}]"
        --webhook-url ${MATTERMOST_WEBHOOK_URL}
      runtime_assets:
      - sensu/sensu-slack-handler:1.4.0
      timeout: 10
      filters:
      - is_incident
      - filter-repeated
      secrets:
      - name: MATTERMOST_WEBHOOK_URL
        secret: mattermost_webhook_url

    Understanding the YAML:

    • We added filter-repeated to the filters: array.
  2. Update the handler using sensuctl create -f.

    sensuctl create -f mattermost.yaml

    Now verify that the handler configuration was updated by viewing it with sensuctl handler info:

    sensuctl handler info mattermost --format yaml
  3. Test the filter.

    The filter-repeated filter will prevent repeat processing of events (only allowing repeat processing once per hour). Let's send some events to see this behavior in action.

    The following event will be processed (the first occurrence of a critical severity event):

    Mac and Linux:

    curl -i -X POST -H "Authorization: Key ${SENSU_API_KEY}" \
         -H "Content-Type: application/json" \
         -d '{"entity":{"metadata":{"name":"i-424242"}},"check":{"metadata":{"name":"my-api"},"interval":30,"status":2,"output":"ERROR: failed to connect to database.","handlers":["mattermost"]}}' \
         "${SENSU_API_URL:-http://127.0.0.1:8080}/api/core/v2/namespaces/${SENSU_NAMESPACE:-default}/events"

    Windows (PowerShell):

    Invoke-RestMethod `
      -Method POST `
      -Headers @{"Authorization" = "Key ${Env:SENSU_API_KEY}";} `
      -ContentType "application/json" `
      -Body '{"entity":{"metadata":{"name":"i-424242"}},"check":{"metadata":{"name":"my-api"},"interval":30,"status":2,"output":"ERROR: failed to connect to database.","handlers":["mattermost"]}}' `
      -Uri "${Env:SENSU_API_URL}/api/core/v2/namespaces/${Env:SENSU_NAMESPACE}/events"

    The following event will be filtered (the second occurrence of a critical severity event):

    Mac and Linux:

    curl -i -X POST -H "Authorization: Key ${SENSU_API_KEY}" \
         -H "Content-Type: application/json" \
         -d '{"entity":{"metadata":{"name":"i-424242"}},"check":{"metadata":{"name":"my-api"},"interval":30,"status":2,"output":"ERROR: failed to connect to database.","handlers":["mattermost"]}}' \
         "${SENSU_API_URL:-http://127.0.0.1:8080}/api/core/v2/namespaces/${SENSU_NAMESPACE:-default}/events"

    Windows (PowerShell):

    Invoke-RestMethod `
      -Method POST `
      -Headers @{"Authorization" = "Key ${Env:SENSU_API_KEY}";} `
      -ContentType "application/json" `
      -Body '{"entity":{"metadata":{"name":"i-424242"}},"check":{"metadata":{"name":"my-api"},"interval":30,"status":2,"output":"ERROR: failed to connect to database.","handlers":["mattermost"]}}' `
      -Uri "${Env:SENSU_API_URL}/api/core/v2/namespaces/${Env:SENSU_NAMESPACE}/events"

    The following event will be processed (the first occurrence of a recovery event):

    Mac and Linux:

    curl -i -X POST -H "Authorization: Key ${SENSU_API_KEY}" \
         -H "Content-Type: application/json" \
         -d '{"entity":{"metadata":{"name":"i-424242"}},"check":{"metadata":{"name":"my-api"},"interval":30,"status":0,"output":"200 OK","handlers":["mattermost"]}}' \
         "${SENSU_API_URL:-http://127.0.0.1:8080}/api/core/v2/namespaces/${SENSU_NAMESPACE:-default}/events"

    Windows (PowerShell):

    Invoke-RestMethod `
      -Method POST `
      -Headers @{"Authorization" = "Key ${Env:SENSU_API_KEY}";} `
      -ContentType "application/json" `
      -Body '{"entity":{"metadata":{"name":"i-424242"}},"check":{"metadata":{"name":"my-api"},"interval":30,"status":0,"output":"200 OK","handlers":["mattermost"]}}' `
      -Uri "${Env:SENSU_API_URL}/api/core/v2/namespaces/${Env:SENSU_NAMESPACE}/events"

    The following event will be filtered (a repeat occurrence of a healthy event):

    Mac and Linux:

    curl -i -X POST -H "Authorization: Key ${SENSU_API_KEY}" \
         -H "Content-Type: application/json" \
         -d '{"entity":{"metadata":{"name":"i-424242"}},"check":{"metadata":{"name":"my-api"},"interval":30,"status":0,"output":"200 OK","handlers":["mattermost"]}}' \
         "${SENSU_API_URL:-http://127.0.0.1:8080}/api/core/v2/namespaces/${SENSU_NAMESPACE:-default}/events"

    Windows (PowerShell):

    Invoke-RestMethod `
      -Method POST `
      -Headers @{"Authorization" = "Key ${Env:SENSU_API_KEY}";} `
      -ContentType "application/json" `
      -Body '{"entity":{"metadata":{"name":"i-424242"}},"check":{"metadata":{"name":"my-api"},"interval":30,"status":0,"output":"200 OK","handlers":["mattermost"]}}' `
      -Uri "${Env:SENSU_API_URL}/api/core/v2/namespaces/${Env:SENSU_NAMESPACE}/events"

    Try running these commands multiple times in different combinations and observing the behavior. The is_incident filter and an occurrence-based filter like filter-repeated work very well together for reducing alert fatigue.
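The four test events above can be traced with a small simulation of both filters applied together. This is a simplified model under stated assumptions: the check starts out healthy, `occurrences` counts consecutive events with the same status and resets to 1 when the status changes, and an event must pass every filter on the handler to be processed.

```javascript
// Simplified model of is_incident + filter-repeated on one handler.
function simulate(statuses, interval) {
  var outcomes = [];
  var previous = 0;    // assumption: the check starts out healthy
  var occurrences = 0;
  for (var i = 0; i < statuses.length; i++) {
    var status = statuses[i];
    // Consecutive same-status events increment the counter; a change resets it.
    occurrences = (i > 0 && status === previous) ? occurrences + 1 : 1;
    var isResolution = status === 0 && previous !== 0;
    var passesIsIncident = status !== 0 || isResolution;
    var passesFilterRepeated =
      occurrences == 1 || occurrences % (3600 / interval) == 0;
    // The event is handled only if both filters allow it.
    outcomes.push(passesIsIncident && passesFilterRepeated ? "processed" : "filtered");
    previous = status;
  }
  return outcomes;
}

// critical, critical, healthy (recovery), healthy
console.log(simulate([2, 2, 0, 0], 30).join(", "));
// processed, filtered, processed, filtered
```

The recovery event passes filter-repeated because the status change resets the counter to 1, which is why the two filters combine cleanly.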

NEXT: if you have successfully applied your filter and observed it working as described above, then you're ready to move on to the next lesson!

Discussion

In this lesson we learned how to apply filters to control the behavior of handlers. We also learned how to solve complex problems by authoring custom filters using JavaScript expressions.

These examples demonstrate Sensu's flexible filtering system, which allows you to customize how and when events will be processed by the Sensu pipeline.

Use Cases

Event filters provide a real-time detection and analysis engine for the Sensu observability pipeline.

Some example use cases include:

  • Reduce alert fatigue by deduplicating incoming events and limiting repeat processing (e.g. only alert once per hour per incident)
  • Optimize metrics processing by dropping empty events, or sampling metrics to reduce storage costs
  • Orchestrate remediations via occurrence filtering (e.g. trigger a lightweight remediation action after 3 occurrences, and a more aggressive remediation action after 10+ occurrences)
  • Configure conditional triggers by determining which event handler to use (e.g. notify developers via Mattermost, but send all incidents assigned to operations to PagerDuty)
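Two of the use cases above, occurrence-based remediation and conditional routing, can be sketched as predicates. Everything here is hypothetical: the function names, the `team` entity label, and its values are invented for illustration and are not part of the workshop configuration.

```javascript
// Thresholds from the example in the text: a light action at the 3rd
// occurrence, a more aggressive one from the 10th occurrence onward.
function needsLightRemediation(event) {
  return event.check.status !== 0 && event.check.occurrences == 3;
}

function needsAggressiveRemediation(event) {
  return event.check.status !== 0 && event.check.occurrences >= 10;
}

// HYPOTHETICAL: assumes entities carry a "team" label; the label name and
// its values are made up for this sketch.
function assignedTo(event, team) {
  var labels = (event.entity.metadata && event.entity.metadata.labels) || {};
  return labels.team == team;
}

var incident = {
  entity: { metadata: { name: "i-424242", labels: { team: "operations" } } },
  check: { status: 2, occurrences: 10 }
};

console.log(needsLightRemediation(incident));      // false
console.log(needsAggressiveRemediation(incident)); // true
console.log(assignedTo(incident, "operations"));   // true
console.log(assignedTo(incident, "developers"));   // false
```

Each predicate would sit in its own filter, attached to the handler that should act when it is true.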

Filter Execution Environment

The expressions are executed in a sandboxed ECMAScript 5 compatible JavaScript virtual machine called Otto.
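Because the sandbox targets ECMAScript 5, it is safest to write filter logic using ES5 syntax only and to assume that features from later editions (arrow functions, `let`/`const`, template literals) are unavailable. A minimal sketch of the ES5 style, using a made-up status list and hand-written events:

```javascript
// ES5-only style: `var`, function expressions, and ES5 array methods.
var allowedStatuses = [1, 2]; // illustrative: warning and critical

var isAllowedStatus = function (event) {
  return allowedStatuses.indexOf(event.check.status) !== -1;
};

console.log(isAllowedStatus({ check: { status: 2 } })); // true
console.log(isAllowedStatus({ check: { status: 0 } })); // false
```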

Learn More

Next Steps

Share your feedback on Lesson 06

Lesson 7: Introduction to Agents & Entities