Support alerts based on Log analytics queries #3951

Closed
ivanthelad opened this issue Jul 29, 2019 · 24 comments · Fixed by #5053
Comments

@ivanthelad

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Description

Create alerts based on Log Analytics queries, as documented here: https://docs.microsoft.com/en-us/azure/azure-monitor/platform/alerts-log
The corresponding API:
https://docs.microsoft.com/en-us/rest/api/monitor/scheduledqueryrules/createorupdate
Currently it is only possible to create alerts based on Azure Monitor metrics:
https://www.terraform.io/docs/providers/azurerm/r/monitor_metric_alert.html

New or Affected Resource(s)

  • azurerm_monitor

Potential Terraform Configuration

@ivanthelad ivanthelad changed the title Create Alerts based on Log analytics queries Support alerts based on Log analytics queries Jul 29, 2019
@sunnynazar

sunnynazar commented Aug 29, 2019

This is really needed, as Azure does not expose all metrics by default.

It is possible to work around this by using a Log Analytics workspace with customised queries and creating an alert from them.

But Terraform doesn't support creating alerts based on Log Analytics queries.

Please help to prioritise this enhancement in Terraform.

@mcdafydd
Contributor

I've just started working on this. Not sure yet how long it would take to get a PR together. Maybe a couple weeks. Anyone else already taking this on?

@ichwill100

Is there any update on this?

@batizar

batizar commented Nov 21, 2019

@mcdafydd any updates on this? thanks.

@mcdafydd
Contributor

Hi all,

Two months flies by. Sorry about that. I had started working on this and put it down for a bit while #4638 was getting merged. That's done and I'm ready to get back into this now. I'll commit what I have into my fork as soon as I can and try to get the PR created. I'll aim for the end of the week.

@mcdafydd
Contributor

It's not completed yet, but I've made solid progress in my fork. The data source and docs are almost done. I still need to finish the resource and tests. I may not have a PR today, but should have it submitted before Thursday.

@mcdafydd
Contributor

mcdafydd commented Nov 28, 2019

Edit: Removed reference to early working branch. Updated examples below to match the PR submission.

I could use some feedback on the approach. Right now, I've got everything under a single new resource called azurerm_monitor_scheduled_query_rules, matching the API endpoint. The resource definition is getting kinda long as it needs to support two different styles of actions with different required and optional parameters. I also decided to flatten the action, schedule, and source blocks.

I'm wondering if it would be better to break these out into two resources called azurerm_monitor_alerting_action and azurerm_monitor_log_to_metric_action.

An AlertingAction looks like this right now:

resource "azurerm_scheduled_query_rule" "example" {
  name                = format("%s-queryrule", var.prefix)
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name

  action_type = "Alerting"
  azns_action {
    action_group           = []
    email_subject          = "Email Header"
    custom_webhook_payload = "{}"
  }
  data_source_id = azurerm_application_insights.example.id
  description    = "Scheduled query rule Alerting Action example"
  enabled        = true
  # frequency and time_window are in minutes
  # (frequencyInMinutes / timeWindowInMinutes in the REST API)
  frequency   = 5
  query       = "requests | where status_code >= 500 | summarize AggregatedValue = count() by bin(timestamp, 5m)"
  query_type  = "ResultCount"
  severity    = 1
  time_window = 30
  trigger {
    threshold_operator = "GreaterThan"
    threshold          = 3
    metric_trigger {
      operator            = "GreaterThan"
      threshold           = 1
      metric_trigger_type = "Total"
      metric_column       = "timestamp"
    }
  }
}

and a LogToMetricAction plan looks like this right now:

resource "azurerm_scheduled_query_rule" "example3" {
  name                = format("%s-queryrule3", var.prefix)
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name

  action_type = "LogToMetricAction"
  criteria = [{
    metric_name = "Average_% Idle Time"
    dimensions = [{
      name     = "dimension"
      operator = "GreaterThan"
      values   = ["latency"]
    }]
  }]
  data_source_id = azurerm_application_insights.example.id
  description    = "Scheduled query rule LogToMetric example"
  enabled        = true
}
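
For comparison, here is a rough sketch of what the two-resource split mentioned above might look like. Both resource type names are only the ones floated in this comment, not an existing provider API, and the attributes are simply carried over from the draft examples. The sketch also assumes `criteria` and `dimensions` would be nested blocks rather than list attributes, and it reuses `var.prefix`, `azurerm_resource_group.example` and `azurerm_application_insights.example` from the surrounding configuration.

```hcl
# Hypothetical: neither resource type below exists in the provider.
# Each resource only carries the arguments its action type needs,
# so action_type is no longer required as a discriminator.

resource "azurerm_monitor_alerting_action" "example" {
  name                = format("%s-alerting", var.prefix)
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name

  data_source_id = azurerm_application_insights.example.id
  description    = "Alerting action split out into its own resource"
  enabled        = true

  # Same draft attributes as the Alerting example above.
  frequency   = 5
  time_window = 30
  query       = "requests | where status_code >= 500 | summarize AggregatedValue = count() by bin(timestamp, 5m)"
  severity    = 1

  azns_action {
    action_group  = []
    email_subject = "Email Header"
  }

  trigger {
    threshold_operator = "GreaterThan"
    threshold          = 3
  }
}

resource "azurerm_monitor_log_to_metric_action" "example" {
  name                = format("%s-logtometric", var.prefix)
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name

  data_source_id = azurerm_application_insights.example.id
  description    = "LogToMetric action split out into its own resource"
  enabled        = true

  # No schedule, query or trigger arguments here: the draft
  # LogToMetric example above does not use them.
  criteria {
    metric_name = "Average_% Idle Time"
    dimensions {
      name     = "dimension"
      operator = "GreaterThan"
      values   = ["latency"]
    }
  }
}
```

The trade-off is two shorter schemas with fewer conditionally required fields, at the cost of no longer mirroring the single API endpoint one-to-one.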

Are there some common guidelines I could follow for new resources like this?

Thanks!

@sunnynazar

@mcdafydd - will this be released in 1.39.0 ? Thanks.

@mcdafydd
Contributor

🤞 I hope so! That's not my decision to make. I will keep checking the pull request and make sure it stays mergeable and issue-free.

@sunnynazar

Still waiting for this to be released. Any chance of it being added to the 1.40.0 release?

@antempus

antempus commented Jan 6, 2020

Also waiting for this, any updates?

@SIGAN

SIGAN commented Jan 14, 2020

@mcdafydd is there any way to push for a review of your pull request? I'm waiting on this.

@mcdafydd
Contributor

This is only my second PR to terraform, so I wouldn't expect I had a lot of pull on the matter. I'm with ya though, excited to be able to create both alert actions and monitoring queries in code.

@mcdafydd
Contributor

@tombuildsstuff @katbyte do you think 1.42.0 is a realistic release for getting the PR for this feature reviewed? If there's anything I can do in the meantime, I'm of course willing to help. My only slight concern is that if any significant refactoring is required for #5053 before it can be approved, I could definitely use some extra lead time to make sure I can get any requests completed in time for the release.

Thanks for any help!

@stazz

stazz commented Jan 24, 2020

I wonder why this keeps being pushed to later versions, especially since the PR has all checks passed and no conflicts. I guess it's a resourcing issue? :/

The reason I am commenting is that log-analytics-query-based alerts and metrics are among the few things that our current TF automation can't handle. So I am waiting on this keenly. :)

@mcdafydd
Contributor

Thanks @katbyte for doing an initial review! I should be able to submit updates for all the issues by the end of the week.

@DanielFrei64

Looking forward to getting this feature. I'm converting my resources one by one to Terraform and got to alerting today, but alas, scheduled query rules do not exist yet. Thank you mcdafydd for all your work on this.

@mcdafydd
Contributor

We're getting close now, @DanielFrei64. Now that we have complete Action Groups, after Scheduled Query Rules I think a good next step would be Action Rules.

@ecottd

ecottd commented Feb 19, 2020

Also looking forward to getting this feature as well as the action rules to tie everything together. Thanks for working on this.

@johnnyhuy
Contributor

2.0.0 just released, will this be included soon?

@katbyte katbyte added this to the v2.1.0 milestone Mar 4, 2020
@ghost

ghost commented Mar 11, 2020

This has been released in version 2.1.0 of the provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading. As an example:

provider "azurerm" {
  version = "~> 2.1.0"
  # The 2.x provider requires a features block, even if it is empty.
  features {}
}
# ... other configuration ...

@batmanme

Thanks for implementing this. Is there any documentation for the configuration?
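
For readers with the same question, here is a minimal sketch of a log alert rule on provider 2.1.0. It assumes the alerting resource shipped as `azurerm_monitor_scheduled_query_rules_alert`; this thread does not state the final resource name, so check it and the argument names against the provider documentation for your version. All resource names, the location, and the query are illustrative placeholders.

```hcl
provider "azurerm" {
  version = "~> 2.1.0"
  features {}
}

# Placeholder names and location throughout; adjust to your environment.
resource "azurerm_resource_group" "example" {
  name     = "example-monitoring-rg"
  location = "West Europe"
}

resource "azurerm_application_insights" "example" {
  name                = "example-appinsights"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
  application_type    = "web"
}

resource "azurerm_monitor_action_group" "example" {
  name                = "example-actiongroup"
  resource_group_name = azurerm_resource_group.example.name
  short_name          = "exampleact"
}

# Assumed resource type name for the Alerting action; verify in the docs.
resource "azurerm_monitor_scheduled_query_rules_alert" "example" {
  name                = "example-queryrule"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name

  data_source_id = azurerm_application_insights.example.id
  description    = "Alert when failed requests cross the threshold"
  enabled        = true

  # Each row returned is one request with a 5xx result code.
  query = <<-QUERY
    requests
    | where toint(resultCode) >= 500
  QUERY

  severity    = 1
  frequency   = 5  # run the query every 5 minutes
  time_window = 30 # over the last 30 minutes of data

  action {
    action_group  = [azurerm_monitor_action_group.example.id]
    email_subject = "Server errors detected"
  }

  # Fire when the query returns more than 3 rows.
  trigger {
    operator  = "GreaterThan"
    threshold = 3
  }
}
```

Running `terraform plan` against this should show whether the resource type and arguments match the installed provider version before anything is created.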

@ghost

ghost commented Apr 4, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 hashibot-feedback@hashicorp.com. Thanks!

@ghost ghost locked and limited conversation to collaborators Apr 4, 2020