-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(cloudwatch): set threshold metric id for anomaly detection band alarms #10545
feat(cloudwatch): set threshold metric id for anomaly detection band alarms #10545
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rrhodes thanks for submitting this!
See my comments in the code. This seems more like a feature than a fix? If that's the case can you add a README section? It will be easier to discuss the API
@@ -162,6 +168,9 @@ export class Alarm extends AlarmBase { | |||
extendedStatistic: renderIfExtendedStatistic(props.statistic), | |||
}); | |||
} | |||
if (ANOMALY_DETECTION_COMPARISON_OPERATORS.includes(comparisonOperator) && !props.thresholdMetricId) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't this break existing customers using ant of the comparison operators in ANOMALY_DETECTION_COMPARISON_OPERATORS
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following issue #10540, my understanding is that customers could not apply any of these anomaly detection comparison operators without a threshold metric ID. This leads me to believe it would not break any existing customer functionality? If I've misunderstood the problem I'll be happy to revisit this.
* | ||
* @default - Not configured. | ||
*/ | ||
readonly thresholdMetricId?: string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is somewhat under modeled, can this be Metric
? Which gets me to wonder if this is the right API. How will this look with metric.CreateAlarm()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could explore a ‘thresholdMetric’ property instead of type Metric which under the hood would extract the ID and assign it to ‘thresholdMetricId’ in the CloudFormation construct, if that would be preferred?
I thought it would be best to keep this similar to the CloudFormation implementation, which just expects the ID.
I've amended metric.createAlarm()
to pull the metric threshold ID from CreateAlarmOptions
, so it would look like
createAlarm(this, 'Alarm', {
...,
thresholdMetricId: 'ad1',
})
@NetaNir, I see the argument both ways for feature vs fix. We're adding a new property, which resolves a bug. I chose to reference this as a fix since it was motivated by issue #10540. If you'd still prefer me to change this to a feature branch then let me know and I'll be happy to change that. |
@rrhodes There are a lot of places we change the CloudFormation implementation in favor of a more ergonomic API, this is especially true for the ThresholdMetricId should not be set unless list of Metrics is also set If Metrics list must contain exactly one metric matching the ThresholdMetricId parameter Since the CDK does not emits
Let me know what you think. Happy to flush out the design together. |
Thanks for the feedback, @NetaNir. I'll revisit this tomorrow and come back to you with a proposal - happy to work on this together to reach the best solution. |
@NetaNir, thanks again for your feedback about My current understanding is that each metric for I'd like to hear your thoughts on this approach before I proceed any further. Would you like me to define a new integration test to cover any of this logic or are you happy with the additional unit tests on their own? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Were you able to deploy this change? Please add an integ test so we can make sure the implementation works.
If you can add a code sample to the README section we can discuss the API there
@@ -158,6 +158,11 @@ The most important properties to set while creating an Alarms are: | |||
- `evaluationPeriods`: how many consecutive periods the metric has to be | |||
breaching the the threshold for the alarm to trigger. | |||
|
|||
If you create an alarm for a `MathExpression` metric which applies the CloudWatch | |||
[anomaly detection band](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch_Anomaly_Detection.html) | |||
function, a threshold metric ID will be automatically assigned to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is implementation details and should no be in the README
@@ -163,6 +169,10 @@ export class Alarm extends AlarmBase { | |||
}); | |||
} | |||
|
|||
if (metricProps.thresholdMetricId && !ANOMALY_DETECTION_COMPARISON_OPERATORS.includes(comparisonOperator)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this still here? We should not expose thresholdMetricId
as it can be used only on very specific conditions
Period: 300, | ||
Stat: 'Average', | ||
}, | ||
ReturnData: false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will fail on deployment. Anomaly detection based alarm requires the all metric in Metrics
to have ReturnData
set to true.
test.done(); | ||
}, | ||
|
||
'alarm for math expression with anomaly detection function cannot use default comparison operator'(test: Test) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to not allow it in the API.
label: conf.renderingProperties?.label, | ||
returnData: entry.tag ? undefined : false, // Tag stores "primary" attribute, default is "true" | ||
}; | ||
}, | ||
withExpression(expr, conf) { | ||
return { | ||
expression: expr.expression, | ||
id: entry.id || uniqueMetricId(), | ||
id: entry.id, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why was this removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not removed, but relocated above so I could apply this in anomalyDetectionMetricIds
Thanks for the feedback, @NetaNir. I will revisit this later today with a Readme sample to discuss further. |
[anomaly detection](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch_Anomaly_Detection.html): | ||
|
||
```ts | ||
const invocationAnomalies = new MathExpression({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@NetaNir This is how I envisioned the new feature would work with MathExpression
. This new feature appears to be considerably more work than I anticipated when I first picked it up. If it's a priority for you and / or your colleagues, I'm happy to hand this over, otherwise with your advice I can proceed. Let me know your thoughts before I make any further developments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The latest version of the API looks ok. Since this feature is non trivial, please add an integration test so we can verify the deployment succeed.
0318c98
to
43bed7a
Compare
43bed7a
to
2339520
Compare
Sorry to pester. Is there any ETA on when this PR will be merged? Would be awesome to see this in master! |
Hi @NetaNir, any further feedback regarding this PR? |
You didn't te request a review, so it didn't show up in my todo. I'll have a look next week |
Apologies, I thought re-opening this from draft to "ready for review" would deliver a notification. Thanks in advance for next week's review, much appreciated. |
Bump 😉 |
Hi @NetaNir, any additional feedback on this PR? Will investigate build failure tomorrow. |
…-anomaly-detection-alarms
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
@@ -140,7 +140,7 @@ export class Alarm extends AlarmBase { | |||
/** | |||
* This metric as an annotation | |||
*/ | |||
private readonly annotation: HorizontalAnnotation; | |||
private readonly annotation?: HorizontalAnnotation; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this made optional?
*/ | ||
readonly threshold: number; | ||
readonly threshold?: number; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize that anomaly detection based alarms does not allow to set threshold
and requires setting thresholdMetricId
instead.
However, making this property optional sacrifice the common use case ergonomics, in favor of the anomaly detection use case. Making this property optional means that we will need to add a synth time check that verifies: If this is an anomaly detection based alarm and threshold is set then throw an error . While it can be done, it makes the developer experience not so great.
Since there are several such subtleties in the anomaly detection alarm (e.g the allowed operators, annotation, returnData
), it probably makes more sense to implement a createAnomalyDetectionAlarm
.
I agree that this feature seemed trivial initially but it has many pitfalls, as such it will require more design work. Before we do another revision lets discuss the API in details, feel free to comment on this PR or create a new issue
Adding support for the ThresholdMetricId CloudWatch alarm property. This is required for alarms based on metrics applying the CloudWatch
ANOMALY_DETECTION_BAND
function.Fixes #10540
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license