Make alert messages more human-readable #75
Merged
The webhook alert should contain a short helpful message explaining why an error is caused by the destination setup. In other snowplow loaders we get the message simply by serializing the Exception. But in Lake Loader I found the exception messages to be very messy.

In a related problem, for Hudi setup errors I needed to traverse the Exception's `getCause` in order to check if it was a setup error.

This PR takes more explicit control of setting short friendly error messages, and traversing the `getCause` to get all relevant messages.

E.g. an alert message before this change:

> Failed to create events table: s3a://<REDACTED/events/_delta_log: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by V1ToV2AwsCredentialProviderAdapter : software.amazon.awssdk.services.sts.model.StsException: User: arn:aws:iam::<REDACTED>:user/<REDACTED> is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::<REDACTED>:role/<REDACTED> (Service: Sts, Status Code: 403, Request ID: 00000000-0000-0000-0000-000000000000)

The corresponding alert after this change:

> Failed to create events table: s3a://<REDACTED/events/_delta_log: Failed to initialize AWS access credentials: Missing permissions to assume the AWS IAM role

**Other small changes I snuck into this commit:**

- Added specific webhook alerts for Hudi.
- Removed the AssumedRoleCredentialsProvider for aws sdk v1. This is no longer needed now that Hadoop is fully using aws sdk v2.
- Fixed minor bug with retrying creating a database in Hudi Writer
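For readers who want a concrete picture of the `getCause` traversal described in this PR, here is a minimal Java sketch. It is not the code from this PR: the class `AlertMessages`, its methods `friendly`, `messages` and `alert`, and the substring used to recognise the STS error are all hypothetical names and assumptions chosen for illustration. The friendly text is copied from the before/after example in the description.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not the Lake Loader's actual code.
final class AlertMessages {

    private AlertMessages() {}

    // Maps one throwable to a short message.
    // The matched substring below is an assumption based on the example alert
    // in the PR description; a real implementation would likely inspect
    // exception types rather than message text.
    static String friendly(Throwable t) {
        String msg = t.getMessage();
        if (msg == null || msg.trim().isEmpty()) {
            // No usable message: fall back to the exception's class name.
            return t.getClass().getSimpleName();
        }
        if (msg.contains("is not authorized to perform: sts:AssumeRole")) {
            return "Missing permissions to assume the AWS IAM role";
        }
        return msg;
    }

    // Walks the getCause chain from the outermost exception inwards,
    // collecting one short message per throwable.
    static List<String> messages(Throwable top) {
        List<String> out = new ArrayList<>();
        // Identity-based set guards against a cyclic cause chain.
        Set<Throwable> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        for (Throwable t = top; t != null && seen.add(t); t = t.getCause()) {
            String m = friendly(t);
            // Skip a message that exactly repeats the previous one.
            if (out.isEmpty() || !out.get(out.size() - 1).equals(m)) {
                out.add(m);
            }
        }
        return out;
    }

    // Joins a context prefix and the collected messages into one alert string.
    static String alert(String prefix, Throwable top) {
        return prefix + ": " + String.join(": ", messages(top));
    }
}
```

Calling `AlertMessages.alert("Failed to create events table", e)` on an exception whose cause carries the STS "not authorized" message would yield a colon-separated string in the same shape as the "after" alert above. Matching on message text is fragile and is used here only to keep the sketch free of AWS SDK dependencies.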
istreeter force-pushed the cleaner-alert-message branch from b7f9565 to 9b57410 (August 5, 2024 09:45)
oguzhanunlu approved these changes (Aug 5, 2024)
👍 thanks @istreeter !
zhaow-de added a commit to alloy-ch/rcplus-alloy-snowplow-lake-loader that referenced this pull request (Oct 4, 2024)
…patch-for-alloy

* commit '7ab2edc3fd4d81ffb4d5f3285d02330def7672b1':
  - Upgrade common-streams to 0.8.0-M5
  - Delete files asynchronously (snowplow-incubator#82)
  - Upgrade common-streams 0.8.0-M4 (snowplow-incubator#81)
  - Avoid error on duplicate view name (snowplow-incubator#80)
  - Add option to exit on missing Iglu schemas (snowplow-incubator#79)
  - common-streams 0.8.x with refactored health monitoring (snowplow-incubator#78)
  - Create table concurrently with subscribing to stream of events (snowplow-incubator#77)
  - Iceberg fail fast if missing permissions on the catalog (snowplow-incubator#76)
  - Make alert messages more human-readable (snowplow-incubator#75)
  - Hudi loader should fail early if missing permissions on Glue catalog (snowplow-incubator#72)
  - Add alert & retry for delta/s3 initialization (snowplow-incubator#74)
  - Implement alerting and retrying mechanisms
  - Bump aws-hudi to 1.0.0-beta2 (snowplow-incubator#71)
  - Bump hudi to 0.15.0 (snowplow-incubator#70)
  - Allow disregarding Iglu field's nullability when creating output columns (snowplow-incubator#66)
  - Extend health probe to report unhealthy on more error scenarios (snowplow-incubator#69)
  - Fix bad rows resizing (snowplow-incubator#68)
oguzhanunlu pushed a commit that referenced this pull request (Nov 1, 2024)