-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ssa: Improve wait error reporting #722
Conversation
5ea7a31
to
1f28d12
Compare
If fail fast is enable, include only the failed resources in the returned error. Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
1f28d12
to
acf4807
Compare
if rs == nil { | ||
errors = append(errors, fmt.Sprintf("can't determine status for %s", utils.FmtObjMetadata(id))) | ||
continue | ||
var errs []string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wdyt think about
var errs []error
for id, rs := range statusCollector.ResourceStatuses {
switch {
case rs == nil || lastStatus[id] == nil:
errs = append(errs, fmt.Errorf("resource %s not found", utils.FmtObjMetadata(id)))
case lastStatus[id].Status == status.FailedStatus:
var builder strings.Builder
builder.WriteString(fmt.Sprintf("%s status: '%s'",
utils.FmtObjMetadata(rs.Identifier), lastStatus[id].Status))
if rs.Error != nil {
builder.WriteString(fmt.Sprintf(": %s", rs.Error))
}
errs = append(errs, errors.New(builder.String()))
...
if len(errs) > 0 {
msg := "failed early due to stalled resources"
if errors.Is(ctx.Err(), context.DeadlineExceeded) {
msg = "timeout waiting for"
}
return fmt.Errorf("%s: %w", msg, errors.Join(errs...))
}
Wrapping the errors
can make it easier to inspect for calling funtions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is more about humans than code, keep in mind that the resulting error ends up in Slack, PagerDuty, etc. Is error wrapping with 100+ errors easier to read and parse than an array?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
e.Error()
will return a string still. The difference is that for Join
. It says The error formats as the concatenation of the strings obtained by calling the Error method of each element of errs, with a newline between each string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok but this will change the format and if people rely on the current array, their parser will fail. I would consider change it in Flux minor release instead of a patch like now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Left a suggestion about simplification of duplicate code in the switch cases. Can be ignored or accepted.
case lastStatus[id].Status == status.FailedStatus: | ||
var builder strings.Builder | ||
builder.WriteString(fmt.Sprintf("%s status: '%s'", | ||
utils.FmtObjMetadata(rs.Identifier), lastStatus[id].Status)) | ||
if rs.Error != nil { | ||
builder.WriteString(fmt.Sprintf(": %s", rs.Error)) | ||
} | ||
if lastStatus[id] == nil { | ||
// this is only nil in the rare case where no status can be determined for the resource at all | ||
errors = append(errors, fmt.Sprintf("%s (unknown status)", utils.FmtObjMetadata(rs.Identifier))) | ||
} else if lastStatus[id].Status != status.CurrentStatus { | ||
var builder strings.Builder | ||
builder.WriteString(fmt.Sprintf("%s status: '%s'", | ||
utils.FmtObjMetadata(rs.Identifier), lastStatus[id].Status)) | ||
if rs.Error != nil { | ||
builder.WriteString(fmt.Sprintf(": %s", rs.Error)) | ||
} | ||
errors = append(errors, builder.String()) | ||
errs = append(errs, builder.String()) | ||
case errors.Is(ctx.Err(), context.DeadlineExceeded) && lastStatus[id].Status != status.CurrentStatus: | ||
var builder strings.Builder | ||
builder.WriteString(fmt.Sprintf("%s status: '%s'", | ||
utils.FmtObjMetadata(rs.Identifier), lastStatus[id].Status)) | ||
if rs.Error != nil { | ||
builder.WriteString(fmt.Sprintf(": %s", rs.Error)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to ignore it but the two cases above seem identical.
Maybe there are two cases to keep the conditionals simple but maybe it'd be simpler to maintain in the long run if the duplicate code can be put together in one case and the case expression can be made a little complex like:
case (lastStatus[id].Status == status.FailedStatus) || (errors.Is(ctx.Err(), context.DeadlineExceeded) && lastStatus[id].Status != status.CurrentStatus):
...
I may be missing something if they aren't identical. Please tell me if they aren't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did like this so it's easier to read, failure vs timeout.
If the context deadline has not been reached, include only the failed resources in the returned error. This makes the health check error message concise, instead of listing potentially hundreds of resources in progress, we only list the ones that actually failed.
Part of: fluxcd/flux2#4556
Fixes: #715
Supersedes: #717