Remove erroneous status reporting #42435
This commit removes redundant status reporting from the Filestream input. `inp.readFromSource` can only return the error from the canceler, and this error should not be reported to the manager/Elastic-Agent. `inp.readFromSource` is called by `filestream.Run`, which is called by the `startHarvester` function. `startHarvester` already reports the error returned by `filestream.Run` and correctly filters out 'context cancelled' errors.
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
// there is no point in reporting it to the Manager (aka Elastic-Agent).
// Also, the caller of Run, will correctly report the error and filter
// out 'context cancelled'.
return inp.readFromSource(ctx, log, r, fs.newPath, state, publisher, metrics)
That is the case today, but what if someone changes the logic inside `readFromSource` to return another kind of error? This promise would then become invalid and could cause silent failures. If you mean that `Run`, further up the chain, already has logic to report the errors, then that is fine and you can disregard my comment.
> If you mean that Run up the chain has logic to report the errors then it is fine, then disregard my comment.
Yes, that's what I meant.
However, if it wasn't clear when you read the comment, that means it can be improved ;)
Do you have any suggestions on how I can make it clearer, so future changes won't fall into the trap of trying to report it?
Should I go with something simpler, like:
The caller of Run already reports the error and filters out errors that must not be reported, like 'context cancelled'.
Thanks, I like your suggestion, it is both clearer and shorter.
`UpdateStatus` could probably just unconditionally filter out `context.Canceled` and the Beats context equivalent. `context.Canceled` is not an actionable user error; it's something that would only ever arise because of a bug. We could perhaps log it but not change the agent state.
I agree that if the error is `context.Canceled`, it's coming from a bug. However, it could also break or stop something essential (like an input), and if that affects the data ingestion or the overall behaviour of the Elastic-Agent, it should be reported to the user so they know something is wrong.
Ideally there would be no bug, but if I have to choose between a silent bug and a verbose/noisy one, I'll take the noisy one, it's gonna be much easier to catch and less likely to make it into a final release.
This commit removes redundant status reporting from the Filestream input. `inp.readFromSource` can only return the error from the canceler, and this error should not be reported to the manager/Elastic-Agent. `inp.readFromSource` is called by `filestream.Run`, which is called by the `startHarvester` function. `startHarvester` already reports the error returned by `filestream.Run` and correctly filters out 'context cancelled' errors. (cherry picked from commit 1a0a732) Co-authored-by: Tiago Queiroz <tiago.queiroz@elastic.co>
Needs to go in 8.18
## Proposed commit message

This commit removes an erroneous status reporting from the Filestream input. `inp.readFromSource` can only return the error from the canceler, and this error should not be reported to the manager/Elastic-Agent. `inp.readFromSource` is called by `filestream.Run`, which is called by the `startHarvester` function. `startHarvester` already reports the error returned by `filestream.Run` and correctly filters out 'context cancelled' errors.

## Checklist

- I have made corresponding changes to the documentation
- I have made corresponding change to the default configuration files
- I have added tests that prove my fix is effective or that my feature works
- `CHANGELOG.next.asciidoc` or `CHANGELOG-developer.next.asciidoc`.

## Disruptive User Impact

## Author's Checklist

## How to test this PR locally

This is a tricky PR to test because there is a timing issue involved. Essentially: using a local Kubernetes cluster, deploy Elastic-Agent `v8.17.1` collecting logs, then make the containers generate so many logs that the host machine's CPU reaches 100%. Without the changes in this PR, the Filestream input will start reporting unhealthy.

There are some more details of how I reproduced/tested it here: elastic/elastic-agent#6596 (comment)

## Related issues

- 8.17.1 elastic-agent#6596

## Use cases

## Screenshots

## Logs