Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix race condition on error returned after context cancellation #12743

Merged
merged 1 commit into from
Jul 10, 2019

Conversation

jsoriano
Copy link
Member

@jsoriano jsoriano commented Jul 1, 2019

After #11981 docker client watching event uses the context provided by
the metricset wrapper. This produces a race condition on error
reporting: the errors channel will receive a context cancelled error when
the context is done, so both paths can be chosen by the select. If the
errors one is chosen, an error will be reported and a reconnect will be
attempted, that will fail.

Alternativelly we could have created another context and cancel it after
the reporter is done, in a similar fashion to what was done before #11981,
but then we would be breaking the chain of derived contexts.

Fixing it this way at least by now to avoid the flaking errors that this
is causing.

After elastic#11981 docker client watching event uses the context provided by
the metricset wrapper. This produces a race condition on error
reporting: the errors channel will receive a context cancelled error when
the context is done, so both paths can be chosen by the select. If the
errors one is chosen, an error will be reported and a reconnect will be
attempted, that will fail.

Alternativelly we could have created another context and cancel it after
the reporter is done, in a similar fashion to what was done before elastic#11981,
but then we would be breaking the chain of derived contexts.
@jsoriano jsoriano added module review Metricbeat Metricbeat containers Related to containers use case flaky-test Unstable or unreliable test cases. [zube]: In Review Team:Integrations Label for the Integrations team labels Jul 1, 2019
@jsoriano jsoriano requested a review from a team as a code owner July 1, 2019 16:53
@jsoriano jsoriano self-assigned this Jul 1, 2019
Copy link
Contributor

@exekias exekias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may need a backport?

@jsoriano jsoriano merged commit d36878a into elastic:master Jul 10, 2019
@jsoriano
Copy link
Member Author

This may need a backport?

Ok, I will backport this

jsoriano added a commit to jsoriano/beats that referenced this pull request Jul 10, 2019
…tic#12743)

After elastic#11981 docker client watching event uses the context provided by
the metricset wrapper. This produces a race condition on error
reporting: the errors channel will receive a context cancelled error when
the context is done, so both paths can be chosen by the select. If the
errors one is chosen, an error will be reported and a reconnect will be
attempted, that will fail.

Alternativelly we could have created another context and cancel it after
the reporter is done, in a similar fashion to what was done before elastic#11981,
but then we would be breaking the chain of derived contexts.

(cherry picked from commit d36878a)
jsoriano added a commit that referenced this pull request Jul 11, 2019
…) (#12849)

After #11981 docker client watching event uses the context provided by
the metricset wrapper. This produces a race condition on error
reporting: the errors channel will receive a context cancelled error when
the context is done, so both paths can be chosen by the select. If the
errors one is chosen, an error will be reported and a reconnect will be
attempted, that will fail.

Alternativelly we could have created another context and cancel it after
the reporter is done, in a similar fashion to what was done before #11981,
but then we would be breaking the chain of derived contexts.

(cherry picked from commit d36878a)
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
…tic#12743) (elastic#12849)

After elastic#11981 docker client watching event uses the context provided by
the metricset wrapper. This produces a race condition on error
reporting: the errors channel will receive a context cancelled error when
the context is done, so both paths can be chosen by the select. If the
errors one is chosen, an error will be reported and a reconnect will be
attempted, that will fail.

Alternativelly we could have created another context and cancel it after
the reporter is done, in a similar fashion to what was done before elastic#11981,
but then we would be breaking the chain of derived contexts.

(cherry picked from commit fcedb16)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
containers Related to containers use case flaky-test Unstable or unreliable test cases. Metricbeat Metricbeat module review Team:Integrations Label for the Integrations team v7.3.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants