Fix infinite loop on fast TCP disconnection #557

Merged: 1 commit, Feb 13, 2020

Conversation


@galkinvv galkinvv commented Feb 9, 2020

Commit a841b28 changed the condition for removing a job from processing.
The new flag MultiplexerJobStatus::continue_servicing is now used
instead of checking the job pointer for NULL.
However, when TCPSocket::newJob() returns nullptr
the behaviour changed: previously the job was removed, but after the change
it is run again, since a MultiplexerJobStatus equal to {true, nullptr}
means "run this job again".

This leads to the problem of barrier eating CPU and RAM on Linux:
#470

There is a similar problem on Windows, but I am not sure it is related:
#552

Since the goal of a841b28 appears to have been only to clarify
object ownership, not to change job deletion behaviour,
this commit tries to restore the original behaviour and fix the bugs above
by returning {false, nullptr} instead of {true, nullptr}
when TCPSocket::newJob() returns nullptr.
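
For illustration only, here is a minimal, self-contained model of the status semantics described above. It is not barrier's actual code: `Job`, `JobStatus`, `DisconnectedJob`, and `service` are hypothetical stand-ins, and the `max_runs` cap exists only so the unfixed case terminates in a test.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct JobStatus;

// Hypothetical stand-in for a multiplexer job.
struct Job {
    virtual ~Job() = default;
    virtual JobStatus run() = 0;
};

// Simplified model of MultiplexerJobStatus: a flag plus an optional new job.
struct JobStatus {
    bool continue_servicing;
    std::unique_ptr<Job> new_job;
};

// Models a socket job whose newJob() equivalent always yields nullptr,
// as happens after a fast disconnection.
struct DisconnectedJob : Job {
    bool fixed;
    explicit DisconnectedJob(bool use_fix) : fixed(use_fix) {}

    JobStatus run() override {
        if (fixed) {
            return {false, nullptr};  // after this PR: stop servicing the job
        }
        return {true, nullptr};       // after a841b28: "run this job again"
    }
};

// Services one job the way the description says the status is interpreted.
// Returns how many times a job was run, capped at max_runs.
int service(std::unique_ptr<Job> job, int max_runs) {
    assert(max_runs > 0);
    int runs = 0;
    while (job && runs < max_runs) {
        JobStatus status = job->run();
        ++runs;
        if (!status.continue_servicing) {
            job.reset();                      // remove the job from processing
        } else if (status.new_job) {
            job = std::move(status.new_job);  // switch to the replacement job
        }
        // {true, nullptr}: keep the same job and run it again
    }
    return runs;
}
```

In this model, `service(std::make_unique<DisconnectedJob>(false), 1000)` keeps rerunning the job until it hits the cap, while the same call with `true` runs the job once and removes it.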

@galkinvv
Author

galkinvv commented Feb 9, 2020

This should fix #470. However, the issue was rarely reproducible for me, so more testing may be needed.
@p12tic - since this PR partially reverts a logic change introduced in a841b28, I have a question to double-check (I don't have a complete understanding of barrier's logic).

In a841b28, the result when newJob() returns nullptr changed from "no new job" to {true, nullptr} ("retry the current job"). Was that an unintentional side effect, or an explicit goal to fix some behaviour?

@p12tic
Member

p12tic commented Feb 13, 2020

@galkinvv: This is a really good analysis. I think this issue was indeed my oversight. I would have made a separate commit for any change in semantics.

@p12tic
Member

p12tic commented Feb 13, 2020

@AdrianKoshka: I think you can merge this one.
