Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert: allow users to disable broker heartbeats by not providing a timeout (#2097, #2016) #2104

Merged
merged 1 commit into from
Sep 5, 2024

Conversation

FrankK-1234
Copy link
Contributor

Hello,

In issue #2097 it is mentioned that #1997, #1998, #2016 caused heartbeats not to be sent any more if timeout is None or 0. I'm not sure if we want to revert the change from @smart-programmer but this pull request will do so. Perhaps @auvipy can give his opinion. I reverted the change and changed the test case to expect the opposite old behavior which is to send heartbeats even if timeout is None or 0.

Please review and decide if revert is wanted or not.

With kind regards,

Frank

@auvipy auvipy self-requested a review September 5, 2024 12:17
Copy link
Member

@auvipy auvipy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you please share more detail explanations please?

@FrankK-1234
Copy link
Contributor Author

FrankK-1234 commented Sep 5, 2024

Hello @auvipy,

Let me try and recap the changes. The original request #1997 change was to prevent heartbeats when timeout was None or zero. This was build by @smart-programmer and accepted by you after I've added some test (#2016). However the claim in issue #2097 is that this changed the behaviour of the interface (@vitorquintanilha). The heartbeats will not be send anymore if timeout is None or zero. After looking more closely I agree with the changed behaviour. Previously it always called conn.heartbeat_check and now only if timeout is set. For me the function name is perhaps confusing because it is not checking for hearbeats but actually sending them.

So the original code always called conn.heartbeat_check upon a timeout:

              except socket.timeout:
                    conn.heartbeat_check()
                    elapsed += safety_interval
                    if timeout and elapsed >= timeout:
                        raise

While the new code only called it when the timeout paramter was set and non zero:

                except socket.timeout:
                    if timeout:
                        conn.heartbeat_check()
                        elapsed += safety_interval
                        if elapsed >= timeout:
                            raise

This changed the behaviour for users which rely on heartbeat being send but who do not (want) to specify a timeout and thus broke their code after an upgrade. The workaround I suggested is to use a high timeout value. However I think we should consider reverting the change, which this pull request does.

I did not remove the test cases but adapted them to reflect the 'wanted' behaviour.

Is this enough explanation to decide what to do (not sure what to add more)?

Thanks in advance,

Frank

@FrankK-1234 FrankK-1234 requested a review from auvipy September 5, 2024 13:33
@auvipy
Copy link
Member

auvipy commented Sep 5, 2024

yes make sense!

@auvipy
Copy link
Member

auvipy commented Sep 5, 2024

should we reopen #1997? or this PR fix both?

@auvipy auvipy merged commit 33490ad into celery:main Sep 5, 2024
16 checks passed
@FrankK-1234
Copy link
Contributor Author

should we reopen #1997? or this PR fix both?

This PR does not fix #1997, not sure what to do with the PR the proposed solution is incorrect. The original problem, or not so nice behaviour in logging, remains. Is it worth the hassle?

@vitorquintanilha
Copy link

Thanks everyone for your help on this topic :)

@FrankK-1234 FrankK-1234 deleted the pr2097 branch September 5, 2024 15:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants