Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add StickyTaskQueueDrainTimeout #2019

Merged
merged 3 commits into from
Mar 28, 2024

Conversation

Quinn-With-Two-Ns
Copy link
Contributor

@Quinn-With-Two-Ns Quinn-With-Two-Ns commented Mar 26, 2024

Add an option to drain sticky task queue during graceful shutdown. Worker will stop polling on the normal task queue and only poll on the stick task queue to complete workflows while the worker is shutting down.

There is a lot of follow up work we could do around here based on user demand and if the server adds some APIs. For know this is the minimal implementation to avoid sticky workflow timeout for short workflow during graceful shutdown.

Add an option to drain sticky task queue
during graceful shutdown.
@Quinn-With-Two-Ns Quinn-With-Two-Ns marked this pull request as ready for review March 27, 2024 15:24
@Quinn-With-Two-Ns Quinn-With-Two-Ns requested a review from a team as a code owner March 27, 2024 15:24
@@ -77,6 +77,23 @@ public CompletableFuture<Void> waitForSemaphorePermitsReleaseUntimed(
return future;
}

/**
* waitForStickyQueueBalancer -&gt; disableNormalPoll -&gt; timed wait for graceful completion of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fucking Javadoc lol

* shutting down the worker. If set the worker will stop making new poll requests on the normal
* task queue, but will continue to poll the sticky task queue until the timeout is reached.
* This value should always be greater than clients rpc long poll timeout, which can be set via
* {@link WorkflowServiceStubsOptions.Builder#setRpcLongPollTimeout(Duration)}.
Copy link
Member

@cretz cretz Mar 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I wonder if it'd make more sense to make this a boolean for waitForStickyEmptyPoll, because what if you have a time race? For instance, what if my long poll timeout and my sticky drain timeout is 1m. And as soon as the poll comes back empty I haven't hit sticky drain timeout exactly yet, so I make another sticky poll for just some nanoseconds before sticky drain timeout does hit? Maybe that's ok though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That poll would be cancelled, there is a possible race, but the race also exists for normal shutdown as well and I don't think we should tackle it here.

Copy link
Member

@cretz cretz Mar 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Would like to see us not kill poll calls anywhere and always wait for graceful completion, but yes can be tackled elsewhere. I wonder if it's worth an issue on that to ensure SDK behavior wrt poll interruption.

@Quinn-With-Two-Ns Quinn-With-Two-Ns merged commit 0c6c566 into temporalio:master Mar 28, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants