Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added sync option to job configuration to run job synchronously / immediately #112

Closed
wants to merge 1 commit into from

Conversation

noahfpf
Copy link
Contributor

@noahfpf noahfpf commented Jul 10, 2024

There might be a case when you want some jobs to complete immediately after a call to Workflow#start!. For example, some jobs might complete very quickly and it can be helpful to show the results of those jobs in the response to some web request that started the workflow. Any job with the sync: true option is not sent to the queuing adapter but directly executed by blocking the execution of others until it’s finished.

…mmediately

There might be a case when you want some jobs to complete immediately after a
call to `Workflow#start!`. For example, some jobs might complete very quickly
and it can be helpful to show the results of those jobs in the response to some
web request that started the workflow. Any job with the `sync: true` option is
not sent to the queuing adapter but directly executed by blocking the execution
of others until it’s finished.
@pokonski
Copy link
Contributor

pokonski commented Jul 15, 2024

Hi, @noahfpf!

  1. How will this work if a sync job has dependencies? For example somewhere in the middle of a job graph
  2. What if sync job IS a dependency to an async job ?

I think instead of introducing this slippery slope of sync jobs one could simply do required setup inside configure method of the workflow, without the "job" concept

@noahfpf
Copy link
Contributor Author

noahfpf commented Jul 15, 2024

The only functional change is that Client#enqueue_job calls perform_now instead of perform_later, so in both of these cases dependencies work as they did before:

How will this work if a sync job has dependencies? For example somewhere in the middle of a job graph

If a sync job has dependencies, it'll enqueue them just like an async job would.

What if sync job IS a dependency to an async job ?

If a sync job is a dependency, it'll be run in the same ActiveJob worker instance as its parent.

I think instead of introducing this slippery slope of sync jobs one could simply do required setup inside configure method of the workflow, without the "job" concept

For my organization, the utility to this concept is that a Job class is a reusable block (e.g. we have an "Add Tag Job"). Sometimes its helpful to call that encapsulated logic immediately so that when a request completes the response can show the results of the job. Sometimes the same work can be done asynchronously.

But I certainly see that this feature has limited utility to others and might not be worth merging into Gush.

Maybe an interesting alternate approach would be to allow users of Gush to swap out the Gush::Worker call within Client#enqueue_job more easily? That would allow more flexible use of ActiveJob's API, e.g. would allow use of wait_until and priority params of ActiveJob::Core::ClassMethods.set?

@pokonski
Copy link
Contributor

pokonski commented Jul 17, 2024

Thank you for detailed explanation!

Maybe an interesting alternate approach would be to allow users of Gush to swap out the Gush::Worker call within Client#enqueue_job more easily?

I agree, this would be a more flexible way for users to tweak the behaviour without the need to add a passthrough for every possible ActiveJob parameter. Maybe a simple overrideable method for enqueuing that we can document ?

@natemontgomery
Copy link

I think the synchronous execution idea has some great use cases!

I agree with @pokonski, your last suggestions about using something to 'swap out' the Worker calls could have some nice outcomes. An overridable method seems very reasonable, or maybe a proc/lambda to encapsulate what ends up needing to run when we call #enqueue_job? There are some patterns around using case statements with objects that implement #call that we could use in that approach. Your point about being able to leverage more of the ActiveJob API is a really strong one, I think that idea will motivate a good solution no matter the approach.

@noahfpf
Copy link
Contributor Author

noahfpf commented Jul 22, 2024

@pokonski and @natemontgomery, here's a suggestion for how to allow Gush users to have more control over the use of Gush::Worker. It moves the call into Gush::Job since:

  1. The job already contains information about how the worker should be enqueued (e.g. queue and wait data).
  2. The job class is one that users are already subclassing, and so it's very easy to provide an alternative implementation (e.g. using perform_now based on a sync param).
diff --git a/lib/gush/client.rb b/lib/gush/client.rb
index c8e36d6..d866cd7 100644
--- a/lib/gush/client.rb
+++ b/lib/gush/client.rb
@@ -155,14 +155,9 @@ module Gush
     def enqueue_job(workflow_id, job)
       job.enqueue!
       persist_job(workflow_id, job)
-      queue = job.queue || configuration.namespace
-      wait = job.wait
 
-      if wait.present?
-        Gush::Worker.set(queue: queue, wait: wait).perform_later(*[workflow_id, job.name])
-      else
-        Gush::Worker.set(queue: queue).perform_later(*[workflow_id, job.name])
-      end
+      options = { queue: configuration.namespace }.merge(job.worker_options)
+      job.enqueue_worker!(**options)
     end
 
     private
diff --git a/lib/gush/job.rb b/lib/gush/job.rb
index e432648..669f0df 100644
--- a/lib/gush/job.rb
+++ b/lib/gush/job.rb
@@ -59,6 +59,14 @@ module Gush
       @failed_at = nil
     end
 
+    def enqueue_worker!(**options)
+      Gush::Worker.set(**options).perform_later(workflow_id, name)
+    end
+
+    def worker_options
+      { queue: queue, wait: wait }.compact
+    end
+
     def finish!
       @finished_at = current_timestamp
     end

What do you think of this approach? If it seems right, I'll submit a new PR (with specs).

The alternative approach I considered is:

  1. Allow a subclass of Gush::Client to be specified in Gush.configure
  2. Use that subclass wherever the code currently uses Gush::Client
  3. Split up Client#enqueue_job so that just the call to Gush::Worker.set(...).perform_later(...) can be overridden.

@pokonski
Copy link
Contributor

pokonski commented Jul 22, 2024

I think the diff you showed looks great, super simple and easy to override per-job! Bonus points for getting rid of the unnecessary conditional 😁

@noahfpf
Copy link
Contributor Author

noahfpf commented Jul 22, 2024

Thanks for the quick feedback! I've opened #117, and I'm closing this PR.

@noahfpf noahfpf closed this Jul 22, 2024
@noahfpf noahfpf deleted the job-sync branch July 22, 2024 21:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants