Missing info from README #6

Closed
manuelmeurer opened this issue Nov 13, 2012 · 11 comments

@manuelmeurer

I just found this project while googling how to make sure certain Sidekiq jobs are not executed multiple times. sidekiq-unique-jobs seems to do exactly that... awesome!

I think there is some info missing in the README though, specifically:

  • Are worker arguments taken into account? So if I have a HardWorker and I call HardWorker.perform_async('bob', 5) multiple times, that job should obviously only be queued once. But what if I call HardWorker.perform_async('bob', 5) and HardWorker.perform_async('jane', 10)? Are both those jobs queued? I suppose so but I'm not 100% sure.
  • Why is the expiration parameter needed? Does it mean that by default the same job cannot be enqueued again up to 30min after it was removed from the queue?

I think both these points (and possibly more) should be explained in the README.
I'm happy to prepare a pull request for it, if you answer my questions in here.

Thanks for your work on this!

@philostler
Contributor

+1

@varunlalan

+1

@mhodgson

mhodgson commented Apr 9, 2013

+1

@abacha

abacha commented Apr 25, 2013

+1

@mhenrixon
Owner

Ok, so be gentle with me while I try to explain. I'll be the first one to admit I suck at both documentation and explaining this. Work in progress, so to speak.

  • Worker arguments are taken into account. You can also select which arguments count towards uniqueness by specifying a lambda or a class method to handle this (that option might have been added after you asked your question, by the way). See the sketch below this list.
  • The expiration doesn't need to be set. Think of it as a simple timeout for uniqueness: if you set it to two hours, no jobs with the same arguments (or unique arguments) will be scheduled until that time has passed.
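
A minimal sketch of both points. The unique_args lambda matches what is described above, but EmailWorker, its arguments, and the exact option name are made up for illustration; check the README for your version of the gem:

class EmailWorker
  include Sidekiq::Worker
  # unique: true turns on uniqueness checking.
  # unique_args picks which arguments count towards uniqueness;
  # here only the first argument (the user id) matters.
  sidekiq_options unique: true, unique_args: ->(args) { [args.first] }

  def perform(user_id, message)
    # ...
  end
end

EmailWorker.perform_async(1, "hi")  # enqueued
EmailWorker.perform_async(2, "hi")  # enqueued, different user id
EmailWorker.perform_async(1, "bye") # dropped as a duplicate of the first job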

Unfortunately it seems like workers calling nested workers still causes jobs to be duplicated (like in #10). If anyone wants to take a stab at reproducing the problem in a test, we should be able to fix it.

@nberger

nberger commented Jun 20, 2013

I don't understand what exactly the expiration parameter does, either. Does it only affect jobs scheduled with #perform_in, but not with #perform?

If it affects #perform, I think a better default should be 0, instead of the current 30 * 60 (30 minutes).

@mhenrixon
Owner

@nberger it only affects jobs scheduled with perform_in or perform_at.
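
In other words, something like the following (ReportWorker is hypothetical, and unique_job_expiration is the option name used later in this thread):

class ReportWorker
  include Sidekiq::Worker
  # The expiration only matters for scheduled jobs: a job scheduled in the
  # future holds its uniqueness lock for this window (two hours here).
  sidekiq_options unique: true, unique_job_expiration: 2 * 60 * 60

  def perform(report_id)
    # ...
  end
end

ReportWorker.perform_in(60 * 60, 42) # scheduled, takes the lock
ReportWorker.perform_in(10 * 60, 42) # dropped while the lock is held
ReportWorker.perform_async(42)       # immediate jobs are unaffected by the expiration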

@astjohn

astjohn commented Feb 27, 2014

@mhenrixon I also wouldn't mind a brief explanation in the README of exactly how the uniqueness is established. It would be nice not to have to dig through the code to ensure locking is performed properly. I would much rather take your word for it!

That said, I noticed that setex is used to increment a counter on a key that is the arguments to the worker. Is that correct? I noticed a pattern in the Redis documentation using setnx (whose documentation now recommends plain old set to implement a locking system instead).

A small explanation of the locking procedure and how it is thread safe would definitely help me, and I'm sure many others, gain even more confidence in using the gem. Any chance you could clarify it for me? Thanks!
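
For context, the pattern the Redis docs now recommend looks roughly like this with redis-rb. This is a generic sketch of SET with NX and EX, not the gem's actual internals; the key layout is made up:

require "digest"
require "json"
require "redis"
require "securerandom"

redis = Redis.new
args_digest = Digest::MD5.hexdigest(["bob", 5].to_json)
lock_key    = "uniquejobs:HardWorker:#{args_digest}" # hypothetical key layout
token       = SecureRandom.hex(16)

# SET with nx: (only set if the key is absent) and ex: (a TTL in seconds)
# acquires the lock atomically, replacing the old SETNX + EXPIRE dance:
if redis.set(lock_key, token, nx: true, ex: 30 * 60)
  # Lock acquired: safe to enqueue the job.
else
  # A job with the same arguments already holds the lock: skip enqueueing.
end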

@tyetrask

Hello everyone,

I had a few questions about this gem and this issue thread seems to be somewhat centered around my questions.

Essentially, if I have a worker ("DoStuff") and I queue a job for that worker with the following:

DoStuff.perform_async("with unique argument")

and then I run that same command again with the same arguments, I don't want the second instance of the job added to the queue if the first instance has not completed yet.

The way I've read the documentation, I expect the job not to be duplicated no matter how long it has been if a job with the same worker, queue, and arguments is still waiting to be processed.

What I'm currently experiencing is that the job won't add to the queue if it's within the expiration time. However, if the first job has not been completed, but the unique job expiration time has passed (10 minutes, in my case) and I run it again, it does add the duplicate job even if the first one has not completed!

Is this the expected behavior of the gem? If not, is there a configuration option I am missing?

Here is an example of a worker and the options I'm using:

class DoStuff
  include Sidekiq::Worker
  sidekiq_options queue: "queue_1",
                  unique: true,
                  unique_job_expiration: 60 * 10 # uniqueness lock expires after 10 minutes

  def perform(arguments)
    # Do some unique things.
  end
end

Thanks,
Tye

@mhenrixon
Owner

@tyetrask yeah, that is sort of expected. I suggest you try something like sidekiq-throttler instead; that should better help you achieve what you want. I am opening an issue for deciding on how to proceed with this.

@tyetrask

Hey @mhenrixon, thanks for the information! We needed to move forward with our project, so we ended up writing middleware that behaves the way we needed. I appreciate all of the work on this and will be keeping an eye on it in the future. Thanks again!
