Jobs not pushed when using sidekiq-status #412

Closed
ky1007 opened this issue Jul 27, 2019 · 25 comments · Fixed by #455

@ky1007

ky1007 commented Jul 27, 2019

Describe the bug
Job is deleted but lock remains (the Sidekiq queue the worker adds jobs to is completely clear and there are no scheduled jobs in Sidekiq for the worker or any retries for the worker either)

Expected behavior
Expect the lock to be cleared

Current behavior
The lock seems to still exist. Running ExportDataWorker.perform_in 1.hour, some_id returns nil instead of a job id

Worker class

class ExportDataWorker
  include Sidekiq::Worker
  include Sidekiq::Status::Worker

  sidekiq_options queue: :export, lock: :until_executed
  def perform(some_id); end
end

Also kinda curious--how does the locking and unlocking implementation work?

@mhenrixon mhenrixon added the bug label Jul 27, 2019
@mhenrixon mhenrixon self-assigned this Jul 27, 2019
@mhenrixon
Owner

mhenrixon commented Jul 27, 2019

Hi @ky1007 this is a known problem.

See https://github.com/mhenrixon/sidekiq-unique-jobs#cleanup-dead-locks for how to clean up some of the situations you might run into.

You can also refer to https://github.com/mhenrixon/sidekiq-unique-jobs#sidekiq-web for how to manually find and remove locks.

Lastly, the locking is complex. It is hard to describe because some locks are treated differently, but in the case you have above it works as follows:

  1. ExportDataWorker.perform_async(1) uses the Sidekiq::Client.push interface, which runs through the client middleware you have configured, and a lock is created.
  2. The Sidekiq processor pops your job off the queue and sends it to your worker. This code is wrapped in a server middleware that should take care of deleting the lock if everything was ok.
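The two steps above can be modelled with a toy, in-memory sketch. Everything here is hypothetical: `LOCKS` stands in for Redis, `push` for the client middleware, and `process` for the server middleware. It is not the gem's code, only an illustration of why a lock can outlive its job.

```ruby
require 'set'

# In-memory stand-ins (hypothetical names, not part of sidekiq-unique-jobs).
LOCKS = Set.new
QUEUE = []

# "Client middleware": refuse to enqueue when a lock for the same job exists.
def push(job)
  digest = "#{job[:class]}:#{job[:args].inspect}"
  return nil unless LOCKS.add?(digest) # Set#add? returns nil if already present

  QUEUE << job.merge(digest: digest)
  digest # stands in for the job id
end

# "Server middleware": run the job and release the lock only on success.
def process(job)
  yield job
  LOCKS.delete(job[:digest])
  true
rescue StandardError
  false # the lock is left behind
end

first  = push(class: 'ExportDataWorker', args: [1]) # lock created, job queued
second = push(class: 'ExportDataWorker', args: [1]) # nil: the lock already exists

QUEUE.clear                                         # job deleted, lock untouched
third  = push(class: 'ExportDataWorker', args: [1]) # still nil: stale lock
```

In this model the only path that removes a lock is a successful `process` call, so deleting the job from the queue, or a job that raises, leaves the lock in place and later pushes return nil.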

A couple of things can go wrong. If your Sidekiq worker is restarted or paused, the lock can hang. It can also hang in some exceptional cases. This won't be a problem when v7 is released, but I have a hard time carving out enough time to finish it.

@mhenrixon mhenrixon pinned this issue Jul 27, 2019
@KevinColemanInc

KevinColemanInc commented Jul 29, 2019

@mhenrixon

should take care of deleting the lock if everything was ok.

  1. If I wanted to manually add a release/unlock to the end of the job, how would I do that? Or it'd be great if you could just point me to the code where the release happens.

  2. I'm trying to understand the new documentation vs. the old. I am using lock_expiration, expecting that the lock will never last longer than the given time. But it seems master just has lock_timeout and lock_ttl, which behave a bit differently?

  3. Is there a way to expire the lock after a certain amount of time passes from when the job starts? The lock stays forever while the job is in the queue, but once the job has started processing, the lock will expire in 15s. So even if the job fails to release the lock when it's finished, the expiration will release it?
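The behaviour asked about in question 3 can be sketched as follows. This is a hypothetical in-memory model (the class name and the `ttl_after_start` parameter are made up for illustration), and it says nothing about whether the gem's lock_ttl or lock_timeout options behave this way.

```ruby
# Hypothetical lock whose expiry clock only starts when the job starts.
class ExpiringLock
  def initialize(ttl_after_start:)
    @ttl = ttl_after_start
    @expires_at = nil # nil means "held with no expiry" (job still queued)
  end

  # Called when a worker picks the job up.
  def start!(now = Time.now)
    @expires_at = now + @ttl
  end

  # Held indefinitely while queued; held until the deadline once started.
  def locked?(now = Time.now)
    @expires_at.nil? || now < @expires_at
  end
end

lock = ExpiringLock.new(ttl_after_start: 15)
lock.start!          # the 15-second countdown begins here
lock.locked?         # true until 15 seconds have passed
```

With this shape, a job that never releases its lock explicitly would still stop blocking new pushes once the deadline passes.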

@ky1007
Author

ky1007 commented Jul 29, 2019

Thanks for your response @mhenrixon!

I already have the Sidekiq Web extension set up so I can find and delete locks, but how can I find the unique digest for my deleted job (so I know which lock to delete)?

I noticed you had a #create_digest method that uses Digest::MD5.hexdigest(Sidekiq.dump_json(digestable_hash)) (in the v6.0.13 we use) to create the digest so I tried to recreate the digest via:

foo = Sidekiq.dump_json({ 'class': 'ExportDataWorker', queue: 'export', 'unique_args': [some_id]})
"*#{Digest::MD5.hexdigest(foo)}"

Despite searching in Sidekiq web with what the last line returns, I couldn't find any locks that matched.

Just wondering if perhaps I missed something or am re-creating the digest incorrectly.
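One thing that can be checked without the gem is how sensitive such a digest is to its input. The sketch below assumes Sidekiq.dump_json behaves like JSON.generate and uses a hypothetical helper, `digest_for`; it does not reproduce the gem's own key layout or prefix, which may differ from what is shown in this thread.

```ruby
require 'json'
require 'digest'

# Hypothetical helper: MD5 of the JSON form of a hash.
def digest_for(hash)
  Digest::MD5.hexdigest(JSON.generate(hash))
end

a = digest_for('class' => 'ExportDataWorker', 'queue' => 'export', 'unique_args' => [1])
b = digest_for(class: 'ExportDataWorker', queue: 'export', unique_args: [1])             # symbol keys
c = digest_for('queue' => 'export', 'class' => 'ExportDataWorker', 'unique_args' => [1]) # keys reordered
d = digest_for('class' => 'ExportDataWorker', 'queue' => 'export', 'unique_args' => ['1']) # string argument

a == b # true:  symbol and string keys serialize identically
a == c # false: key order changes the JSON and therefore the digest
a == d # false: 1 and "1" serialize differently
```

So a hand-built digest only matches if the key order and the exact argument types are the same as the ones used when the lock was created.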

@KevinColemanInc

@ky1007

Are you using any sort of error reporting system, like Bugsnag or Sentry? We use Sentry, and I am noticing that if a job throws an exception, the unique keys aren't being removed.

@mhenrixon
Owner

mhenrixon commented Aug 13, 2019 via email

@jacquescrocker

jacquescrocker commented Aug 17, 2019

I think I'm having this issue too. My async jobs are returning nil and not doing anything. Unique Digests is empty. Deleting SidekiqUniqueJobs got things running again

sidekiq (5.2.5)
sidekiq-unique-jobs (6.0.12)

@KevinColemanInc

Unique Digests is empty

I noticed that the web UI for unique digests doesn't actually show all of the unique digests. If you add redis-browser to your project, you will still see digests in the db.

@mhenrixon
Owner

Thanks for the report @KevinColemanInc. It would be great if you could provide an example project where this always happens. If not, I'll look into it as soon as I wrap up v7.

@jacquescrocker can you provide some more details about your setup? Again, an example project that shows the problem would be super helpful.

@mhenrixon
Owner

mhenrixon commented Oct 5, 2019

@KevinColemanInc @jacquescrocker @ky1007 you might want to take v6.0.15 for a spin. It fixes an old bug where duplicate jobs were allowed because the logger returned true. It also ensures compatibility with Sidekiq v6.0.1.

@mhenrixon mhenrixon unpinned this issue Oct 5, 2019
@KevinColemanInc

The Ruby version we are using for this project is 2.4, which isn't supported by Sidekiq 6, so it may be a while before we are able to try out those changes :-/

@mhenrixon
Owner

Oh, you don’t need Ruby 2.5 for this gem though. The gem is compatible with Ruby 2.4, 2.5 and 2.6.

The new version includes some locking fixes on top of Sidekiq 6.0.1 compatibility but doesn’t remove compatibility with MRI 2.4 in any way.

@KevinColemanInc

Oh, I thought the latest version was only supporting Sidekiq 6.0. I will try it out!

@KevinColemanInc

It is definitely still happening on the new version :-/ I will try to find time to create a test repo for you.

@unlimit

unlimit commented Oct 21, 2019

I ran into the same issue, but in my case the reason was the following.
Instead of passing the class constant here:
Sidekiq::Client.push('class' => MyWorker, 'args' => [1, 2, 3])
I passed the class name as a string:
Sidekiq::Client.push('class' => 'MyWorker', 'args' => [1, 2, 3])

The job is performed in both cases, but the lock is removed only when the class constant is passed.
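The thread does not establish why the two forms behaved differently on v6.0.13. One plausible class of bug is code that treats item['class'] as a constant in one place and as a string in another. A minimal sketch of normalizing both forms, using hypothetical helpers (`worker_class`, `worker_name`) and a stand-in `MyWorker` class that are not taken from the gem:

```ruby
class MyWorker; end # stand-in for a real Sidekiq worker

# Hypothetical helper: resolve the 'class' entry to a constant either way.
def worker_class(item)
  klass = item['class']
  klass.is_a?(Class) ? klass : Object.const_get(klass)
end

# Hypothetical helper: always produce the class name as a string.
def worker_name(item)
  item['class'].to_s
end

by_const  = { 'class' => MyWorker,   'args' => [1, 2, 3] }
by_string = { 'class' => 'MyWorker', 'args' => [1, 2, 3] }

MyWorker == 'MyWorker'                                # false: constant and string are not equal
worker_class(by_const) == worker_class(by_string)     # true after normalization
worker_name(by_const)  == worker_name(by_string)      # true after normalization
```

Comparing or looking up the raw value without a step like this would treat the two pushes as different workers even though they run the same job.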

@mhenrixon
Owner

@unlimit thanks! I will make sure the lock is always a string!

@mhenrixon
Owner

@unlimit I can't replicate the problem you describe with passing a constant vs. passing a string. I tried both and my tests pass nicely. Could you provide a repo that showcases this problem?

@unlimit

unlimit commented Nov 26, 2019

@mhenrixon unfortunately I can't provide you with a repo, but I can provide you with details:
ruby-v '2.5.1'
sidekiq '6.0.0'
sidekiq-unique-jobs '6.0.13'

Could you please show your test?

@mhenrixon
Owner

@mhenrixon mhenrixon changed the title from "Job is deleted but lock remains" to "Jobs not pushed when using sidekiq-status" Nov 26, 2019
@KevinColemanInc

I am definitely still experiencing this problem as well. I don't know if it makes sense to close this issue yet.

@mhenrixon
Owner

@KevinColemanInc I’m sorry I wasn’t clear enough about why I closed the issue.

These issues are or will be fixed in v7, which is also more robust. As much as I’d like to support v6, I just don’t have enough time in a day to do both.

I’ve rewritten the locking mechanism almost completely, added features and am working on some guides for using the various features.

I’d urge you to try out v7 and open new issues for any problems you encounter.

Does that make sense at all? I get that it probably isn’t good enough, but there is always Sidekiq Pro and Enterprise that you can use.

I hate that I don’t have enough time to focus on this gem. Paid work and family must take priority for me.

@unlimit

unlimit commented Nov 27, 2019

@mhenrixon please check this repo: https://github.com/unlimit/sidekiq_uniq_example. You will find details in the readme.
I also think you shouldn't close this issue. v7 may have it too!

@mhenrixon
Owner

@unlimit upgrade to v6.0.18. The problem is no more

@mhenrixon
Owner

mhenrixon commented Nov 28, 2019

@KevinColemanInc same for you and @ky1007. Sidekiq changed a little in v6 :) just give v6.0.18 a whirl

@unlimit

unlimit commented Nov 28, 2019

Yes, the lock was properly removed with v6.0.18! Thank you!

@pdkproitf

I'm having the same problem with sidekiq-unique-jobs (7.1.2) and sidekiq-unique-jobs (7.1.5).
When I initiated a job, it returned nil. The lock stays there forever.

[Screenshot attached: 2021-08-13 at 12:05:31 PM]
