Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use the job class as the default concurrency key if none is provided #1145

Merged
merged 7 commits into from
Feb 1, 2024

Conversation

Earlopain
Copy link
Collaborator

@Earlopain Earlopain commented Nov 7, 2023

Setting these two values to true will result in the job class name or queue name being part of the concurrency key without needing to explicitly specify them. This mimicks the settings available in the sidekiq-unique-jobs gem.

Both values are false by default to preserve backwards compatibility. In sidekiq-unique-jobs it's true by default, perhaps that is something that can be changed in a future major version.

Maybe a config.good_job setting can be added to change this for all jobs right now?

@bensheldon
Copy link
Owner

I'm a little reluctant on this, just because I feel like it would be easiest for me to manage "sensible default" and "full control" instead of a gradient of options.

Based on our convo in #1110, I think I'd like to focus just on "sensible default". It looks like sidekiq-unique-jobs uses queue-class-args as the default. What if we just did that?

@Earlopain
Copy link
Collaborator Author

Hm, I guess I could do that as well. It is basically what sidekiq enterprise is doing from what I understand.

Though for me the job class will always be part of the concurrency key and I'd rather opt out instead of having to again manually specify it when I need to set custom concurrency. The case of not scoping the key to the job seems more like the exception than the rule. I have no evidence to back that up though.

I'd still see your take as an improvement though. There must be something to it if software like sidekiq enterprise went down the same path. It would be very easy to document in the readme what is happening and give examples on what to potentially do when using your own arguments for the key.

One thing sidekiq has going for it is that it only allows trivial types to perform_later, anything that survives a normal json dump/load trip. Serialized AC job args can easily get quite large. Is that a concern when doing ActiveJob::Arguments.serialize(arguments)?

Am I right in thinking that you don't consider this a breaking change and more of a bugfix since it only affects cases where no key is specified at all?

@Earlopain Earlopain force-pushed the concurrency-unique-across branch from d006a22 to 19c7f93 Compare November 9, 2023 12:38
@Earlopain Earlopain force-pushed the concurrency-unique-across branch from 19c7f93 to 1f79e57 Compare November 9, 2023 12:46
@Earlopain Earlopain changed the title Implement unique_across_jobs/queues for concurrency Implement a default concurrency key if none is provided Nov 9, 2023
Copy link
Owner

@bensheldon bensheldon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing sidekiq has going for it is that it only allows trivial types to perform_later, anything that survives a normal json dump/load trip. Serialized AC job args can easily get quite large. Is that a concern when doing ActiveJob::Arguments.serialize(arguments)?

What do you think about passing the serialized arguments through a fasthash function? like OpenSSL::Digest::MD5.hexdigest(ActiveJob::Arguments.serialize(arguments)) if arguments.any?. That would make them constant length.

Am I right in thinking that you don't consider this a breaking change and more of a bugfix since it only affects cases where no key is specified at all?

I think so; it was kind of unspecified before.

lib/good_job/active_job_extensions/concurrency.rb Outdated Show resolved Hide resolved
* Make the default concurrency key easibly monkey-patchable
* Support a nil concurrency key callable
@Earlopain
Copy link
Collaborator Author

What do you think about passing the serialized arguments through a fasthash function? like OpenSSL::Digest::MD5.hexdigest(ActiveJob::Arguments.serialize(arguments)) if arguments.any?. That would make them constant length.

I don't particularly like that since it would prevent me from learning about the original concurrency key with job.good_job_concurrency_key like the readme says.

My main point when asking about this was performance. I'm not even sure anymore if my concern was valid in the first place since nothing is stopping you right now from just passing a huge hash. It would be somewhat larger because of the serialization roundtrip with _aj_symbol_keys and all that jazz but not massivly.

@Earlopain
Copy link
Collaborator Author

@bensheldon is this something you are even interested in? Don't feel pressured into accepting some version of this if you don't feel strongy about this yourself, just because I opened this PR.

@bensheldon
Copy link
Owner

@Earlopain sorry, I've been internally waffling on this. I do want a default value, I'm just very conflicted about what the default value should be and imagine it will be difficult to change in the future. Hence my foot dragging 😞

I think I'm struggling because the simple value (just the job class name) is also the broadest, whereas the most complex value (class, queue, args) is the most targeted. I also share your concern about key length.

...so leaning towards just doing Job Class 🤔 But I can finish it out and not leave you hanging on this. I really do appreciate the code and the dialogue.

@Earlopain
Copy link
Collaborator Author

I understand, that's totally fine. I don't want to rush, take the time you need to think this through. Knowing that I will approach this differently.

I don't hate the idea of just using the job class. It's the most conservative approach to make things work instead of just letting everything through, if the user needs more relaxed concurrency they can do so themselves. If it's called out sufficiently in the readme then I wouldn't see much issue with this.

I took a look at what sidekiq-unique-jobs is doing in regards to large arguments, and it seems to just hash them exactly like you suggested. https://github.com/mhenrixon/sidekiq-unique-jobs/blob/684ce0bb7c84e1cf5431600a8dbc451cbf2abce8/lib/sidekiq_unique_jobs/lock_digest.rb#L54
Per the readme, good_job_concurrency_key is supposed to be a debugging function, for tests and the like. My concern about this not being readable anymore when hashing happens can be alleviated by simply computing this without the hashing (like it is right now) and move hashing that to another internal method. In hindsight that's pretty obvious.

Feel free to take this PR over and update it when you reached a conclusion. I can do it myself of course but no need to play telephone unless you want to.

@bensheldon
Copy link
Owner

Just helped someone debug this on Rails Link Slack and their (unmet) expectation is that the default is just the job class. I think I'll make that the default and document how to be more granular with arguments. Thanks for the work and back and forth on this 🙇🏻

@bensheldon bensheldon changed the title Implement a default concurrency key if none is provided Use the job class as the default concurrency key if none is provided Jan 31, 2024
@bensheldon bensheldon added the bug Something isn't working label Feb 1, 2024
@bensheldon bensheldon merged commit ad3f00f into bensheldon:main Feb 1, 2024
20 checks passed
@Earlopain Earlopain deleted the concurrency-unique-across branch February 1, 2024 07:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Development

Successfully merging this pull request may close these issues.

2 participants