Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert many changes that break systemd #308

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

jclusso
Copy link
Contributor

@jclusso jclusso commented Jul 26, 2022

I'm the original author of the sidekiq multi process code. I think I have some useful insights into what's happened since my PR was merged.

In #296, an issue was opened about not supporting older versions of systemd that did not support multiple process service naming with the @ symbol. The solution for this was to make it so that when sidekiq_processes is set to 1, it uses the sidekiq_service_unit_name without the @.

However, the changes implemented in #297 changed the sidekiq_service_file_name method to never have the @ sign, and they changed the sidekiq_service_unit_name method to have some conditional logic that doesn't always work.

As a result of the changes in #297, #298 was created, which led to #299. #299 modified the sidekiq:install task's functionality in an incorrect way, changing it to create multiple sidekiq@[number].service files, which is not how systemd multiple process support is supposed to work. The purpose of the @ sign in systemd service file names is that everything after the @ is passed in as an argument. Therefore, the correct service file name for a multiple process service is
"sidekiq@.service". This allows for the full functionality of systemd service arguments. For example, ranges are supported, so you can run a command like systemctl restart sidekiq@{1..3} and that single command will execute for all 3 processes. Even though creating multiple files does technically work, I don't believe it is correct, and there is no benefit to having multiple identical service files when you only need one.

In order to address the issues created by the changes mentioned above, #300 was created. However, it did not fix everything, which led to #302 and #304, which are further attempts to fix issues created by these changes. Ultimately, you end up with a whole series of changes attempting to patch issues going back to #296. In my opinion, things would be simpler and easier if the code went back to behaving closer to how things were around #296.

The code in this pull request does that. When sidekiq_processes is set to one, the systemd unit file will be named "sidekiq.service" and all commands will run on the service named "sidekiq". When sidekiq_processes is greater than one, the systemd unit file will be named "sidekiq@.service" and all commands will run either a systemd multi-process command, like sidekiq@{1..3}, or, in the case of the restart command, it will run on each numbered process in turn (first sidekiq@1, then sidekiq@2, etc.).

Finally, in regards to issues around backward compatibility, you can simply run sidekiq:uninstall before upgrading and sidekiq:install after upgrading. This makes more sense than over complicating the code.

@legendetm
Copy link

I also tested this through and this was working as expected.

For me only sidekiq:uninstall was not working, because the user does not have sudo permission, but this is not directly related with this PR.

I was also trying to setup multiprocess Sidekiq services today and and it was not working as expected. Started to read about how systemd works @ and tried to fix the same problem.

Thankfully I noticed your PR that resolved the multiple process problem. Thank you.

Some resources that I was reading to understand how the systemd with @ is working

@jclusso
Copy link
Contributor Author

jclusso commented Jul 26, 2022

@legendetm I will be honest I don't know much about the permissions thing but I know this option exists sidekiq_service_unit_user which maybe can help. I just kept existing stuff like that in place when I added multi process support.

@legendetm
Copy link

@jclusso it would be nice to update the documentation that before changing the sidekiq_processes count, you should run sidekiq:uninstall and after the changes sidekiq:install again. This is needed when upgrading from 1 process to multiple processes or when downgrading processes. When you already have multiple processes (2 or more) and adding more processes, then just sidekiq:install is enough.

And regarding permission issue, I also think that sudo should be only used when sidekiq_service_unit_user is system. But as I said, not directly related to this PR.

@jclusso
Copy link
Contributor Author

jclusso commented Jul 27, 2022

@legendetm are you referring to the usage of sudo for sidekiq:install and sidekiq:uninstall? Yes the documentation could be updated to include that. I was wondering if maybe there should be a sidekiq:configure or sidekiq:reconfigure that really just runs uninstall/install under the hood to make it easier.

@legendetm
Copy link

@jclusso I meant we should mention in documentation that you need to run sidekiq:uninstall with before upgrading (before changing the sidekiq_processes count) and sidekiq:install after the changes are done. Something like you mentioned in your PR

Finally, in regards to issues around backward compatibility, you can simply run sidekiq:uninstall before upgrading and sidekiq:install after upgrading. This makes more sense than over complicating the code.

Other option would be sidekiq:configure logic, but this may be complicated when decreasing workers, then you need search all enabled services with or without arguments, but maybe there is already a command for this. But this would definitely make the changing the sidekiq_processes count easier. I'm not a maintainer, but I would prefer separate improvement PR for this when this is merged.

Regarding the sudo part, this is unrelated to this PR. Just in my case the deploy user has no sudo privileges and therefore uninstall fails, but this is tracked in separate issue and I can make PR latter if needed.

…o users know they need to run the uninstall and install commands.
@jclusso
Copy link
Contributor Author

jclusso commented Oct 11, 2022

@seuros where do we stand on getting this merged or are you integrating this into other changes?

@jclusso
Copy link
Contributor Author

jclusso commented Nov 29, 2022

Hey @seuros, just checking in to see if you've had a chance to review this.

@seuros
Copy link
Owner

seuros commented Nov 29, 2022

Can you try master branch or the prerelease?

@jclusso
Copy link
Contributor Author

jclusso commented Nov 29, 2022

@seuros I just reviewed the v3.0.0.alpha.1 and it does not appear to be correctly using Systemd's multiprocess support. Did you get a chance to review everything I wrote above?

@jclusso
Copy link
Contributor Author

jclusso commented Jan 26, 2023

@seuros checking in on this again since it's been quite some time.

@zedtux
Copy link

zedtux commented Mar 16, 2023

@seuros could you please help us to solve this issue impacting more and more people please? 🙏

@jclusso
Copy link
Contributor Author

jclusso commented Apr 4, 2023

@seuros checking on this again...

@jclusso
Copy link
Contributor Author

jclusso commented Sep 28, 2023

@seuros any chance you've gotten to review this?

@seuros
Copy link
Owner

seuros commented Sep 29, 2023

I will fix this gem in this weekend, do you mind if i ping you for testing ?

@jclusso
Copy link
Contributor Author

jclusso commented Sep 29, 2023

I will fix this gem in this weekend, do you mind if i ping you for testing ?

No problem - my availability this weekend may be limited, but I'll try and respond!

@jclusso
Copy link
Contributor Author

jclusso commented Feb 29, 2024

@seuros is there any update on this?

@jclusso
Copy link
Contributor Author

jclusso commented Aug 20, 2024

@seuros is there going to be an update on this?

Inject sidekiq:stop after deploy:reverted
@seuros
Copy link
Owner

seuros commented Aug 21, 2024

I will look into it.

@jclusso
Copy link
Contributor Author

jclusso commented Jan 22, 2025

@seuros any update on this?

@seuros
Copy link
Owner

seuros commented Jan 23, 2025

Did you test this with the latest release, i can release a new version then.

@jclusso
Copy link
Contributor Author

jclusso commented Jan 23, 2025

Did you test this with the latest release, i can release a new version then.

@seuros I addressed that the last time you asked in this thread here. Nothing has changed since then from what I can tell.

I'd ask that you review my initial comment in this PR which explains everything.

@seuros
Copy link
Owner

seuros commented Jan 23, 2025

@jclusso , It just because i see conflicts with the master branch. let rebase this PR and get it merged. I will release it today if this PR is ready.

@legendetm
Copy link

legendetm commented Jan 23, 2025

Current implementation creates separate service file for every Sidekiq configuration file and it does not contain @ symbol anymore, as it used previously.

The correct way to support Systemd's multiprocess, is to create only one service file with @ to pass argument to the service file. Following is copied from documentation

Unit names can be parameterized by a single argument called the "instance name". The unit is then constructed based on a "template file" which serves as the definition of multiple services or other units. A template unit must have a single "@" at the end of the unit name prefix (right before the type suffix). The name of the full unit is formed by inserting the instance name between "@" and the unit type suffix. In the unit file itself, the instance parameter may be referred to using "%i".

Example: you have one service file app-sidekiq@.service and then use app-sidekiq@argument.service syntax to start multiple processes with different arguments:

  • app-sidekiq@1.service, app-sidekiq@2.service and app-sidekiq@3.service, where the 1, 2, 3 can be used as %i in service template. Alternative can be use app-sidekiq@{1..3}.service
  • app-sidekiq@sidekiq-1.service, app-sidekiq@sidekiq-2.service, app-sidekiq@sidekiq-3.service where sidekiq-1, sidekiq-2, sidekiq-3 can be used as %i in service template. If just numbers are different, then alternative usage app-sidekiq@sidekiq-{1..3}.service. If they are different, then alternative usage app-sidekiq@{sidekiq-1,sidekiq-2,sidekiq-3}.service. See Brace Expansion.

I think that the version 3 is improvement from initial implementation, as it's not using incorrect @ symbol in filename, and having just one sidekiq_config_files configuration is better than previous sidekiq_processes and sidekiq_config.

@jclusso
Copy link
Contributor Author

jclusso commented Jan 23, 2025

I think that the version 3 is improvement from initial implementation, as it's not using incorrect @ symbol in filename, and having just one sidekiq_config_files configuration is better than previous sidekiq_processes and sidekiq_config.

@legendetm you've confused me here. Are you suggesting the way things are now is better?

@jclusso
Copy link
Contributor Author

jclusso commented Jan 23, 2025

@legendetm I think I understand what you're saying. The problem with eliminating the option like this is that at scale it becomes a mess. We have sidekiq_processes set to 12 which means we'd need to duplicate the config 12 times.

@seuros
Copy link
Owner

seuros commented Jan 23, 2025

@jclusso, can you make a mock config in the https://github.com/seuros/capistrano-example-app ? just create a new env. https://github.com/seuros/capistrano-example-app/tree/main/config/deploy

That will help document and discuss your cases.

@legendetm
Copy link

I meant, that version 3 is better than initial implementation that was before you PR.

Form current implementation I like that it's enough just configuring sidekiq_config_files, as it's more clear, than setting processes count + config files. It's more visible how many processes there are and which configuration files they are using. But when all configurations are using some config, then processes count can be useful and by default it can fallback to sidekiq_config_files array size.

I would prefer single service file with argument usage, but I think probably current version 3 would also be fine, if it supports same file multiple times in array (haven't tried it yet).

But when looking code, then it seems current version 3 does not support same file multiple times in configuration file, as it adds the config filename without extension to service file, not using the index https://github.com/seuros/capistrano-sidekiq/blob/d73ad3c66ab60e549e0533fc3397f5e4bfb45eaf/lib/capistrano/tasks/systemd.rake#L130C7-L130C88

In our case, we have a lot of deploy stages with different Sidekiq config files combinations.

  • [sidekig1.yml, sidekig2.yml]
  • [sidekig1.yml, sidekig2.yml, sidekig3.yml]
  • [sidekig1.yml, sidekig2.yml, sidekig3.yml, sidekig1.yml, sidekig2.yml] (repeating same configs, to have more instances, because each file use only one core).

Also these files in our case are in config/sidekiq/ folder, and sidekiq_config is [config/sidekiq/sidekiq1.yml, ...], when using the code that is based on your PR.

@jclusso
Copy link
Contributor Author

jclusso commented Jan 23, 2025

@seuros this isn't just code I can rebase because you've accepted bad PRs that broke stuff and then moved so much around in 3.0. Essentially all of my work on #279 is either broken or lost in your refactorings it seems.

There were key things I added that now are lost.

  • Properly using System's multi process support. This has been replaced with a different way of running multiple processes that is a mess. Instead of being able to have one systemd process running you would have one for each config. If you had sidekiq_processes set to 12, now you'd need 12 configs instead of just being able to reuse the same one. You'd also have 12 systemd services running.
  • Restarting Sidekiq gracefully was removed. I had added a quiet task and a restart task that waited for quiet for 30 seconds.

@seuros
Copy link
Owner

seuros commented Jan 23, 2025

@jclusso, I apologize for merging those PRs without fully considering their impact. The implementation seemed to work at the time, but we lack testing in this gem, which led to these issues. This is precisely why I created a separate repo with a sample app , to manually test different scenarios and ensure stability.

Regarding the number of services, whether it’s 1 or 10, I’ll spin up an EC2 instance to test the deployment thoroughly and address the problems. Let’s work together to fix this and restore your approach for gracefully terminating Sidekiq, including the quiet task and restart task with a 30-second wait. I’ll prioritize getting a release out as soon as possible.

@jclusso
Copy link
Contributor Author

jclusso commented Jan 23, 2025

@seuros I'm going to start by trying to clone the latest version and see if I can try and find my way around where everything moved. Merging or rebasing the code at this point is too confusing. Appreciate you helping resolve this!

@jclusso
Copy link
Contributor Author

jclusso commented Jan 23, 2025

@seuros I'm concerned I don't know enough about why things were changed at this point to really fix this.

One example is, previously the sidekiq.service.capistrano.erb mimic'd the official one included in the Sidekiq repo. Now that is different, but I don't know why since there is no real reason I can imagine that we'd want to deviate from the template provided.

Various configuration options were removed from systemd, but linger in the monit config. Is this an accident? Is monit even supported or used by anyone? It was removed from the documentation, but parts of it are lingering around.

I'm pretty lost based on the current state of things.

@seuros
Copy link
Owner

seuros commented Jan 23, 2025

The current template includes all the settings from the official Sidekiq template(correct me if I'm wrong), minus the comments, and adds ERB to allow for dynamic configuration. If any configuration options were removed, it was because they were either deprecated in recent versions of Systemd or replaced with configurable alternatives. For example, Environment=MALLOC_ARENA_MAX=2 was removed because it’s now configurable.

Regarding Monit, I agree it should likely be removed, as it seems to be unused and is no longer documented. If there are lingering parts, we should clean those up as well.

To help resolve this, if you can commit a use case of your setup in the other repo, I’d be happy to assist in fixing or cleaning up the gem to better support your needs.

@jclusso
Copy link
Contributor Author

jclusso commented Jan 24, 2025

Based on what you're saying, I think the first step is to clean the gem up before we add stuff back in. I can make a PR for some stuff to remove later or tomorrow. I'll also see if I can get to making a template app, but I rather try and put some time towards cleaning up.

As for the Sidekiq config, I noticed it was largely different. To make it easy to keep up to date with Sidekiq's changes, I think it would be ideal to leave all the comments and stuff in. This also allows others to potentially modify the template for their own use case if they need to.

Looking forward to getting this gem back on track!

@jclusso
Copy link
Contributor Author

jclusso commented Jan 24, 2025

The current template includes all the settings from the official Sidekiq template(correct me if I'm wrong), minus the comments, and adds ERB to allow for dynamic configuration. If any configuration options were removed, it was because they were either deprecated in recent versions of Systemd or replaced with configurable alternatives. For example, Environment=MALLOC_ARENA_MAX=2 was removed because it’s now configurable.

As I look at the template more, I see why some comments could be considered irrelevant. The configurability of Environment is a nice feature, but I wonder if we should default to including MALLOC_ARENA_MAX=2 if nothing is set. The way it currently is, out of the box most probably wouldn't set it which would have negative implications.

@seuros
Copy link
Owner

seuros commented Jan 24, 2025

We could definitely add MALLOC_ARENA_MAX=2 as a default. I initially removed it because my setup involves dedicated machines where these environment variables are configured globally. However, I agree that for most users, having it as a default would be good to avoid potential performance issues out of the box.
Let’s include it as a default unless explicitly overridden.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants