-
-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
firewalld modules reloads firewalld excessively #61
Comments
I don't think this is to do with the number of times it's restarted per-se, but what your test does is make firewall-cmd --reload commands happen at the time that Puppet is trying to send the report to the master. Running the first while loop in the background will probably cause a lot of failures, not just puppet. Reducing the number of reloads per run may reduce the risk, but maybe a better aproach would be a sleep after the reload to give firewalld a chance to finish reloading before puppet tries to move on? Are you seeing problems just on a normal puppet run or do you have to go the extreme (two while loops) in order to simulate the problem? |
This problem occurs during normal puppet runs if the firewall is being modified. The actual puppet errors are not usually an issue for my catalogs because puppet will just run again resources will apply once DNS lookups are working. If you do a sleep 5 between the firewalld reloads the same thing will still happen, it just doesn't happen nearly as often. The strange thing is that sometimes it takes 20 - 30 seconds before DNS lookups start working agin. The issue I am having right now is all of the custom services are created and then the services get added to the zone ~4 - 5 at the same time which will trigger 8 - 10 firewalld reloads within a few seconds just like in my example. Adding the sleep didn't actually seem to help much because the problem can still impact background processes (which is a bigger issue) even though most of the actual puppet errors went away. The biggest issue seen so far is that if tcserver happens to be starting at this same time and some of the applications it is starting requires one or more DNS lookups. This causes the application to not start correctly even though tcsever believes it is running correctly. Currently the only way I see to resolve this 100% of the time would be to add some puppet code to make sure tcserver starts after firewalld and possibly validate DNS lookups are working first. Reducing the number of restarts to the minimum required does help significantly reduce this occurrence. I am going to also open a ticket to one of the Linux vendors to determine if there is an actual issues with firewalld. |
Here is the response I received. I believe the module needs to reduce the number of times that --reload is called to a minimum based on this information.
|
The firewall should not be being reloaded more than once in a run unless you are managing a lot of services - can you post some --debug output showing when and how many times --reload is being called? |
I can work on the debug output. But here is the sample code (leaving out unnecessary stuff) for the example:
My current work around it to run puppet using tags that apply the firewall configuration before doing any other work. |
Custom services, followed by any other kind of resource will cause two reloads - I believe this was by design since if you add a custom service then try and use it later on in the same puppet run, it will fail because the custom service doesn't exist until after a reload (the same with What we can probably do quite easily is move the custom_service::reload exec to the base class, so there is only one of them, and execute it from there. That should improve things - I'll throw together a PR later for you to test... |
After some in depth testing I have found that if firewalld is restarted over and over again in a short period of time weird things start happening. The easiest to see issue is that DNS lookups start failing.
eg.. run 'firewall-cmd --reload' in a loop and also run 'puppet agent -t' in a loop.
while true;do firewall-cmd --reload; done
while true; do puppet agent -t; sleep 1; done
You will get errors like
Error: Could not send report: getaddrinfo: Name or service not known
I think the best solution will be to update all of the code doing reloads like custom_service and firewalld_service to common reloads reduce the number of restarts.
The text was updated successfully, but these errors were encountered: