Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Hepler Function uitils.attributes from puppet_x context to puppet #702

Closed
zilchms opened this issue Jun 2, 2022 · 32 comments
Closed
Assignees
Labels
Milestone

Comments

@zilchms
Copy link
Contributor

zilchms commented Jun 2, 2022

Release 3.3.0 introduces error:
Evaluation Error: Error while evaluating a Function Call, uninitialized constant Puppet::Pops::Loader::RubyFunctionInstantiator::PuppetX (file: /etc/puppetlabs/code/environments/production/modules/icinga2/manifests/config.pp, line: 19, column: 28)

Expected Behavior

Refactoring classes should not introduce errors. Maybe the testsuite didnt pickup on this? I am currently not sure where exactly this breaks, but my first guess would be manifest/config.pp line:19.

If icinga2::attributes($constants) would be used congruent to previous releases it should be attributes($constants)
(from previously: icinga2_attributes($constants)).

I think at least, didnt get to do any testing. But maybe I am overlooking something in my setup, thats causing this. Can someone confirm this behaviour?

Steps to Reproduce (for bugs)

Update icinga-icinga2 version from 3.2.2 to 3.3.0

Your Environment

  • Module version (puppet module list): icinga-icinga2 3.3.0, icinga-icinga 2.7.1
  • Puppet version (puppet -V): 6.27.0
  • Operating System and version: SLES15
@sircubbi
Copy link
Contributor

sircubbi commented Jun 8, 2022

FYI: saw the same issue, played a little bit with it and rolled back to icinga-icinga2 3.2.2 for the moment.

@lbetz
Copy link
Contributor

lbetz commented Jun 8, 2022

Shit, that wasn't planed. Sorry for that. Do you use environment caching? Did you try to clear the cache?

@sircubbi
Copy link
Contributor

sircubbi commented Jun 8, 2022

At least we have a lot of different environments and it is no easy task to update all environments at once. While fiddling with it our puppetmaster stopped working completly and since I hadn't any time to do some proper diagnostics I just rolled back for the moment. Maybe I can try again on the weekend, but if all environments need to have that change, it is quite of a hassle unfortunately.

@lbetz
Copy link
Contributor

lbetz commented Jun 8, 2022

I start to test it, now.

@lbetz
Copy link
Contributor

lbetz commented Jun 8, 2022

It does not seem to work with the switch to puppet function api 4.x without restarting the puppet server.

see branch fix/702

@zilchms
Copy link
Contributor Author

zilchms commented Jun 8, 2022

At first I did not restart the server service or cleared the cache, which resulted in the above mentioned error.
I did restart the puppetserver after updating the module to 3.3.0 now, cleared the cache and got a different error this time:

Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Internal Server Error: org.jruby.exceptions.LoadError: (LoadError) no such file to load -- puppet_x/icinga2

I am not quite understanding what you are referring to in fix/702 (sry)

@lbetz
Copy link
Contributor

lbetz commented Jun 8, 2022

correct. This is definitely a bug, a missing require. The main problem is the remove of the function attributes from module Puppet::Icinga2 to PuppetX::Icinga.

I am working on this branch to solve the problem.

@lbetz
Copy link
Contributor

lbetz commented Jun 8, 2022

Shoud be solved.

@lbetz lbetz added the bug label Jun 8, 2022
@lbetz lbetz self-assigned this Jun 8, 2022
@lbetz lbetz added this to the v3.3.1 milestone Jun 8, 2022
@lbetz lbetz changed the title Refactor resulted in errors Refactor Hepler Function uitils.attributes from puppet_x context to puppet Jun 8, 2022
lbetz added a commit that referenced this issue Jun 8, 2022
@lbetz
Copy link
Contributor

lbetz commented Jun 8, 2022

If a update to 3.3.0 was done, this patch need a puppetserver restart. Otherwise updating from 3.2.2, no restart is required.

lbetz added a commit that referenced this issue Jun 8, 2022
@lbetz lbetz closed this as completed Jun 8, 2022
@zilchms
Copy link
Contributor Author

zilchms commented Jun 8, 2022

Thanks for the quick fix :)

@lbetz
Copy link
Contributor

lbetz commented Jun 9, 2022

You're welcome. Thx for the report.

@sircubbi
Copy link
Contributor

sircubbi commented Jun 9, 2022

Hmmm, unfortunately I am not very happy with that fix. While there isn't any more errors in the puppet run, it does have an huge impact on the performance here.
I updated from 3.2.2 to 3.3.1, cleared the environment cache, restarted the puppetmaster (actually did that several times, just to be sure). However running puppet on our icinga-master now takes ages to finish. Preparing the catalog on the puppetmaster takes around 8 minutes (while the puppetmaster consumes 100% CPU for all that time). Rolling back to the old version the prepare phase takes less than 40 seconds.

@zilchms
Copy link
Contributor Author

zilchms commented Jun 9, 2022

Is the runtime always 8 minutes or only when the catalog is not cached? Since for my setup, the puppetruns remain very fast (1,5 seconds) when loading the cached catalog.

@sircubbi
Copy link
Contributor

sircubbi commented Jun 9, 2022

Is the runtime always 8 minutes or only when the catalog is not cached? Since for my setup, the puppetruns remain very fast (1,5 seconds) when loading the cached catalog.

The runtime (and the high CPU-usage) stays that high with subsequent runs. Sure, our Icingamaster has a lot of resources that needs to be compiled, but it isn't really an issue with the old 3.2.2-version.

@lbetz
Copy link
Contributor

lbetz commented Jun 9, 2022

@sircubbi An update from 3.2.2 to 3.3.1 doesn't require a restart or a cache clearing.

@sircubbi
Copy link
Contributor

@sircubbi An update from 3.2.2 to 3.3.1 doesn't require a restart or a cache clearing.

Well, that wasn't clear from above comment, but ok. I did some fresh tests:

  1. Running the agent on our icinga2-master when 3.2.2 is installed:
Notice: Applied catalog in 54.50 seconds

real	1m43.075s
user	0m50.906s
sys	0m6.283s
  1. just replacing the icinga2-module on the puppetmaster (without any cache clearing or restart):
Notice: Applied catalog in 72.51 seconds

real	8m38.217s
user	0m51.753s
sys	0m6.675s
  1. making sure that it is no fluke I rerun the agent two more times:
2nd run 3.3.1:
Notice: Applied catalog in 105.26 seconds

real	9m3.684s
user	0m51.630s
sys	0m6.660s


3rd run 3.3.1:
Notice: Applied catalog in 74.73 seconds

real	8m33.051s
user	0m51.938s
sys	0m6.767s
  1. replacing the icinga2-module on the puppetmaster back to 3.2.2 (again without cache clearing or restart):
Notice: Applied catalog in 54.55 seconds

real	1m45.288s
user	0m51.316s
sys	0m6.548s

So something seems to affect the performance pretty badly.

@lbetz
Copy link
Contributor

lbetz commented Jun 10, 2022

Only something about functions was changed in this fix. Functions are evaluated on the Puppetserver, i.e. the catalog compile times would be affected here.

@sircubbi
Copy link
Contributor

Only something about functions was changed in this fix. Functions are evaluated on the Puppetserver, i.e. the catalog compile times would be affected here.

Yep, it absolutely is. As written: the CPU-usage goes up to 100% on the puppetmaster for several minutes (while the agent on the icinga2-server waits for the catalog). With 3.2.2 it takes the puppetmaster like 30-45 seconds to compile the catalog, while it takes around 8 minutes with 3.3.1).

@lbetz lbetz reopened this Jun 11, 2022
@lbetz
Copy link
Contributor

lbetz commented Jun 13, 2022

@sircubbi I have unsuccessfully tried to recreate this behavior. Do you use many extra icinga puppet objects?

@sircubbi
Copy link
Contributor

@sircubbi I have unsuccessfully tried to recreate this behavior. Do you use many extra icinga puppet objects?

Well, we actually have quite a few objects that are collected from all our hosts:

# puppet query 'resources[type] { type ~ "Icinga2" }' --urls http://127.0.0.1:8080 |grep type|sort|uniq -c|sort -n
      3     "type": "Icinga2::Object::Apiuser"
      4     "type": "Icinga2::Object::Checkcommand"
      4     "type": "Icinga2::Object::Scheduleddowntime"
      8     "type": "Icinga2::Object::Notification"
      8     "type": "Icinga2::Object::Usergroup"
     16     "type": "Icinga2::Object::User"
    100     "type": "Icinga2::Object::Host"
    292     "type": "Icinga2::Object::Endpoint"
    297     "type": "Icinga2::Feature"
    490     "type": "Icinga2::Object::Zone"
    961     "type": "Icinga2::Object::Service"
   1799     "type": "Icinga2::Object"
#

@lbetz
Copy link
Contributor

lbetz commented Jun 13, 2022

@sircubbi Do have any performance impacts on the side of the agents? They use the same code / function but without extra icinga puppet objects.

@sircubbi
Copy link
Contributor

@sircubbi Do have any performance impacts on the side of the agents? They use the same code / function but without extra icinga puppet objects.

At least not noticeable, no. All other agents seem to work fine, only the icingamaster itselfs takes really long for the agentrun while waiting on the catalog from the puppetmaster.

@lbetz
Copy link
Contributor

lbetz commented Jun 13, 2022

@sircubbi Strange. The objects above are exported resources, aren't they?

And from personal interest, why 490 zones and only 292 endpoints?

@sircubbi
Copy link
Contributor

@sircubbi Strange. The objects above are exported resources, aren't they?

And from personal interest, why 490 zones and only 292 endpoints?

Ah, I just realized that some resources are counted double in that query. I guess when looking at our icingamaster only the resources created by that node and all imported resources are important). So I guess a more cleaner picture would be this:

# puppet query 'resources[type] { type ~ "Icinga2" and (certname ~ "icinga" or exported = true) }' --urls http://127.0.0.1:8080|grep "type"|sort|uniq -c|sort -n
      3     "type": "Icinga2::Object::Apiuser"
      4     "type": "Icinga2::Object::Checkcommand"
      4     "type": "Icinga2::Object::Scheduleddowntime"
      6     "type": "Icinga2::Feature"
      8     "type": "Icinga2::Object::Notification"
      8     "type": "Icinga2::Object::Usergroup"
     16     "type": "Icinga2::Object::User"
     98     "type": "Icinga2::Object::Endpoint"
    100     "type": "Icinga2::Object::Host"
    102     "type": "Icinga2::Object::Zone"
    926     "type": "Icinga2::Object"
    961     "type": "Icinga2::Object::Service"
#

So I guess an agent-run on the icinga-master has to consider slightly more than 2000 different resources.

We have around 100 hosts that are collected. Most of the services are realized via apply-rules. However there are still some services which are defined per host and exported to the icingamaster.

@lbetz
Copy link
Contributor

lbetz commented Jun 13, 2022

Why this discrepancy between 3.2.2 and 3.3.1 occurs I do not know "yet".

But exported resources are very expensive and should be limited to hosts, zones and endpoints in the Icinga environment.
Services are to be assigned directly on the Icinga server itself using apply (no exported resources).

@sircubbi
Copy link
Contributor

But exported resources are very expensive and should be limited to hosts, zones and endpoints in the Icinga environment. Services are to be assigned directly on the Icinga server itself using apply (no exported resources).

Well, that is known and we mostly use apply for all of our services. There are some corner cases where it was easier to have a host just export a small set of "special"-services. The biggest part of the service-objects are actually no exported resources, but some virtual services that the icinga-server creates from our netbox-generated hierafile. So far it wasn't a big issue, as the puppetmaster took under a minute to compile the catalog and the icingamaster also well under a minute to apply it.

@lbetz
Copy link
Contributor

lbetz commented Jun 13, 2022

@sircubbi Now, I spent several hours and I think, I get something.

The new function calls in

https://github.com/Icinga/puppet-icinga2/blob/88e057065790b9b7621f48d0d902b628b00f580d/templates/object.conf.erb#L26-L32

increases my puppet run from 1:21 up to 1:45.

600 hosts (exported resources) and 2500 services (apply directly on icinga server)

@lbetz
Copy link
Contributor

lbetz commented Jun 13, 2022

Too bad, it won't be any faster with an EPP instead of ERB template.

@sircubbi
Copy link
Contributor

When looking at the historie that template always contained function-calls, why is it so slow with the new version?

@lbetz
Copy link
Contributor

lbetz commented Jun 27, 2022

@sircubbi Can you try branch fix/704-tuning?

Should be faster, not so fast like v3.2.2, but faster.

@lbetz lbetz closed this as completed Jun 27, 2022
@sircubbi
Copy link
Contributor

@sircubbi Can you try branch fix/704-tuning?

Should be faster, not so fast like v3.2.2, but faster.

I can confirm that fix/704-tuning fixes the performance issue for me. Actually I don't see much difference in the runtime compared to v3.2.2, so much better than with v3.3.1. Thanks.

@lbetz
Copy link
Contributor

lbetz commented Jun 27, 2022

@sircubbi thx. Release is coming...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants