python.d.plugin: use separate process for initial module checking #5552

Merged
merged 17 commits into from
Mar 7, 2019

ilyam8
Member

@ilyam8 ilyam8 commented Mar 5, 2019

Summary

This PR makes (major) changes to the python.d.plugin file only.

Fixes: #5525

python.d.plugin imports a lot of additional packages during initial module initialization, job creation and checking, and there is no way to unimport them, even if they aren't needed. This consumes a relatively large amount of RAM.


Memory utilization before/after the PR (one job, example module, py3.7.2):

21.1 => 8.8 MiB

screenshot_20190305_111837

Component Name

collectors/python.d.plugin

Additional Information

This PR adds a separate process for initial module checking.

Logic:

  • main process spawns checker process
  • checker process loads every module, loads the module config, creates jobs and runs job.check() for every job; if the check succeeds, it adds the job to the list.
  • checker process returns list of modules and jobs.
  • main process loads only active modules, etc.
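
The logic above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: it assumes Python 3, uses `subprocess` with a child interpreter as the "checker process", and passes results back as JSON on stdout. The names `CHECKER_SOURCE`, `run_checker`, and the `needs` field are hypothetical, and the `hasattr` test is only a stand-in for a real `job.check()`.

```python
import json
import subprocess
import sys

# Source of the short-lived checker process (hypothetical). Everything it
# imports is released when that process exits.
CHECKER_SOURCE = r'''
import importlib
import json
import sys

wanted = json.loads(sys.argv[1])
active = {}
for module_name, jobs in wanted.items():
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        continue  # module cannot be loaded, so none of its jobs are active
    # Stand-in for job.check(): a job passes if the module has the
    # attribute named in its "needs" field.
    passed = [job["name"] for job in jobs if hasattr(module, job["needs"])]
    if passed:
        active[module_name] = passed
json.dump(active, sys.stdout)
'''


def run_checker(wanted):
    """Run all checks in a child interpreter; return {module: [job names]}."""
    output = subprocess.check_output(
        [sys.executable, "-c", CHECKER_SOURCE, json.dumps(wanted)]
    )
    return json.loads(output.decode("utf-8"))


if __name__ == "__main__":
    wanted = {
        "colorsys": [
            {"name": "local", "needs": "rgb_to_hsv"},
            {"name": "remote", "needs": "no_such_function"},
        ],
        "no_such_module_xyz": [{"name": "local", "needs": "anything"}],
    }
    # Only modules and jobs that passed come back; the main process would
    # then import just those modules.
    print(run_checker(wanted))  # {'colorsys': ['local']}
```

The main process never imports the modules that were only needed for checking, so their memory cost ends with the child process.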

@netdatabot netdatabot added the area/collectors Everything related to data collection label Mar 5, 2019
@netdata netdata deleted a comment Mar 5, 2019
@netdata netdata deleted a comment Mar 5, 2019
@netdata netdata deleted a comment Mar 5, 2019
@ilyam8
Member Author

ilyam8 commented Mar 5, 2019

OK, I tested it on Manjaro (py 3.7.2) and on CentOS 6 (py 2.6.6), and it appears to be working correctly.

@ilyam8 ilyam8 changed the title [wip] python.d.plugin: use separate process for initial module checking python.d.plugin: use separate process for initial module checking Mar 5, 2019
@ilyam8
Member Author

ilyam8 commented Mar 5, 2019

Hey @Ferroin, if you could find the time to test the PR a bit, that would be very nice.

@netdata netdata deleted a comment Mar 6, 2019
@netdata netdata deleted a comment Mar 6, 2019
@ilyam8
Member Author

ilyam8 commented Mar 6, 2019

OK, I don't think we need to run a job if job.check() returns False in main. That can be error-prone. Instead, we can treat all jobs in main as autodetection jobs, so if a job's check() for some reason succeeds in the child process and fails in main, we will retry the check again and again.
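
A rough sketch of that retry idea, under assumptions: `check_with_retry` and its parameters are hypothetical names for illustration, not the plugin's actual API, and a real job would wait between attempts rather than loop immediately.

```python
import time


def check_with_retry(check, retry_every=0.0, max_attempts=None, sleep=time.sleep):
    """Call check() until it returns True.

    Returns the number of attempts used, or None if max_attempts ran out.
    With max_attempts=None the check is retried indefinitely.
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        if check():
            return attempts
        sleep(retry_every)  # wait before the next attempt
    return None
```

The job only starts collecting once `check()` has succeeded in the main process, so a job that passed in the child but fails in main is never run in a broken state.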

@netdata netdata deleted a comment Mar 6, 2019
@ilyam8
Member Author

ilyam8 commented Mar 6, 2019

@ktsaou if you are ok with the changes this is ready.

@netdata netdata deleted a comment Mar 6, 2019
@ilyam8
Member Author

ilyam8 commented Mar 6, 2019

@cakrit please approve

@ilyam8 ilyam8 merged commit 2175673 into netdata:master Mar 7, 2019
@ilyam8
Member Author

ilyam8 commented Mar 7, 2019

OK, here we go, I am on high alert ⏰

@ilyam8 ilyam8 deleted the pythond_plugin_refactor branch March 12, 2019 13:48
jackyhuang85 pushed a commit to jackyhuang85/netdata that referenced this pull request Jan 1, 2020
…tdata#5552)
