mprof each child process independently #118

Closed
bbengfort opened this issue Jul 14, 2016 · 13 comments

@bbengfort
Contributor

Moved here as a feature request from the following SO question:

http://stackoverflow.com/questions/38358881/how-to-profile-multiple-subprocesses-using-python-multiprocessing-and-memory-pro

The mprof script allows you to track memory usage of a process over time, and includes a -C flag which will also sum up the memory usage of all child processes (forks) spawned by the primary process.

Instead of summation, I would like the mprof script to include a flag that identifies each process by pid in the generated .dat file, allowing the plot command to visualize each process's memory usage independently over time.

@bbengfort
Contributor Author

I've created a proof of concept in my fork thanks to @fabianp pointing out the _get_memory function. I've taken the part of this function that used psutil to add the children's memory to the main process's total and moved it into its own function, _get_child_memory, which you can see on L76. It is called from _get_memory on L119.
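
As a rough illustration of the idea (not the code in the fork), per-child memory could be collected along these lines. The name child_memory_mib is hypothetical, and the sketch assumes it is handed a psutil.Process-like object that offers children(recursive=True), with each child exposing pid and memory_info().rss.

```python
def child_memory_mib(process):
    """Yield (pid, resident memory in MiB) for each descendant of `process`.

    `process` is assumed to behave like a psutil.Process object.
    This is a hypothetical helper, not memory_profiler's own function.
    """
    for child in process.children(recursive=True):
        try:
            rss_bytes = child.memory_info().rss
        except Exception:
            # With psutil, a child that exits between listing and sampling
            # raises NoSuchProcess (or AccessDenied); skip it in that case.
            continue
        yield child.pid, rss_bytes / float(2 ** 20)
```

With psutil installed, something like `dict(child_memory_mib(psutil.Process(os.getpid())))` would give one entry per child instead of a single summed number.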

The problem is that I don't know where I should adapt the memory_usage function to include children. I was thinking of logging them similarly to how the FUNC objects are logged. I created a proof of concept of how this would work in another executable script, mpmprof. The while loop in this file is an adaptation of the loop on L297 in memory_usage, specifically for a Popen object. It changes how it writes to the mprofile_*.dat files to include child processes as follows:

CMDLINE python examples/multiprocessing_example.py
MEM 0.207031 1468627462.8746
MEM 7.730469 1468627462.9888
CHLD0 17.343750 1468627462.9992
CHLD1 86.722656 1468627462.9992
CHLD2 17.359375 1468627462.9992
CHLD3 84.859375 1468627462.9992
MEM 7.730469 1468627463.0995
CHLD0 17.343750 1468627463.1084
CHLD1 154.644531 1468627463.1084
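
For anyone wanting to experiment with this format, a file like the one above could be read back into one series per label with something like the following sketch. The function name parse_mprofile is hypothetical, and the column meaning (memory in MiB, then a Unix timestamp) is assumed from how memory_profiler normally reports samples.

```python
from collections import defaultdict


def parse_mprofile(lines):
    """Return (cmdline, series), where series maps a label such as
    "MEM" or "CHLD0" to a list of (memory, timestamp) tuples."""
    series = defaultdict(list)
    cmdline = None
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        label = fields[0]
        if label == "CMDLINE":
            cmdline = " ".join(fields[1:])
        elif label == "MEM" or label.startswith("CHLD"):
            series[label].append((float(fields[1]), float(fields[2])))
    return cmdline, dict(series)


SAMPLE = """\
CMDLINE python examples/multiprocessing_example.py
MEM 0.207031 1468627462.8746
MEM 7.730469 1468627462.9888
CHLD0 17.343750 1468627462.9992
CHLD1 86.722656 1468627462.9992
CHLD2 17.359375 1468627462.9992
CHLD3 84.859375 1468627462.9992
MEM 7.730469 1468627463.0995
CHLD0 17.343750 1468627463.1084
CHLD1 154.644531 1468627463.1084
"""

cmdline, series = parse_mprofile(SAMPLE.splitlines())
```

Each entry in series can then be plotted as its own line, which is what separates the children from the parent process.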

I then also created an adapted plot function to visualize this new mprofile file, which produces the following result when profiling examples/multiprocessing_example.py:

[Image: multiprocessing_example (plot of per-process memory usage over time)]

I'm not sure if there are any tests for this, but make test seems to pass with the changes I made. I guess the question now is how we should integrate this work into the primary memory_profiler API, if at all.

@fabianp
Collaborator

fabianp commented Jul 17, 2016

Hi, Thanks for looking into this. Could you make a pull request so I can see the diff and comment on your code? Thanks

@bbengfort
Contributor Author

@fabianp - the only diff is a small function in memory_profiler; otherwise I put the proof of concept in a new file called mpmprof -- are you sure you want me to pull request this?

@fabianp
Collaborator

fabianp commented Jul 18, 2016

Ideally it would be a switch (like --include-children) for mprof that prints the different children instead of adding them. Do you think this is possible?
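
To make the suggestion concrete, here is a minimal sketch of what such a switch might control when a sample is written. The function format_samples and its separate_children parameter are hypothetical and not part of mprof; the record layout follows the MEM/CHLD lines from the proof of concept above, simplified to use one timestamp per sample.

```python
def format_samples(parent_mib, children_mib, timestamp, separate_children=False):
    """Return the .dat lines for one sample.

    separate_children=False: children are added to the parent (one MEM line).
    separate_children=True:  the parent gets a MEM line and each child
                             gets its own CHLD<n> line.
    """
    if not separate_children:
        total = parent_mib + sum(children_mib)
        return ["MEM {0:.6f} {1:.4f}".format(total, timestamp)]

    lines = ["MEM {0:.6f} {1:.4f}".format(parent_mib, timestamp)]
    for index, child_mib in enumerate(children_mib):
        lines.append("CHLD{0} {1:.6f} {2:.4f}".format(index, child_mib, timestamp))
    return lines
```

Keeping the summed behaviour as the default would leave existing .dat files and plots unchanged unless the switch is given.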

@bbengfort
Contributor Author

I can certainly look into doing this, but I wanted to avoid that as a start because I'd have to make changes at so many points in the original code. I've been pretty busy these past few weeks, but hopefully I can get a chance to come back to this next week.

@cachedout

👍 on this. I need this ability quite badly as well and would be willing to help out with development.

@davidgbe

Did this ever get resolved?

@fabianp
Collaborator

fabianp commented Mar 18, 2017

No. I'm happy to include it if someone sends me a pull request but I won't work on it for the moment.

@bbengfort
Contributor Author

Sorry - I just haven't had a chance to get into it. The proof of concept is still there, let me see if I can get a pull request together real quick.

@petroslamb

Apologies for the direct approach, I wanted to point out a possible misreport in the total memory usage when multiprocessing.

My thought was that I could save everyone some time by using this open issue. I can go into more detail here or open a new issue if this does not suffice.

My report is on https://github.com/fabianp/memory_profiler/blob/master/memory_profiler.py#L146

I added a comment on the PR that seems to introduce it: https://github.com/fabianp/memory_profiler/pull/134#pullrequestreview-43443387

Thanks

@bbengfort
Contributor Author

My thought is that this is a new issue and potentially doesn't affect being able to plot things independently; rather, if there is a problem, it needs to be fixed in _get_child_memory(), and this change will ensure that the plotting functions work as expected.

@fabianp, what do you think? Can we close this and start a new issue?

@fabianp
Collaborator

fabianp commented Jun 12, 2017 via email

@bbengfort
Contributor Author

@petroslamb I'm going to go ahead and close this issue; do you want to create the new issue regarding the memory reporting?
