Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Independent child process monitoring #118 #134

Merged
merged 9 commits into from
Mar 21, 2017
Merged
Show file tree
Hide file tree
Changes from 7 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -5,3 +5,6 @@ MANIFEST
*.egg-info
*.pyc
*~

# Ignore mprof generated files
mprofile_*.dat
51 changes: 43 additions & 8 deletions README.rst
Original file line number Diff line number Diff line change
Expand Up @@ -107,10 +107,10 @@ decorator function. Use as follows::
del b
return a

If a python script with decorator ``@profile`` is called using ``-m
If a python script with decorator ``@profile`` is called using ``-m
memory_profiler`` in the command line, the ``precision`` parameter is ignored.

Time-based memory usage
Time-based memory usage
==========================
Sometimes it is useful to have full memory usage reports as a function of
time (not line-by-line) of external processes (be it Python scripts or not).
Expand All @@ -131,14 +131,14 @@ e.g. `mprof run -h`.
In the case of a Python script, using the previous command does not
give you any information on which function is executed at a given
time. Depending on the case, it can be difficult to identify the part
of the code that is causing the highest memory usage.
of the code that is causing the highest memory usage.

Adding the `profile` decorator to a function and running the Python
script with
script with

mprof run <script>

will record timestamps when entering/leaving the profiled function. Runnning
will record timestamps when entering/leaving the profiled function. Running

mprof plot

Expand All @@ -152,16 +152,49 @@ A discussion of these capabilities can be found `here <http://fa.bianp.net/blog/

.. warning:: If your Python file imports the memory profiler `from memory_profiler import profile` these timestamps will not be recorded. Comment out the import, leave your functions decorated, and re-run.

The available commands for `mprof` are:
The available commands for `mprof` are:

- ``mprof run``: running an executable, recording memory usage
- ``mprof run``: running an executable, recording memory usage
- ``mprof plot``: plotting one the recorded memory usage (by default,
the last one)
- ``mprof list``: listing all recorded memory usage files in a
user-friendly way.
- ``mprof clean``: removing all recorded memory usage files.
- ``mprof rm``: removing specific recorded memory usage files

Tracking forked child processes
===============================
In a multiprocessing context the main process will spawn child processes whose
system resources are allocated separately from the parent process. This can
lead to an inaccurate report of memory usage since by default only the parent
process is being tracked. The ``mprof`` utility provides two mechanisms to
track the usage of child processes: sum the memory of all children to the
parent's usage and track each child individual.

To create a report that combines memory usage of all the children and the
parent, use the ``include_children`` flag in either the ``profile`` decorator or
ass a command line argument to ``mprof``::

mprof run --include-children <script>

The second method tracks each child independently of the main process,
serializing child rows by index to the output stream. Use the ``multiprocess``
flag and plot as follows::

mprof run --multiprocess <script>
mprof plot

This will create a plot using matplotlib similar to this:

.. image:: https://cloud.githubusercontent.com/assets/745966/24075879/2e85b43a-0bfa-11e7-8dfe-654320dbd2ce.png
:target: https://github.com/fabianp/memory_profiler/pull/134
:height: 350px

You can combine both the ``include_children`` and ``multiprocess`` flags to show
the total memory of the program as well as each child individually.

.. warning:: Currently the child tracking only works if a ``stream`` is provided to the ``profile`` (e.g. from the command line or in the decorator). If you are using the API to retrieve values then the flag will not do anything.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How hard do you think it would be to remove this limitation?, i.e., to track children even if the function is not @Profile'd

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused, as the example works fine even without decorated functions

Copy link
Contributor Author

@bbengfort bbengfort Mar 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant that the stream argument has to be passed in, if stream = None then I can't return multiple values. I could if you wanted me to return a list instead of a single float. I didn't necessarily mean it wouldn't work with the decorator.

Perhaps I could rephrase as follows:

"Currently tracking individual children requires a reporting output stream; if you'd like direct access to the memory usage of individual children, see the _get_child_memory function."

Though of course, that then points the user to an internal function. I'm open to suggestions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've put placeholder comments in the code (line 363 and 403) where this issue occurs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thanks for the explanation. I think the API would be simpler if we allow memory_usage to return a nested list instead embedding this into the stream. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've updated memory_usage to return a nested list and updated the README accordingly.

Setting debugger breakpoints
=============================
It is possible to set breakpoints depending on the amount of memory used.
Expand Down Expand Up @@ -260,7 +293,7 @@ LogFile of memory profiler module.
>>> import sys
>>> sys.stdout = LogFile('memory_profile_log')

``Customised reporting:``
``Customized reporting:``

Sending everything to the log file while running the memory_profiler
could be cumbersome and one can choose only entries with increments
Expand Down Expand Up @@ -412,6 +445,8 @@ cleanup.

`Dmitriy Novozhilov <https://github.com/demiurg906>`_ and `Sergei Lebedev <https://github.com/superbobry>`_ added support for `tracemalloc <https://docs.python.org/3/library/tracemalloc.html>`_.

`Benjamin Bengfort <https://github.com/bbengfort>`_ added support for tracking the usage of individual child processes and plotting them.

=========
License
=========
Expand Down
55 changes: 55 additions & 0 deletions examples/multiprocessing_example.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
"""
An undecorated example of a script that allocates memory in multiprocessing
workers to demonstrate the use of memory_profiler with multiple processes.

Run this script with mprof run -M python multiprocessing_example.py
You can then visualize the usage with mprof plot.
"""

import time
import multiprocessing as mp

# Big numbers
X6 = 10 ** 6
X7 = 10 ** 7


def worker(num, wait, amt=X6):
"""
A function that allocates memory over time.
"""
frame = []

for idx in range(num):
frame.extend([1] * amt)
time.sleep(wait)

del frame


def main_sequential():
"""
A sequential version of the work, where one worker is called at a time.
"""
worker(5, 5, X6)
worker(5, 2, X7)
worker(5, 5, X6)
worker(5, 2, X7)


def main_multiproc():
"""
A multiprocessing version of the work, where workers work in their own
child processes and are collected by the master process.
"""
pool = mp.Pool(processes=4)
tasks = [
pool.apply_async(worker, args) for args in
[(5, 5, X6), (5, 2, X7), (5, 5, X6), (5, 2, X7)]
]

results = [p.get() for p in tasks]


if __name__ == '__main__':
main_multiproc()
73 changes: 62 additions & 11 deletions memory_profiler.py
Original file line number Diff line number Diff line change
Expand Up @@ -86,6 +86,38 @@ def _repr_pretty_(self, p, cycle):
p.text(u'<MemitResult : ' + msg + u'>')


def _get_child_memory(process, meminfo_attr=None):
"""
Returns a generator that yields memory for all child processes.
"""
if not has_psutil:
raise NotImplementedError((
"The psutil module is required to monitor the "
"memory usage of child processes."
))

# Convert a pid to a process
if isinstance(process, int):
if process == -1: process = os.getpid()
process = psutil.Process(process)

if not meminfo_attr:
# Use the psutil 2.0 attr if the older version isn't passed in.
meminfo_attr = 'memory_info' if hasattr(process, 'memory_info') else 'get_memory_info'

# Select the psutil function get the children similar to how we selected
# the memory_info attr (a change from excepting the AttributeError).
children_attr = 'children' if hasattr(process, 'children') else 'get_children'

# Loop over the child processes and yield their memory
try:
for child in getattr(process, children_attr)(recursive=True):
yield getattr(child, meminfo_attr)()[0] / _TWO_20
except psutil.NoSuchProcess:
# https://github.com/fabianp/memory_profiler/issues/71
yield 0.0


def _get_memory(pid, backend, timestamps=False, include_children=False, filename=None):
# .. low function to get memory consumption ..
if pid == -1:
Expand All @@ -111,12 +143,7 @@ def ps_util_tool():
else 'get_memory_info'
mem = getattr(process, meminfo_attr)()[0] / _TWO_20
if include_children:
try:
for p in process.children(recursive=True):
mem += getattr(p, meminfo_attr)()[0] / _TWO_20
except psutil.NoSuchProcess:
# https://github.com/fabianp/memory_profiler/issues/71
pass
mem += sum(_get_child_memory(process, meminfo_attr))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, quick question, as processes in modern linux share common memory after forking from their parent and only copy it on writes, wouldn't this sum report the wrong number?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately I'm not sure of the behavior of psutil with regards to linux forks; I simply used the same meminfo_attr as the summation function. If that's true then this is indeed a problem, but I'd guess that we'd have to dig into psutil to figure it out.

if timestamps:
return mem, time.time()
else:
Expand All @@ -128,9 +155,11 @@ def ps_util_tool():
def posix_tool():
# .. scary stuff ..
if include_children:
raise NotImplementedError('The psutil module is required when to'
' monitor memory usage of children'
' processes')
raise NotImplementedError((
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about this, I think this was just a carry over from the historical merge; I can reset back to the original if needed.

"The psutil module is required to monitor the "
"memory usage of child processes."
))

warnings.warn("psutil module not found. memory_profiler will be slow")
# ..
# .. memory usage in MiB ..
Expand Down Expand Up @@ -211,8 +240,8 @@ def run(self):


def memory_usage(proc=-1, interval=.1, timeout=None, timestamps=False,
include_children=False, max_usage=False, retval=False,
stream=None, backend=None):
include_children=False, multiprocess=False, max_usage=False,
retval=False, stream=None, backend=None):
"""
Return the memory usage of a process or piece of code

Expand Down Expand Up @@ -243,6 +272,12 @@ def memory_usage(proc=-1, interval=.1, timeout=None, timestamps=False,
timestamps : bool, optional
if True, timestamps of memory usage measurement are collected as well.

include_children : bool, optional
if True, sum the memory of all forked processes as well

multiprocess : bool, optional
if True, track the memory usage of all forked processes.

stream : File
if stream is a File opened with write access, then results are written
to this file instead of stored in memory and returned at the end of
Expand Down Expand Up @@ -314,10 +349,18 @@ def memory_usage(proc=-1, interval=.1, timeout=None, timestamps=False,
mem_usage = _get_memory(
proc.pid, backend, timestamps=timestamps,
include_children=include_children)

if stream is not None:
stream.write("MEM {0:.6f} {1:.4f}\n".format(*mem_usage))

# Only write children to the stream file, warn if appending to the return.
if multiprocess:
for idx, chldmem in enumerate(_get_child_memory(proc.pid)):
stream.write("CHLD {0} {1:.6f} {2:.4f}\n".format(idx, chldmem, time.time()))
else:
ret.append(mem_usage)
if multiprocess:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is the first place where I don't return multiple values and instead warn

warnings.warn("use include_children not multiprocess without a stream")
else:
ret = max(ret,
_get_memory(
Expand Down Expand Up @@ -348,8 +391,16 @@ def memory_usage(proc=-1, interval=.1, timeout=None, timestamps=False,
include_children=include_children)
if stream is not None:
stream.write("MEM {0:.6f} {1:.4f}\n".format(*mem_usage))

# Only write children to the stream file, warn if appending to the return.
if multiprocess:
for idx, chldmem in enumerate(_get_child_memory(proc.pid)):
stream.write("CHLD {0} {1:.6f} {2:.4f}\n".format(idx, chldmem, time.time()))
else:
ret.append(mem_usage)

if multiprocess:
warnings.warn("use include_children not multiprocess without a stream")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is the second place where I don't return multiple values and instead warn

else:
ret = max([ret,
_get_memory(proc, backend, include_children=include_children)
Expand Down
42 changes: 39 additions & 3 deletions mprof
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ import copy
import time
import math

from collections import defaultdict
from optparse import OptionParser, OptionValueError

import memory_profiler as mp
Expand Down Expand Up @@ -195,6 +196,9 @@ def run_action():
parser.add_option("--include-children", "-C", dest="include_children",
default=False, action="store_true",
help="""Monitors forked processes as well (sum up all process memory)""")
parser.add_option("--multiprocess", "-M", dest="multiprocess",
default=False, action="store_true",
help="""Monitors forked processes creating individual plots for each child""")

(options, args) = parser.parse_args()

Expand Down Expand Up @@ -231,7 +235,8 @@ def run_action():
with open(mprofile_output, "a") as f:
f.write("CMDLINE {0}\n".format(cmd_line))
mp.memory_usage(proc=p, interval=options.interval, timestamps=True,
include_children=options.include_children, stream=f)
include_children=options.include_children,
multiprocess=options.multiprocess, stream=f)


def add_brackets(xloc, yloc, xshift=0, color="r", label=None, options=None):
Expand Down Expand Up @@ -299,6 +304,7 @@ def read_mprofile_file(filename):
func_ts = {}
mem_usage = []
timestamp = []
children = defaultdict(list)
cmd_line = None
f = open(filename, "r")
for l in f:
Expand All @@ -319,6 +325,13 @@ def read_mprofile_file(filename):
float(mem_start), float(mem_end)])
func_ts[f_name] = ts

elif field == "CHLD":
values = value.split(' ')
chldnum = values[0]
children[chldnum].append(
(float(values[1]), float(values[2]))
)

elif field == "CMDLINE":
cmd_line = value
else:
Expand All @@ -327,10 +340,10 @@ def read_mprofile_file(filename):

return {"mem_usage": mem_usage, "timestamp": timestamp,
"func_timestamp": func_ts, 'filename': filename,
'cmd_line': cmd_line}
'cmd_line': cmd_line, 'children': children}


def plot_file(filename, index=0, timestamps=True, options=None):
def plot_file(filename, index=0, timestamps=True, children=True, options=None):
try:
import pylab as pl
except ImportError:
Expand All @@ -351,6 +364,7 @@ def plot_file(filename, index=0, timestamps=True, options=None):
ts = mprofile['func_timestamp']
t = mprofile['timestamp']
mem = mprofile['mem_usage']
chld = mprofile['children']

if len(ts) > 0:
for values in ts.values():
Expand Down Expand Up @@ -384,6 +398,28 @@ def plot_file(filename, index=0, timestamps=True, options=None):
bottom += 0.001
top -= 0.001

# plot children, if any
if len(chld) > 0 and children:
cmpoint = (0,0) # maximal child memory

for idx, (proc, data) in enumerate(chld.items()):
# Create the numpy arrays from the series data
cts = np.asarray([item[1] for item in data]) - global_start
cmem = np.asarray([item[0] for item in data])

# Plot the line to the figure
pl.plot(cts, cmem, "+-" + mem_line_colors[idx+1 % len(mem_line_colors)],
label="child {}".format(proc))

# Detect the maximal child memory point
cmax_mem = cmem.max()
if cmax_mem > cmpoint[1]:
cmpoint = (cts[cmem.argmax()], cmax_mem)

# Add the marker lines for the maximal child memory usage
pl.vlines(cmpoint[0], pl.ylim()[0]+0.001, pl.ylim()[1] - 0.001, 'r', '--')
pl.hlines(cmpoint[1], pl.xlim()[0]+0.001, pl.xlim()[1] - 0.001, 'r', '--')

# plot timestamps, if any
if len(ts) > 0 and timestamps:
func_num = 0
Expand Down