Memory Profiling #15866

Merged · 21 commits · Jun 28, 2024
Changes from all commits
1 change: 1 addition & 0 deletions docs/cudf/source/user_guide/api_docs/index.rst
@@ -26,3 +26,4 @@ This page provides a list of all publicly accessible modules, methods and classes
options
extension_dtypes
pylibcudf/index.rst
performance_tracking
12 changes: 12 additions & 0 deletions docs/cudf/source/user_guide/api_docs/performance_tracking.rst
@@ -0,0 +1,12 @@
.. _api.performance_tracking:

====================
Performance Tracking
====================

.. currentmodule:: cudf.utils.performance_tracking
.. autosummary::
:toctree: api/

get_memory_records
print_memory_report
1 change: 1 addition & 0 deletions docs/cudf/source/user_guide/index.md
@@ -16,5 +16,6 @@ options
performance-comparisons/index
PandasCompat
copy-on-write
memory-profiling
pandas-2.0-breaking-changes
```
44 changes: 44 additions & 0 deletions docs/cudf/source/user_guide/memory-profiling.md
@@ -0,0 +1,44 @@
(memory-profiling-user-doc)=

# Memory Profiling

Peak memory usage is a common concern in GPU programming because GPU memory is typically smaller than available CPU memory. To help identify memory hotspots, cuDF provides a memory profiler. Profiling adds overhead, so avoid enabling it in performance-sensitive code.

## Enabling Memory Profiling

First, enable memory profiling in RMM by calling {py:func}`rmm.statistics.enable_statistics()`. This adds a statistics resource adaptor to the current RMM memory resource, which enables cuDF to access memory profiling information. See the [RMM documentation](https://docs.rapids.ai/api/rmm/stable/guide/#memory-statistics-and-profiling) for more details.
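
For example, a minimal sketch of this first step (the adaptor wiring is handled entirely by RMM; shown here only for illustration):

```python
>>> import rmm.statistics
>>> rmm.statistics.enable_statistics()  # adds a statistics resource adaptor to the current memory resource
```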

Second, enable memory profiling in cuDF by setting the `memory_profiling` option to `True`. Use {py:func}`cudf.set_option` at runtime, or set the environment variable ``CUDF_MEMORY_PROFILING=1`` before launching the Python interpreter.

To get the result of the profiling, use {py:func}`cudf.utils.performance_tracking.print_memory_report`, or access the raw profiling data with {py:func}`cudf.utils.performance_tracking.get_memory_records` (see the sketch after the example below).

### Example
In the following, we enable profiling, do some work, and then print the profiling results:

```python
>>> import cudf
>>> from cudf.utils.performance_tracking import print_memory_report
>>> from rmm.statistics import enable_statistics
>>> enable_statistics()
>>> cudf.set_option("memory_profiling", True)
>>> cudf.DataFrame({"a": [1, 2, 3]}) # Some work
a
0 1
1 2
2 3
>>> print_memory_report() # Pretty print the result of the profiling
Memory Profiling
================

Legends:
ncalls - number of times the function or code block was called
memory_peak - peak memory allocated in function or code block (in bytes)
memory_total - total memory allocated in function or code block (in bytes)

Ordered by: memory_peak

ncalls memory_peak memory_total filename:lineno(function)
1 32 32 cudf/core/dataframe.py:690(DataFrame.__init__)
2 0 0 cudf/core/index.py:214(RangeIndex.__init__)
6 0 0 cudf/core/index.py:424(RangeIndex.__len__)
```
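
The raw records behind this report can also be inspected directly. A rough sketch, assuming {py:func}`cudf.utils.performance_tracking.get_memory_records` returns a mapping from tracked names to per-function RMM statistics:

```python
>>> from cudf.utils.performance_tracking import get_memory_records
>>> records = get_memory_records()  # assumed: mapping of tracked name -> statistics object
>>> for name, stats in records.items():  # the same data that print_memory_report() formats
...     print(name, stats)
```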
4 changes: 2 additions & 2 deletions python/cudf/cudf/core/buffer/spill_manager.py
@@ -18,14 +18,14 @@
import rmm.mr

from cudf.options import get_option
-from cudf.utils.nvtx_annotation import _cudf_nvtx_annotate
+from cudf.utils.performance_tracking import _performance_tracking
from cudf.utils.string import format_bytes

if TYPE_CHECKING:
from cudf.core.buffer.spillable_buffer import SpillableBufferOwner

_spill_cudf_nvtx_annotate = partial(
-    _cudf_nvtx_annotate, domain="cudf_python-spill"
+    _performance_tracking, domain="cudf_python-spill"
)


7 changes: 4 additions & 3 deletions python/cudf/cudf/core/buffer/spillable_buffer.py
@@ -10,6 +10,7 @@
from typing import TYPE_CHECKING, Any, Literal

import numpy
+import nvtx
from typing_extensions import Self

import rmm
@@ -21,7 +22,7 @@
host_memory_allocation,
)
from cudf.core.buffer.exposure_tracked_buffer import ExposureTrackedBuffer
-from cudf.utils.nvtx_annotation import _get_color_for_nvtx, annotate
+from cudf.utils.performance_tracking import _get_color_for_nvtx
from cudf.utils.string import format_bytes

if TYPE_CHECKING:
@@ -200,7 +201,7 @@ def spill(self, target: str = "cpu") -> None:
)

if (ptr_type, target) == ("gpu", "cpu"):
-with annotate(
+with nvtx.annotate(
message="SpillDtoH",
color=_get_color_for_nvtx("SpillDtoH"),
domain="cudf_python-spill",
@@ -218,7 +219,7 @@
# trigger a new call to this buffer's `spill()`.
# Therefore, it is important that spilling-on-demand doesn't
# try to unspill an already locked buffer!
-with annotate(
+with nvtx.annotate(
message="SpillHtoD",
color=_get_color_for_nvtx("SpillHtoD"),
domain="cudf_python-spill",