Document Tips for Debugging C Extensions #35100

Merged
merged 16 commits into from
Dec 10, 2020
51 changes: 51 additions & 0 deletions doc/source/development/debugging_extensions.rst
@@ -0,0 +1,51 @@
.. _debugging_c_extensions:

{{ header }}

**********************
Debugging C extensions
**********************

pandas uses select C extensions for high-performance IO operations. If you need to debug segfaults or other issues in those extensions, the following steps may be helpful. They are geared towards lldb as the debugger, though the steps for gdb are similar.

First, be sure to compile the extensions with the appropriate flags to generate debug symbols and remove optimizations. This can be achieved as follows:

.. code-block:: sh

    python setup.py build_ext --inplace -j4 --with-debugging-symbols

Next, create a script that exercises the extension module you want to debug and place it in the project root. Then launch a Python process under lldb:

.. code-block:: sh

    lldb python

If desired, set breakpoints at specific file locations using the syntax below:

.. code-block:: sh

    breakpoint set --file pandas/_libs/src/ujson/python/objToJSON.c --line 1547

At this point you may get *WARNING: Unable to resolve breakpoint to any actual locations.* If you have not yet executed anything, the module may not have been loaded into memory, which is why the location cannot be resolved. You can ignore the warning for now; the breakpoint will bind once the code actually executes.

Finally go ahead and execute your script:

.. code-block:: sh

    run <the_script>.py

Code execution will halt at the breakpoint you defined or at the occurrence of any segfault. LLDB's `GDB to LLDB command map <https://lldb.llvm.org/use/map.html>`_ lists equivalent commands for the two debuggers, so you can look up what to run in either one.

Alternatively, to execute the entire test suite under the debugger, run the following:

.. code-block:: sh

    lldb -- python -m pytest

Or for gdb:

.. code-block:: sh

    gdb --args python -m pytest
Member

$ gdb --args python3 -m pytest
[...]
"~/.pyenv/shims/python3": not in executable format: File format not recognized
(gdb) run
Starting program:  -m pytest
No executable file specified.
Use the "file" or "exec-file" command.

Member Author

Can you try `gdb -ex r --args python3 -m pytest`? Taking that from this link:

https://wiki.python.org/moin/DebuggingWithGdb

Member Author

Yeah, it looks like `gdb -ex r --args python3 -m pytest pandas/tests` was working for me, if you want to try it on yours and confirm.

Member

$ gdb -ex r --args python3 -m pytest pandas/tests
[...]
"~/.pyenv/shims/python3": not in executable format: File format not recognized
Starting program:  -m pytest pandas/tests
No executable file specified.
Use the "file" or "exec-file" command.
(gdb) 

Member Author

Are you using pyenv for development or Conda?

Member

You've already gone above and beyond helping me debug this; I'll spend some more time on it and ping you if I find anything new.

Member Author

All good - I think this is generally helpful to hash out together so thanks for the input.

It looks like this might be specific to pyenv and how it manages the python executable:

https://stackoverflow.com/questions/48141135/cannot-start-dbg-on-my-python-c-extension

Member

`gdb -ex r --args bash pytest pandas/tests --skip-slow --skip-db` tentatively looks like a winner

Member

is the command from my previous comment specific to my case, or relevant to the document?

Member Author

I'm trying to avoid adding too much detail here since this issue is more of a pyenv thing than a debugger issue
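
The `not in executable format` error in this thread is consistent with gdb being handed a script rather than a binary, since pyenv shims are small shell scripts that dispatch to the real interpreter. A minimal sketch, assuming that diagnosis is correct, of how to check whether a path is a script and how to build a gdb invocation around the interpreter that Python reports for itself. The helper names `is_shebang_script` and `gdb_command` are hypothetical and not part of pandas or this PR.

```python
import sys


def is_shebang_script(path):
    """Return True if the file starts with '#!', i.e. it is a script."""
    with open(path, "rb") as fh:
        return fh.read(2) == b"#!"


def gdb_command(pytest_args, interpreter=None):
    """Build a gdb command line that runs pytest under ``interpreter``.

    Defaults to ``sys.executable`` (the interpreter actually running),
    rather than whatever ``python3`` resolves to on PATH.
    """
    interpreter = interpreter or sys.executable
    return ["gdb", "-ex", "r", "--args", interpreter, "-m", "pytest", *pytest_args]
```

Under pyenv, `sys.executable` should point at the real binary rather than the shim, so passing that path to gdb may avoid the error without involving `bash`.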


Once the process launches, simply type ``run`` and the test suite will begin, stopping at any segmentation fault that may occur.
1 change: 1 addition & 0 deletions doc/source/development/index.rst
@@ -16,6 +16,7 @@ Development
code_style
maintaining
internals
debugging_extensions
extending
developer
policies
20 changes: 12 additions & 8 deletions setup.py
@@ -414,18 +414,16 @@ def run(self):

# ----------------------------------------------------------------------
# Preparation of compiler arguments

debugging_symbols_requested = "--with-debugging-symbols" in sys.argv
if debugging_symbols_requested:
    sys.argv.remove("--with-debugging-symbols")
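
The flag handling here follows a common pattern: a custom option is detected in `sys.argv` and removed before the build machinery parses the remaining arguments, which would otherwise reject an option it does not recognize. A minimal sketch of that pattern, using a hypothetical helper `pop_flag` that is not part of setup.py (the real code checks and mutates `sys.argv` in place):

```python
def pop_flag(argv, flag):
    """Return (found, remaining) without mutating ``argv``.

    ``pop_flag`` is a hypothetical helper for illustration only.
    """
    found = flag in argv
    # Drop every occurrence of the flag so later parsers never see it.
    remaining = [arg for arg in argv if arg != flag]
    return found, remaining


argv = ["setup.py", "build_ext", "--inplace", "--with-debugging-symbols"]
requested, argv = pop_flag(argv, "--with-debugging-symbols")
```

Returning a new list instead of mutating the input keeps the sketch easy to test; the in-place `sys.argv.remove(...)` in the diff has the same effect on what the build machinery later sees.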


if sys.byteorder == "big":
    endian_macro = [("__BIG_ENDIAN__", "1")]
else:
    endian_macro = [("__LITTLE_ENDIAN__", "1")]


debugging_symbols_requested = "--with-debugging-symbols" in sys.argv
if debugging_symbols_requested:
    sys.argv.remove("--with-debugging-symbols")

if is_platform_windows():
    extra_compile_args = []
    extra_link_args = []
@@ -435,8 +433,14 @@ def run(self):
else:
    extra_compile_args = ["-Werror"]
    extra_link_args = []
    if debugging_symbols_requested:
        extra_compile_args.append("-g")
    if not debugging_symbols_requested:
Member Author

I guess Python by default (at least locally and looking at some of the CI builds) includes the -g flag as part of the CFLAGS that distutils uses to compile extensions, so in the current setup this does nothing.

According to SO we can override that by appending here, which might help reduce file size by removing those symbols:

https://stackoverflow.com/a/37952343/621736

I can also remove this from this PR if deemed too orthogonal. IIRC @xhochy or @TomAugspurger may have experience with stripping debug symbols from built distributions

Contributor

Multibuild may do this by default now? I don't recall.

Contributor

Yes, multibuild includes this nowadays.

        # Strip debugging symbols (included by default)
        extra_compile_args.append("-g0")
    else:
        # TODO: these should override the defaults provided by Python
Member Author

It seems distutils adds NDEBUG and -O3 by default, without a feasible way to remove those compilation flags. According to the SO link shared above, appending these at the end should override them.

Contributor

I don't actually recommend building with -O0, as it hides a lot of problems. With -O0, memory is more often zeroed, so an invalid memory access is not that fatal (at the higher -Ox levels you would get random data). This makes bugs a lot harder to detect. It might be better to keep the optimization level here and add flags that make debugging easier. Personally I like -ggdb -fno-omit-frame-pointer. This gives better stack traces and a bit more debugging information.

Member Author

gdb suggests turning off optimizations:

https://sourceware.org/gdb/onlinedocs/gdb/Optimized-Code.html

There are certainly exceptions but I think as a general rule (especially for people that aren't super well versed in debugging the extensions yet) that no optimizations will be easier to follow

Contributor

Yes, the debug information with -O0 is definitely better. I would still like to mention -fno-omit-frame-pointer somewhere in the document, as it is the option that provides the most debug information for the smallest performance hit, and it is really useful if you have a memory-out-of-bounds issue that wouldn't occur easily in the unoptimized version.

Member Author

Sounds good. I think this is off by default with -O0 per the docs but doesn't hurt to add again

https://gcc.gnu.org/onlinedocs/gcc-3.4.4/gcc/Optimize-Options.html

        # by being appended to end, but would ideally replace altogether
        extra_compile_args.append("-UNDEBUG")
        extra_compile_args.append("-O0")
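
The non-Windows branch of this change can be sketched as a standalone function. It relies on the assumption discussed in the review that the compiler honors the last of several conflicting flags, so options appended here take precedence over the defaults that distutils places earlier on the command line. The function name `unix_compile_args` and the `frame_pointer` parameter are hypothetical; the latter only illustrates the `-fno-omit-frame-pointer` suggestion from the review and is not part of this diff.

```python
def unix_compile_args(debug, frame_pointer=False):
    """Sketch of the extra compiler flags for non-Windows builds."""
    args = ["-Werror"]
    if not debug:
        # Strip the debugging symbols that the default CFLAGS include.
        args.append("-g0")
    else:
        # Appended last so they take precedence over the default
        # NDEBUG definition and -O3 optimization level.
        args.append("-UNDEBUG")
        args.append("-O0")
        if frame_pointer:
            # Hypothetical extra flag, per the review suggestion.
            args.append("-fno-omit-frame-pointer")
    return args
```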

# Build for at least macOS 10.9 when compiling on a 10.9 system or above,
# overriding CPython distuitls behaviour which is to target the version that