Debugging
This page provides tips on debugging various portions of the bladeRF code base.
If you suspect you've encountered a bug that hasn't been reported yet, it is generally very helpful to include the results of some preliminary debugging in an issue tracker item, forum post, or question on IRC. When you report a bug, developers and community members may ask you to provide additional information using some of the techniques described here.
In most cases, one will want to ensure debug symbols are enabled while debugging. To do so, set the CMake CMAKE_BUILD_TYPE variable to Debug:
$ cmake -DCMAKE_BUILD_TYPE=Debug ../
GCC/GDB users will likely want to include as much debug information as possible. Use ENABLE_GDB_EXTENSIONS to compile with -ggdb3:
$ cmake -DCMAKE_BUILD_TYPE=Debug -DENABLE_GDB_EXTENSIONS=Yes ../
To increase the verbosity of libbladeRF's output, call bladerf_log_set_verbosity(BLADERF_LOG_LEVEL_VERBOSE) early in your program. See the bladerf_log_level enumeration for other possible log levels.
When dealing with a crashing application, one generally wants to identify:
- The state of the application at the time the failure occurred, including what the program was currently executing, and the state of relevant variables.
- The origin of any incorrect values, such as when a pointer was assigned a NULL value.
When possible, try to reproduce failures with the bladeRF-cli or a simple test application. This removes some degree of complexity from the situation and narrows the scope of where the defect may reside.
If this is not possible, or the failure appears to be associated with the gr-osmosdr bladeRF support, it is recommended that you review the GNU Radio documentation on debugging with gdb first.
To quote the GDB documentation, "A backtrace is a summary of how your program got to where it is." Backtraces in GDB show the current stack frame followed by the calling stack frames.
The example below shows a snippet from a simple debug session of the bladeRF-cli, in which an invalid loop bound defect was introduced to cause a segfault. In this example, we collect a backtrace and inspect the state of a function two stack frames up from the location where the segfault occurred.
We first start by running the program (bladeRF-cli, in this case) in GDB. Note that we tell GDB to interpret the arguments following the program name as arguments to that program, rather than to gdb, via --args.
$ gdb --args bladeRF-cli -p
... GDB copyright text ...
Reading symbols from /usr/local/bin/bladeRF-cli...done.
(gdb) run
Starting program: /usr/local/bin/bladeRF-cli -p
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff6d8c700 (LWP 4774)]
[New Thread 0x7ffff658b700 (LWP 4775)]
[New Thread 0x7ffff5d8a700 (LWP 4776)]
[INFO] Found FX3 bootloader device on bus=2 addr=4. This may be a bladeRF.
[INFO] Use bladeRF-cli command "recover 2 4 <FX3 firmware>" to boot the bladeRF firmware.

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7574b28 in libusb_get_device_descriptor () from /lib/x86_64-linux-gnu/libusb-1.0.so.0
Upon running the program, we see it terminate (crash) with a segmentation fault. Note that the address and name of the function in which the crash occurred are shown here. This is an important piece of information to include in your findings.
If the function that the program crashed in is called from many places in the code, one may not be able to glean much from this information alone.
The backtrace or bt command may be used to display the currently executing function and its callers. Here we can see the path of function calls taken, as well as the arguments to each function call, with line numbers.
(gdb) bt
#0  0x00007ffff7574b28 in libusb_get_device_descriptor () from /lib/x86_64-linux-gnu/libusb-1.0.so.0
#1  0x00007ffff7bcfdd9 in lusb_device_is_bladerf (dev=0x0) at /home/jon/projects/bladeRF/host/libraries/libbladeRF/src/backend/libusb.c:265
#2  0x00007ffff7bd349c in lusb_probe (info_list=0x7fffffffe210) at /home/jon/projects/bladeRF/host/libraries/libbladeRF/src/backend/libusb.c:2003
#3  0x00007ffff7bc8026 in backend_probe (devinfo_items=0x7fffffffe268, num_items=0x7fffffffe260) at /home/jon/projects/bladeRF/host/libraries/libbladeRF/src/backend.c:97
#4  0x00007ffff7bc80bf in bladerf_get_device_list (devices=0x7fffffffe2a8) at /home/jon/projects/bladeRF/host/libraries/libbladeRF/src/bladerf.c:49
#5  0x0000000000409170 in cmd_probe (s=0x618010, argc=1, argv=0x6186c0) at /home/jon/projects/bladeRF/host/utilities/bladeRF-cli/src/cmd/probe.c:41
#6  0x00000000004066a9 in cmd_handle (s=0x618010, line=0x40f83e "probe") at /home/jon/projects/bladeRF/host/utilities/bladeRF-cli/src/cmd/cmd.c:640
#7  0x0000000000405404 in main (argc=2, argv=0x7fffffffe458) at /home/jon/projects/bladeRF/host/utilities/bladeRF-cli/src/main.c:360
(gdb)
From the above backtrace, we can see how we got to libusb_get_device_descriptor(). Note that in frame #1, lusb_device_is_bladerf() was called with a NULL argument (dev=0x0), which is suspicious. To figure out why a NULL pointer was passed to this function, we'll need to inspect the state of the program in stack frame #2, inside lusb_probe().
The frame or f command may be used to select and print a stack frame:
(gdb) frame 2
#2  0x00007ffff7bd349c in lusb_probe (info_list=0x7fffffffe210) at /home/jon/projects/bladeRF/host/libraries/libbladeRF/src/backend/libusb.c:2003
2003    if( lusb_device_is_bladerf(list[i]) ) {
Now we can take a look at the state of the local variables in this function, and print out some additional information:
(gdb) info local
status = 0
i = 13
n = 0
count = 13
list = 0x618ef0
info = {backend = BLADERF_BACKEND_LIBUSB, serial = "0000000004BE", '\000' <repeats 16 times>, "\320\341\377\377\377", usb_bus = 2 '\002', usb_addr = 4 '\004', instance = 4156353024}
context = 0x618840
(gdb) print list[13]
$2 = (libusb_device *) 0x0
(gdb) print list[12]
$3 = (libusb_device *) 0x61a1c0
(gdb)
Here we see that the local variable count is 13. It is supposed to be used as the upper bound of a loop. However, the loop index i is also 13, which causes us to access an invalid list element and attempt to dereference a NULL pointer. (The valid elements are list[0] through list[12].)
This section builds upon the segfault example from the previous section.
Sometimes an invalid access may not cause a program to immediately (and conveniently) crash and burn. Instead, the problem may manifest itself as an intermittent issue where data is occasionally corrupted.
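The class of defect found above can be sketched in isolation. The following is a minimal illustration only, not the actual lusb_probe() code: struct fake_device and count_devices() are hypothetical names introduced for this example, and the list contents are assumed. It shows the correct bound for a list holding count entries.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for libusb_device; only its address matters here. */
struct fake_device {
    int id;
};

/*
 * Count the non-NULL entries among the first `count` pointers in `list`.
 *
 * Valid indices are 0 through count - 1, so the loop condition must be
 * `i < count`. Writing `i <= count` instead would also read list[count],
 * which is the kind of invalid loop bound seen in the debug session.
 */
size_t count_devices(struct fake_device *const *list, size_t count)
{
    size_t i;
    size_t n = 0;

    assert(list != NULL);

    for (i = 0; i < count; i++) {
        if (list[i] != NULL) {
            n++;
        }
    }

    return n;
}
```

With count = 13, this loop stops after list[12] and never touches list[13].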
Valgrind tends to come in handy in this type of situation. Dr. Memory is a similar tool, which also supports 32-bit Windows applications.
Running the bladeRF-cli through Valgrind yields results similar to those we obtained with gdb in the previous section. However, it appears that list[13] didn't contain a value of 0x0 (it is likely that gdb had zeroed out memory for us), but rather a value of 0x68.
When trouble is aplenty,
It is possible to break in gdb at the first sign of trouble by using Valgrind's --vgdb option, which enables an embedded gdbserver.
To do... Finish this section
To do...
To do...