-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
core: add core_mutex_debug to aid debugging deadlocks #18620
Conversation
6c3d6e6
to
bf678a2
Compare
Nice idea!
|
Either |
@@ -194,6 +211,9 @@ void mutex_unlock(mutex_t *mutex) | |||
sched_change_priority(owner, mutex->owner_original_priority); | |||
} | |||
#endif | |||
#if IS_USED(MODULE_CORE_MUTEX_DEBUG) | |||
mutex->owner_calling_pc = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah no - if another thread was waiting on the mutex, it will be locked again here, but we don't have the PC of that one.
Not sure if
mutex->owner_calling_pc = 0; | |
mutex->owner_calling_pc = cpu_get_caller_pc(); |
would be any better of if we should just document that. (Or can we get the PC of the blocked thread?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be fixed now. There is still a brief time window between clearing this and when the new owner updates this in which theoretically a higher priority thread could wake up and try to take the mutex and find a value of 0
here, but that should barely ever happen. I think for a debug print this is good enough and the issue is now documented.
528b477
to
8504570
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's certainly a handy addition.
8504570
to
e2d828e
Compare
Whoopsie, I forgot to push the fix and doc update I claimed I added in the comment above. |
Adding `USEMODULE += core_mutex_debug` to your `Makefile` results in on log messages such as [mutex] waiting for thread 1 (pc = 0x800024d) being added whenever `mutex_lock()` blocks. This makes tracing down deadlocks easier.
e2d828e
to
3e86e39
Compare
bors merge |
18620: core: add core_mutex_debug to aid debugging deadlocks r=benpicco a=maribu ### Contribution description Adding `USEMODULE += core_mutex_debug` to your `Makefile` results in on log messages such as [mutex] waiting for thread 1 (pc = 0x800024d) being added whenever `mutex_lock()` blocks. This makes tracing down deadlocks easier. ### Testing procedure Run e.g. ```sh USEMODULE=core_mutex_debug BOARD=nucleo-f767zi make -C tests/mutex_cancel flash test ``` which should provide output such as ``` Welcome to pyterm! Type '/exit' to exit. READY s [mutex] waiting for thread 1 (pc = 0x8000f35) START main(): This is RIOT! (Version: 2022.10-devel-841-g5cc02-core/mutex/debug) Test Application for mutex_cancel / mutex_lock_cancelable ========================================================= Test without cancellation: OK Test early cancellation: OK Verify no side effects on subsequent calls: [mutex] waiting for thread 1 (pc = 0x800024d) OK Test late cancellation: [mutex] waiting for thread 1 (pc = 0x0) OK TEST PASSED ``` ```sh $ arm-none-eabi-addr2line -a 0x800024d -e tests/mutex_cancel/bin/nucleo-f767zi/tests_mutex_cancel.elf 0x0800024d /home/maribu/Repos/software/RIOT/tests/mutex_cancel/main.c:51 ``` ### Issues/PRs references Depends on and includes #18619 19296: nanocoap: allow to define CoAP resources as XFA r=benpicco a=benpicco 19504: cpu/cc26xx_cc13xx: Fix bogus array-bound warning r=benpicco a=maribu ### Contribution description GCC 12 create a bogus array out of bounds warning as it assumes that because there is special handling for `uart == 0` and `uart == 1`, `uart` can indeed be `1`. There is an `assert(uart < UART_NUMOF)` above that would blow up prior to any out of bounds access. In any case, optimizing out the special handling of `uart == 1` for when `UART_NUMOF == 1` likely improves the generated code and fixes the warning. /home/maribu/Repos/software/RIOT/cc2650/cpu/cc26xx_cc13xx/periph/uart.c:88:8: error: array subscript 1 is above array bounds of 'uart_isr_ctx_t[1]' [-Werror=array-bounds] 88 | ctx[uart].rx_cb = rx_cb; | ~~~^~~~~~ /home/maribu/Repos/software/RIOT/cc2650/cpu/cc26xx_cc13xx/periph/uart.c:52:23: note: while referencing 'ctx' 52 | static uart_isr_ctx_t ctx[UART_NUMOF]; | ^~~ /home/maribu/Repos/software/RIOT/cc2650/cpu/cc26xx_cc13xx/periph/uart.c:89:8: error: array subscript 1 is above array bounds of 'uart_isr_ctx_t[1]' [-Werror=array-bounds] 89 | ctx[uart].arg = arg; | ~~~^~~~~~ /home/maribu/Repos/software/RIOT/cc2650/cpu/cc26xx_cc13xx/periph/uart.c:52:23: note: while referencing 'ctx' 52 | static uart_isr_ctx_t ctx[UART_NUMOF]; | ^~~ ### Testing procedure The actual change is a pretty obvious one-liner, so that code review and a green CI should be sufficient. If not, running any UART example app without regression should do. ### Issues/PRs references None 19506: tools/openocd: Fix handling of OPENOCD_CMD_RESET_HALT r=maribu a=maribu ### Contribution description The OPENOCD_CMD_RESET_HALT was not longer correctly passed to the script. This fixes the issue. ### Testing procedure Flashing of e.g. the `cc2650-launchpad` with upstream OpenOCD should work again. ### Issues/PRs references The change was added to #19050 after testing the PR and before merging. I'm not sure if the fix never worked because of this, or if behavior of `target-export-variables` or GNU Make changed. 19507: cpu/cc26x0_cc13x0: Drop feature cortexm_mpu r=benpicco a=maribu ### Contribution description At least the CC2650 doesn't have an MPU, I assume this is also true for the rest of the family. The CC2652 does have an MPU according to the datasheet. So I keep the feature there in place. ### Testing procedure E.g. ``` make BOARD=cc2650-launchpad -C tests/mpu_noexec_ram flash test ``` fails. (Note: A successful test run would also crash but with a mem manage handler rather than a hardfault due to an invalid instruction on the stack being executed.) It would be nice to also test the same for a `cc2652-launchpad`, for which the MPU should work. ### Issues/PRs references None Co-authored-by: Marian Buschsieweke <marian.buschsieweke@ovgu.de> Co-authored-by: Benjamin Valentin <benjamin.valentin@bht-berlin.de> Co-authored-by: Benjamin Valentin <benjamin.valentin@ml-pa.com>
Build failed (retrying...): |
18620: core: add core_mutex_debug to aid debugging deadlocks r=benpicco a=maribu ### Contribution description Adding `USEMODULE += core_mutex_debug` to your `Makefile` results in on log messages such as [mutex] waiting for thread 1 (pc = 0x800024d) being added whenever `mutex_lock()` blocks. This makes tracing down deadlocks easier. ### Testing procedure Run e.g. ```sh USEMODULE=core_mutex_debug BOARD=nucleo-f767zi make -C tests/mutex_cancel flash test ``` which should provide output such as ``` Welcome to pyterm! Type '/exit' to exit. READY s [mutex] waiting for thread 1 (pc = 0x8000f35) START main(): This is RIOT! (Version: 2022.10-devel-841-g5cc02-core/mutex/debug) Test Application for mutex_cancel / mutex_lock_cancelable ========================================================= Test without cancellation: OK Test early cancellation: OK Verify no side effects on subsequent calls: [mutex] waiting for thread 1 (pc = 0x800024d) OK Test late cancellation: [mutex] waiting for thread 1 (pc = 0x0) OK TEST PASSED ``` ```sh $ arm-none-eabi-addr2line -a 0x800024d -e tests/mutex_cancel/bin/nucleo-f767zi/tests_mutex_cancel.elf 0x0800024d /home/maribu/Repos/software/RIOT/tests/mutex_cancel/main.c:51 ``` ### Issues/PRs references Depends on and includes #18619 19296: nanocoap: allow to define CoAP resources as XFA r=benpicco a=benpicco Co-authored-by: Marian Buschsieweke <marian.buschsieweke@ovgu.de> Co-authored-by: Benjamin Valentin <benjamin.valentin@bht-berlin.de> Co-authored-by: Benjamin Valentin <benjamin.valentin@ml-pa.com>
bors r- |
Canceled. |
bors merge |
18620: core: add core_mutex_debug to aid debugging deadlocks r=maribu a=maribu ### Contribution description Adding `USEMODULE += core_mutex_debug` to your `Makefile` results in on log messages such as [mutex] waiting for thread 1 (pc = 0x800024d) being added whenever `mutex_lock()` blocks. This makes tracing down deadlocks easier. ### Testing procedure Run e.g. ```sh USEMODULE=core_mutex_debug BOARD=nucleo-f767zi make -C tests/mutex_cancel flash test ``` which should provide output such as ``` Welcome to pyterm! Type '/exit' to exit. READY s [mutex] waiting for thread 1 (pc = 0x8000f35) START main(): This is RIOT! (Version: 2022.10-devel-841-g5cc02-core/mutex/debug) Test Application for mutex_cancel / mutex_lock_cancelable ========================================================= Test without cancellation: OK Test early cancellation: OK Verify no side effects on subsequent calls: [mutex] waiting for thread 1 (pc = 0x800024d) OK Test late cancellation: [mutex] waiting for thread 1 (pc = 0x0) OK TEST PASSED ``` ```sh $ arm-none-eabi-addr2line -a 0x800024d -e tests/mutex_cancel/bin/nucleo-f767zi/tests_mutex_cancel.elf 0x0800024d /home/maribu/Repos/software/RIOT/tests/mutex_cancel/main.c:51 ``` ### Issues/PRs references Depends on and includes #18619 Co-authored-by: Marian Buschsieweke <marian.buschsieweke@ovgu.de>
bors r- |
Canceled. |
Build succeeded: |
Contribution description
Adding
USEMODULE += core_mutex_debug
to yourMakefile
results inon log messages such as
being added whenever
mutex_lock()
blocks. This makes tracing downdeadlocks easier.
Testing procedure
Run e.g.
USEMODULE=core_mutex_debug BOARD=nucleo-f767zi make -C tests/mutex_cancel flash test
which should provide output such as
Issues/PRs references
Depends on and includes #18619