-
Notifications
You must be signed in to change notification settings - Fork 6.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
coredump: ARM: Ensure sp in dump is set as gdb expects #73189
Conversation
@mrkhldn can you rebase to solve conflict? |
For reference, using the coredump_gdbserver.py script and connection from gdb to get the backtrace, before this change when using the output of the test at tests/subsys/debug/coredump_backends/debug.coredump.backends.logging: (Wrong stack)
And after this change:
|
arch/arm/core/cortex_m/fault.c
Outdated
#ifdef CONFIG_DEBUG_COREDUMP | ||
z_arm_coredump_fault_sp = POINTER_TO_UINT(esf); | ||
#endif | ||
#if defined(CONFIG_ARMV7_M_ARMV8_M_MAINLINE) || defined(CONFIG_ARMV6_M_ARMV8_M_BASELINE) | ||
/* Gdb expects a stack pointer that does not include the exception stack frame in order to | ||
* unwind. So adjust the stack pointer accordingly. | ||
*/ | ||
z_arm_coredump_fault_sp += sizeof(esf->basic); | ||
|
||
#if defined(CONFIG_FPU) && defined(CONFIG_FPU_SHARING) | ||
/* Assess whether thread had been using the FP registers and add size of additional | ||
* registers if necessary | ||
*/ | ||
if ((exc_return & EXC_RETURN_STACK_FRAME_TYPE_STANDARD) == | ||
EXC_RETURN_STACK_FRAME_TYPE_EXTENDED) { | ||
z_arm_coredump_fault_sp += sizeof(esf->fpu); | ||
} | ||
#endif /* CONFIG_FPU && CONFIG_FPU_SHARING */ | ||
#endif /* CONFIG_ARMV7_M_ARMV8_M_MAINLINE || CONFIG_ARMV6_M_ARMV8_M_BASELINE */ | ||
#endif /* CONFIG_DEBUG_COREDUMP */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: how about extracting all this into a static inline
in exception.h
so the whole ifdef mess doesn't have to be implemented in two places?
*/ | ||
void z_do_kernel_oops(const struct arch_esf *esf, _callee_saved_t *callee_regs) | ||
void z_do_kernel_oops(const struct arch_esf *esf, _callee_saved_t *callee_regs, uint32_t exc_return) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me
/* Exception Return (EXC_RETURN) is provided in LR upon exception entry. | ||
* It is used to perform an exception return and to detect possible state | ||
* transition upon exception. | ||
*/ | ||
|
||
/* Prefix. Indicates that this is an EXC_RETURN value. | ||
* This field reads as 0b11111111. | ||
*/ | ||
#define EXC_RETURN_INDICATOR_PREFIX (0xFF << 24) | ||
/* bit[0]: Exception Secure. The security domain the exception was taken to. */ | ||
#define EXC_RETURN_EXCEPTION_SECURE_Pos 0 | ||
#define EXC_RETURN_EXCEPTION_SECURE_Msk \ | ||
BIT(EXC_RETURN_EXCEPTION_SECURE_Pos) | ||
#define EXC_RETURN_EXCEPTION_SECURE_Non_Secure 0 | ||
#define EXC_RETURN_EXCEPTION_SECURE_Secure EXC_RETURN_EXCEPTION_SECURE_Msk | ||
/* bit[2]: Stack Pointer selection. */ | ||
#define EXC_RETURN_SPSEL_Pos 2 | ||
#define EXC_RETURN_SPSEL_Msk BIT(EXC_RETURN_SPSEL_Pos) | ||
#define EXC_RETURN_SPSEL_MAIN 0 | ||
#define EXC_RETURN_SPSEL_PROCESS EXC_RETURN_SPSEL_Msk | ||
/* bit[3]: Mode. Indicates the Mode that was stacked from. */ | ||
#define EXC_RETURN_MODE_Pos 3 | ||
#define EXC_RETURN_MODE_Msk BIT(EXC_RETURN_MODE_Pos) | ||
#define EXC_RETURN_MODE_HANDLER 0 | ||
#define EXC_RETURN_MODE_THREAD EXC_RETURN_MODE_Msk | ||
/* bit[4]: Stack frame type. Indicates whether the stack frame is a standard | ||
* integer only stack frame or an extended floating-point stack frame. | ||
*/ | ||
#define EXC_RETURN_STACK_FRAME_TYPE_Pos 4 | ||
#define EXC_RETURN_STACK_FRAME_TYPE_Msk BIT(EXC_RETURN_STACK_FRAME_TYPE_Pos) | ||
#define EXC_RETURN_STACK_FRAME_TYPE_EXTENDED 0 | ||
#define EXC_RETURN_STACK_FRAME_TYPE_STANDARD EXC_RETURN_STACK_FRAME_TYPE_Msk | ||
/* bit[5]: Default callee register stacking. Indicates whether the default | ||
* stacking rules apply, or whether the callee registers are already on the | ||
* stack. | ||
*/ | ||
#define EXC_RETURN_CALLEE_STACK_Pos 5 | ||
#define EXC_RETURN_CALLEE_STACK_Msk BIT(EXC_RETURN_CALLEE_STACK_Pos) | ||
#define EXC_RETURN_CALLEE_STACK_SKIPPED 0 | ||
#define EXC_RETURN_CALLEE_STACK_DEFAULT EXC_RETURN_CALLEE_STACK_Msk | ||
/* bit[6]: Secure or Non-secure stack. Indicates whether a Secure or | ||
* Non-secure stack is used to restore stack frame on exception return. | ||
*/ | ||
#define EXC_RETURN_RETURN_STACK_Pos 6 | ||
#define EXC_RETURN_RETURN_STACK_Msk BIT(EXC_RETURN_RETURN_STACK_Pos) | ||
#define EXC_RETURN_RETURN_STACK_Non_Secure 0 | ||
#define EXC_RETURN_RETURN_STACK_Secure EXC_RETURN_RETURN_STACK_Msk |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is only relevant for Cortex-M, maybe it should be put behind an #if defined(CONFIG_ARMV7_M_ARMV8_M_MAINLINE) || defined(CONFIG_ARMV6_M_ARMV8_M_BASELINE)
to avoid accidental misuse in Cortex-A/Cortex-R code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The path to this include file is arch/arm/include/cortex_m/exception.h, so hopefully that's enough to make it clear that it's intended for Cortex-M
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I missed that, that's fine then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you also need to check xPSR bit 9 to see if the stack is padded an additional 4 bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you also need to check xPSR bit 9 to see if the stack is padded an additional 4 bytes.
Added, thanks
a470b11
to
9e58123
Compare
Gdb is typically able to reconstruct the first two frames of the failing stack using the "pc" and "lr" registers. After that, (if the frame pointer is omitted) it appears to need the stack pointer (sp register) to point to the top of the stack before a fatal error occurred. The ARM Cortex-M processors push registers r0-r3, r12, LR, {possibly FPU registers}, PC, SPSR onto the stack before entering the exception handler. We adjust the stack pointer back to the point before these registers were pushed for preservation in the dump. During k_oops/k_panic, the sp wasn't stored in the core dump at all. Apply similar logic to store it when failures occur in that path. Signed-off-by: Mark Holden <mholden@meta.com>
9e58123
to
f4c019e
Compare
@vbrzeski would you mind checking if this resolved your issue and if so approve this PR please ? |
Thanks for adding the xPSR check! LGTM after clang-formatting the source code. |
@mrkhldn I checked your changes in my test code and it works for nRF52840.
After:
Great job! |
@mrkhldn this commit introduced a regression in main |
@mrkhldn unfortunately this PR was reverted due to the introduced regression :( . |
Fixed the regression and re-submitted here #79622 |
Gdb is typically able to reconstruct the first two frames of the failing stack using the "pc" and "lr" registers. After that, (if the frame pointer is omitted) it appears to need the stack pointer (sp register) to point to the top of the stack before a fatal error occurred.
The ARM Cortex-M7 processor pushes registers r0-r3, r12, LR, {possibly FPU registers}, PC, SPSR onto the stack before entering the exception handler. We adjust the stack pointer back to the point before these registers were pushed for preservation in the dump.
During k_oops/k_panic, the sp wasn't stored in the core dump at all. Apply similar logic to store it when failures occur in that path.