[linux-6.6.y] x86/mce: Avoid triggering a schedule call in the NMI context #454

leoliu-oc
Contributor

With the original patch, a UCR-type DRAM error does not take the do_machine_check->mce_panic path. Instead, do_machine_check returns and execution continues into the irqentry_exit_to_user_mode function. This flow triggers a schedule call.

irqentry_nmi_enter calls __preempt_count_add(NMI_OFFSET + HARDIRQ_OFFSET). If a schedule occurs before irqentry_nmi_exit has run, the preempt count is still elevated, so the scheduler treats the context as atomic, just as if preemption had been disabled. __schedule then calls schedule_debug, which finds in_atomic_preempt_off() to be true and calls __schedule_bug, reporting the following error:
BUG: scheduling while atomic:……

Therefore, the position of irqentry_nmi_enter needs to be adjusted.
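
To make the counter arithmetic above concrete, here is a minimal userspace sketch of the preempt-count check. It is not kernel code and not the patch itself: every `model_*` name is hypothetical, and the bit positions (HARDIRQ at bit 16, NMI at bit 20) are an assumption based on the layout in include/linux/preempt.h. It only illustrates why a schedule between irqentry_nmi_enter and irqentry_nmi_exit trips the atomic check.

```c
#include <stdbool.h>
#include <stdio.h>

/* Assumed bit layout, modeled on include/linux/preempt.h (simplified). */
#define PREEMPT_SHIFT   0
#define HARDIRQ_SHIFT   16
#define NMI_SHIFT       20

#define PREEMPT_OFFSET  (1U << PREEMPT_SHIFT)
#define HARDIRQ_OFFSET  (1U << HARDIRQ_SHIFT)
#define NMI_OFFSET      (1U << NMI_SHIFT)

/* Count the scheduler expects: exactly one preempt_disable() of its own. */
#define PREEMPT_DISABLE_OFFSET PREEMPT_OFFSET

/* Hypothetical stand-in for the per-CPU preempt counter. */
static unsigned int model_preempt_count;

/* Models the __preempt_count_add() done by irqentry_nmi_enter. */
static void model_nmi_enter(void)
{
    model_preempt_count += NMI_OFFSET + HARDIRQ_OFFSET;
}

/* Models the matching subtraction done on the irqentry_nmi_exit side. */
static void model_nmi_exit(void)
{
    model_preempt_count -= NMI_OFFSET + HARDIRQ_OFFSET;
}

/* Models in_atomic_preempt_off(): anything beyond the scheduler's own
 * preempt_disable() means the caller was atomic. */
static bool model_in_atomic_preempt_off(void)
{
    return model_preempt_count != PREEMPT_DISABLE_OFFSET;
}

/* Models the check in schedule_debug(); returns true if the kernel
 * would have ended up in __schedule_bug(). */
static bool model_schedule_would_bug(void)
{
    bool bug;

    model_preempt_count += PREEMPT_OFFSET;   /* scheduler disables preemption */
    bug = model_in_atomic_preempt_off();
    if (bug)
        printf("model: scheduling while atomic, count=0x%08x\n",
               model_preempt_count);
    model_preempt_count -= PREEMPT_OFFSET;   /* scheduler re-enables it */
    return bug;
}
```

Calling `model_nmi_enter()` and then `model_schedule_would_bug()` returns true, while the same check after `model_nmi_exit()` returns false, which is the ordering problem the description points at.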

Signed-off-by: LeoLiu-oc <leoliu-oc@zhaoxin.com>
@opsiff
Member

opsiff commented Nov 4, 2024

/lgtm

@Avenger-285714 Avenger-285714 merged commit 874f013 into deepin-community:linux-6.6.y Nov 4, 2024
3 of 5 checks passed