Skip to content

Commit

Permalink
AOSCOS: Revert "rcu: Fix rcu_barrier() VS post CPUHP_TEARDOWN_CPU inv…
Browse files Browse the repository at this point in the history
…ocation"

When this change was introduced between v6.10.4 and v6.10.5, the Broadcom
Tigon3 Ethernet interface (tg3) found on Apple MacBook Pro (15'',
Mid 2010) would throw many rcu stall errors during boot up, causing
peripherals such as the wireless card to misbehave.

[   24.153855] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 2-.... } 21 jiffies s: 973 root: 0x4/.
[   24.166938] rcu: blocking rcu_node structures (internal RCU debug):
[   24.177800] Sending NMI from CPU 3 to CPUs 2:
[   24.183113] NMI backtrace for cpu 2
[   24.183119] CPU: 2 PID: 1049 Comm: NetworkManager Not tainted 6.10.5-aosc-main #1
[   24.183123] Hardware name: Apple Inc. MacBookPro6,2/Mac-F22586C8, BIOS    MBP61.88Z.005D.B00.1804100943 04/10/18
[   24.183125] RIP: 0010:__this_module+0x2d3d1/0x4f310 [tg3]
[   24.183135] Code: c3 cc cc cc cc 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 89 f6 48 03 77 30 8b 06 <31> f6 31 ff c3 cc cc cc cc 66 0f 1f 44 00 00 90 90 90 90 90 90 90
[   24.183138] RSP: 0018:ffffbf1a011d75e8 EFLAGS: 00000082
[   24.183141] RAX: 0000000000000000 RBX: ffffa04ec78f8a00 RCX: 0000000000000000
[   24.183143] RDX: 0000000000000000 RSI: ffffbf1a00fb007c RDI: ffffa04ec78f8a00
[   24.183145] RBP: 0000000000000b50 R08: 0000000000000000 R09: 0000000000000000
[   24.183147] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000216
[   24.183148] R13: ffffbf1a011d7624 R14: ffffa04ec78f8a08 R15: ffffa04ec78f8b40
[   24.183151] FS:  00007f4c524b2140(0000) GS:ffffa05007d00000(0000) knlGS:0000000000000000
[   24.183153] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   24.183155] CR2: 00007f7025eae3e8 CR3: 00000001040f8000 CR4: 00000000000006f0
[   24.183157] Call Trace:
[   24.183162]  <NMI>
[   24.183167]  ? nmi_cpu_backtrace+0xbf/0x140
[   24.183175]  ? nmi_cpu_backtrace_handler+0x11/0x20
[   24.183181]  ? nmi_handle+0x61/0x160
[   24.183186]  ? default_do_nmi+0x42/0x110
[   24.183191]  ? exc_nmi+0x1bd/0x290
[   24.183194]  ? end_repeat_nmi+0xf/0x53
[   24.183203]  ? __this_module+0x2d3d1/0x4f310 [tg3]
[   24.183207]  ? __this_module+0x2d3d1/0x4f310 [tg3]
[   24.183210]  ? __this_module+0x2d3d1/0x4f310 [tg3]
[   24.183213]  </NMI>
[   24.183214]  <TASK>
[   24.183215]  __this_module+0x31828/0x4f310 [tg3]
[   24.183218]  ? __this_module+0x2d390/0x4f310 [tg3]
[   24.183221]  __this_module+0x398e6/0x4f310 [tg3]
[   24.183225]  __this_module+0x3baf8/0x4f310 [tg3]
[   24.183229]  __this_module+0x4733f/0x4f310 [tg3]
[   24.183233]  ? _raw_spin_unlock_irqrestore+0x25/0x70
[   24.183237]  ? __this_module+0x398e6/0x4f310 [tg3]
[   24.183241]  __this_module+0x4b943/0x4f310 [tg3]
[   24.183244]  ? delay_tsc+0x89/0xf0
[   24.183249]  ? preempt_count_sub+0x51/0x60
[   24.183254]  __this_module+0x4be4b/0x4f310 [tg3]
[   24.183258]  __dev_open+0x103/0x1c0
[   24.183265]  __dev_change_flags+0x1bd/0x230
[   24.183269]  ? rtnl_getlink+0x362/0x400
[   24.183276]  dev_change_flags+0x26/0x70
[   24.183280]  do_setlink+0xe16/0x11f0
[   24.183286]  ? __nla_validate_parse+0x61/0xd40
[   24.183295]  __rtnl_newlink+0x63d/0x9f0
[   24.183301]  ? kmem_cache_alloc_node_noprof+0x12b/0x360
[   24.183308]  ? kmalloc_trace_noprof+0x11e/0x350
[   24.183312]  ? rtnl_newlink+0x2e/0x70
[   24.183316]  rtnl_newlink+0x47/0x70
[   24.183320]  rtnetlink_rcv_msg+0x152/0x400
[   24.183324]  ? __netlink_sendskb+0x68/0x90
[   24.183329]  ? netlink_unicast+0x237/0x290
[   24.183333]  ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[   24.183336]  netlink_rcv_skb+0x5b/0x110
[   24.183343]  netlink_unicast+0x1a4/0x290
[   24.183347]  netlink_sendmsg+0x222/0x4a0
[   24.183350]  ? proc_get_long.constprop.0+0x116/0x210
[   24.183358]  ____sys_sendmsg+0x379/0x3b0
[   24.183363]  ? copy_msghdr_from_user+0x6d/0xb0
[   24.183368]  ___sys_sendmsg+0x86/0xe0
[   24.183372]  ? addrconf_sysctl_forward+0xf3/0x270
[   24.183378]  ? _copy_from_iter+0x8b/0x570
[   24.183384]  ? __pfx_addrconf_sysctl_forward+0x10/0x10
[   24.183388]  ? _raw_spin_unlock+0x19/0x50
[   24.183392]  ? proc_sys_call_handler+0xf3/0x2f0
[   24.183397]  ? trace_hardirqs_on+0x29/0x90
[   24.183401]  ? __fdget+0xc2/0xf0
[   24.183405]  __sys_sendmsg+0x5b/0xc0
[   24.183410]  ? syscall_trace_enter+0x110/0x1b0
[   24.183416]  do_syscall_64+0x64/0x150
[   24.183423]  entry_SYSCALL_64_after_hwframe+0x76/0x7e

I have bisected the error to this commit. Reverting it caused no new or
perceivable issues on both the MacBook and a Zen4-based laptop. Revert
this commit as a workaround.

This reverts commit aa162aa.

Upstream report: https://bugzilla.kernel.org/show_bug.cgi?id=219390
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>

Bug: https://lore.kernel.org/all/b8da4aec-4cca-4eb0-ba87-5f8641aa2ca9@leemhuis.info/
Signed-off-by: Kexy Biscuit <kexybiscuit@aosc.io>
  • Loading branch information
MingcongBai authored and KexyBiscuit committed Feb 1, 2025
1 parent 522651a commit 1cc9963
Showing 1 changed file with 3 additions and 7 deletions.
10 changes: 3 additions & 7 deletions kernel/rcu/tree.c
Original file line number Diff line number Diff line change
Expand Up @@ -5227,15 +5227,11 @@ void rcutree_migrate_callbacks(int cpu)
struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
bool needwake;

if (rcu_rdp_is_offloaded(rdp))
return;

raw_spin_lock_irqsave(&rcu_state.barrier_lock, flags);
if (rcu_segcblist_empty(&rdp->cblist)) {
raw_spin_unlock_irqrestore(&rcu_state.barrier_lock, flags);
if (rcu_rdp_is_offloaded(rdp) ||
rcu_segcblist_empty(&rdp->cblist))
return; /* No callbacks to migrate. */
}

raw_spin_lock_irqsave(&rcu_state.barrier_lock, flags);
WARN_ON_ONCE(rcu_rdp_cpu_online(rdp));
rcu_barrier_entrain(rdp);
my_rdp = this_cpu_ptr(&rcu_data);
Expand Down

0 comments on commit 1cc9963

Please sign in to comment.