-
Notifications
You must be signed in to change notification settings - Fork 132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Up2 with nocodec topology: DSP panic with 6 SSPs #253
Comments
dmesg log excerpt
Full log here: |
@plbossart I think @libinyang is working on a similar issue. @libinyang can you help comment? |
@plbossart @keyonjie I'm working on #539 in sof repo. And I have PR for SOF: thesofproject/sof#541 to fix such issue. The issue is caused by initializing memory heap incorrectly. This will cause some memory related potential issues. |
@libinyang yes the problem goes away with your fix However @ranj063 I now have a DMIC IPC issue that appears with your DMIC alloc fix in the firmware, can you look at the topology for the nocodec case? Looks like we fix GLK but broke the rest... dmesg log with Libin's PR541+sof-apl-nocodec.m4 topology:
if I take out the DMIC from the topology things are just fine. How many engineers does it take to fix memory allocation.... |
@plbossart @ranj063 However, 549 must be applied after 541 because if the memory heap initialization issue is not fixed, the system is unstable for memory mapping changing, and many potential issues may occur. |
This is a duplicate of thesofproject/sof#539, and Libin has a PR to fix it |
@stevyan |
CI: Add remaning `grep`
WARNING: CPU: 1 PID: 9 at net/mac80211/sta_info.c:554 sta_info_insert_rcu+0x121/0x12a0 Modules linked in: CPU: 1 PID: 9 Comm: kworker/u8:1 Not tainted 5.14.0-rc7+ #253 Workqueue: phy3 ieee80211_iface_work RIP: 0010:sta_info_insert_rcu+0x121/0x12a0 ... Call Trace: ieee80211_ibss_finish_sta+0xbc/0x170 ieee80211_ibss_work+0x13f/0x7d0 ieee80211_iface_work+0x37a/0x500 process_one_work+0x357/0x850 worker_thread+0x41/0x4d0 If an Ad-Hoc node receives packets with invalid source MAC address, it hits a WARN_ON in sta_info_insert_check(), this can spam the log. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Link: https://lore.kernel.org/r/20210827144230.39944-1-yuehaibing@huawei.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This reverts commit ef5b570. Zhouyi reported that commit is causing crashes when running rcutorture with KASAN enabled: BUG: using smp_processor_id() in preemptible [00000000] code: rcu_torture_rea/100 caller is rcu_preempt_deferred_qs_irqrestore+0x74/0xed0 CPU: 4 PID: 100 Comm: rcu_torture_rea Tainted: G W 5.19.0-rc5-next-20220708-dirty #253 Call Trace: dump_stack_lvl+0xbc/0x108 (unreliable) check_preemption_disabled+0x154/0x160 rcu_preempt_deferred_qs_irqrestore+0x74/0xed0 __rcu_read_unlock+0x290/0x3b0 rcu_torture_read_unlock+0x30/0xb0 rcutorture_one_extend+0x198/0x810 rcu_torture_one_read+0x58c/0xc90 rcu_torture_reader+0x12c/0x360 kthread+0x1e8/0x220 ret_from_kernel_thread+0x5c/0x64 KASAN will generate instrumentation instructions around the WRITE_ONCE(local_paca->irq_soft_mask, mask): 0xc000000000295cb0 <+0>: addis r2,r12,774 0xc000000000295cb4 <+4>: addi r2,r2,16464 0xc000000000295cb8 <+8>: mflr r0 0xc000000000295cbc <+12>: bl 0xc00000000008bb4c <mcount> 0xc000000000295cc0 <+16>: mflr r0 0xc000000000295cc4 <+20>: std r31,-8(r1) 0xc000000000295cc8 <+24>: addi r3,r13,2354 0xc000000000295ccc <+28>: mr r31,r13 0xc000000000295cd0 <+32>: std r0,16(r1) 0xc000000000295cd4 <+36>: stdu r1,-48(r1) 0xc000000000295cd8 <+40>: bl 0xc000000000609b98 <__asan_store1+8> 0xc000000000295cdc <+44>: nop 0xc000000000295ce0 <+48>: li r9,1 0xc000000000295ce4 <+52>: stb r9,2354(r31) 0xc000000000295ce8 <+56>: addi r1,r1,48 0xc000000000295cec <+60>: ld r0,16(r1) 0xc000000000295cf0 <+64>: ld r31,-8(r1) 0xc000000000295cf4 <+68>: mtlr r0 If there is a context switch before "stb r9,2354(r31)", r31 may not equal to r13, in such case, irq soft mask will not work. The usual solution of marking the code ineligible for instrumentation forces the code out-of-line, which we would prefer to avoid. Christophe proposed a partial revert, but Nick raised some concerns with that. So for now do a full revert. Reported-by: Zhouyi Zhou <zhouzhouyi@gmail.com> [mpe: Construct change log based on Zhouyi's original report] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220831131052.42250-1-mpe@ellerman.id.au
Toggle deleted anonymous sets as inactive in the next generation, so users cannot perform any update on it. Clear the generation bitmask in case the transaction is aborted. The following KASAN splat shows a set element deletion for a bound anonymous set that has been already removed in the same transaction. [ 64.921510] ================================================================== [ 64.923123] BUG: KASAN: wild-memory-access in nf_tables_commit+0xa24/0x1490 [nf_tables] [ 64.924745] Write of size 8 at addr dead000000000122 by task test/890 [ 64.927903] CPU: 3 PID: 890 Comm: test Not tainted 6.3.0+ #253 [ 64.931120] Call Trace: [ 64.932699] <TASK> [ 64.934292] dump_stack_lvl+0x33/0x50 [ 64.935908] ? nf_tables_commit+0xa24/0x1490 [nf_tables] [ 64.937551] kasan_report+0xda/0x120 [ 64.939186] ? nf_tables_commit+0xa24/0x1490 [nf_tables] [ 64.940814] nf_tables_commit+0xa24/0x1490 [nf_tables] [ 64.942452] ? __kasan_slab_alloc+0x2d/0x60 [ 64.944070] ? nf_tables_setelem_notify+0x190/0x190 [nf_tables] [ 64.945710] ? kasan_set_track+0x21/0x30 [ 64.947323] nfnetlink_rcv_batch+0x709/0xd90 [nfnetlink] [ 64.948898] ? nfnetlink_rcv_msg+0x480/0x480 [nfnetlink] Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add various tests to check maximum number of supported programs being attached: # ./vmtest.sh -- ./test_progs -t tc_opts [...] ./test_progs -t tc_opts [ 1.185325] bpf_testmod: loading out-of-tree module taints kernel. [ 1.186826] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel [ 1.270123] tsc: Refined TSC clocksource calibration: 3407.988 MHz [ 1.272428] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fc932722, max_idle_ns: 440795381586 ns [ 1.276408] clocksource: Switched to clocksource tsc #252 tc_opts_after:OK #253 tc_opts_append:OK #254 tc_opts_basic:OK #255 tc_opts_before:OK #256 tc_opts_chain_classic:OK #257 tc_opts_chain_mixed:OK #258 tc_opts_delete_empty:OK #259 tc_opts_demixed:OK #260 tc_opts_detach:OK #261 tc_opts_detach_after:OK #262 tc_opts_detach_before:OK #263 tc_opts_dev_cleanup:OK #264 tc_opts_invalid:OK #265 tc_opts_max:OK <--- (new test) #266 tc_opts_mixed:OK #267 tc_opts_prepend:OK #268 tc_opts_replace:OK #269 tc_opts_revision:OK Summary: 18/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230929204121.20305-2-daniel@iogearbox.net
Add a new test case which performs double query of the bpf_mprog through libbpf API, but also via raw bpf(2) syscall. This is testing to gather first the count and then in a subsequent probe the full information with the program array without clearing passed structs in between. # ./vmtest.sh -- ./test_progs -t tc_opts [...] ./test_progs -t tc_opts [ 1.398818] tsc: Refined TSC clocksource calibration: 3407.999 MHz [ 1.400263] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fd336761, max_idle_ns: 440795243819 ns [ 1.402734] clocksource: Switched to clocksource tsc [ 1.426639] bpf_testmod: loading out-of-tree module taints kernel. [ 1.428112] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel #252 tc_opts_after:OK #253 tc_opts_append:OK #254 tc_opts_basic:OK #255 tc_opts_before:OK #256 tc_opts_chain_classic:OK #257 tc_opts_chain_mixed:OK #258 tc_opts_delete_empty:OK #259 tc_opts_demixed:OK #260 tc_opts_detach:OK #261 tc_opts_detach_after:OK #262 tc_opts_detach_before:OK #263 tc_opts_dev_cleanup:OK #264 tc_opts_invalid:OK #265 tc_opts_max:OK #266 tc_opts_mixed:OK #267 tc_opts_prepend:OK #268 tc_opts_query:OK <--- (new test) #269 tc_opts_replace:OK #270 tc_opts_revision:OK Summary: 19/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20231006220655.1653-4-daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Add a new test case to query on an empty bpf_mprog and pass the revision directly into expected_revision for attachment to assert that this does succeed. ./test_progs -t tc_opts [ 1.406778] tsc: Refined TSC clocksource calibration: 3407.990 MHz [ 1.408863] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fcaf6eb0, max_idle_ns: 440795321766 ns [ 1.412419] clocksource: Switched to clocksource tsc [ 1.428671] bpf_testmod: loading out-of-tree module taints kernel. [ 1.430260] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel #252 tc_opts_after:OK #253 tc_opts_append:OK #254 tc_opts_basic:OK #255 tc_opts_before:OK #256 tc_opts_chain_classic:OK #257 tc_opts_chain_mixed:OK #258 tc_opts_delete_empty:OK #259 tc_opts_demixed:OK #260 tc_opts_detach:OK #261 tc_opts_detach_after:OK #262 tc_opts_detach_before:OK #263 tc_opts_dev_cleanup:OK #264 tc_opts_invalid:OK #265 tc_opts_max:OK #266 tc_opts_mixed:OK #267 tc_opts_prepend:OK #268 tc_opts_query:OK #269 tc_opts_query_attach:OK <--- (new test) #270 tc_opts_replace:OK #271 tc_opts_revision:OK Summary: 20/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20231006220655.1653-6-daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
With the usual 6 SSP+1DMIC topology the DSP panics. When the SSP5 is commented out from the topology file things work fine.
I suspect an off-by-one error. Filing as a kernel bug for now but this is more of a firmware issue I suspect.
Differences in the topology file to make things work:
The text was updated successfully, but these errors were encountered: