-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add obs workflows #18
Merged
Zeno-sole
merged 1 commit into
deepin-community:kernel-mainline
from
Zeno-sole:kernel-mainline
Aug 29, 2023
Merged
feat: add obs workflows #18
Zeno-sole
merged 1 commit into
deepin-community:kernel-mainline
from
Zeno-sole:kernel-mainline
Aug 29, 2023
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Avenger-285714
approved these changes
Aug 29, 2023
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: Avenger-285714 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
matrix-wsk
pushed a commit
that referenced
this pull request
Sep 25, 2023
commit 0b0747d upstream. The following processes run into a deadlock. CPU 41 was waiting for CPU 29 to handle a CSD request while holding spinlock "crashdump_lock", but CPU 29 was hung by that spinlock with IRQs disabled. PID: 17360 TASK: ffff95c1090c5c40 CPU: 41 COMMAND: "mrdiagd" !# 0 [ffffb80edbf37b58] __read_once_size at ffffffff9b871a40 include/linux/compiler.h:185:0 !# 1 [ffffb80edbf37b58] atomic_read at ffffffff9b871a40 arch/x86/include/asm/atomic.h:27:0 !# 2 [ffffb80edbf37b58] dump_stack at ffffffff9b871a40 lib/dump_stack.c:54:0 # 3 [ffffb80edbf37b78] csd_lock_wait_toolong at ffffffff9b131ad5 kernel/smp.c:364:0 # 4 [ffffb80edbf37b78] __csd_lock_wait at ffffffff9b131ad5 kernel/smp.c:384:0 # 5 [ffffb80edbf37bf8] csd_lock_wait at ffffffff9b13267a kernel/smp.c:394:0 # 6 [ffffb80edbf37bf8] smp_call_function_many at ffffffff9b13267a kernel/smp.c:843:0 # 7 [ffffb80edbf37c50] smp_call_function at ffffffff9b13279d kernel/smp.c:867:0 # 8 [ffffb80edbf37c50] on_each_cpu at ffffffff9b13279d kernel/smp.c:976:0 # 9 [ffffb80edbf37c78] flush_tlb_kernel_range at ffffffff9b085c4b arch/x86/mm/tlb.c:742:0 #10 [ffffb80edbf37cb8] __purge_vmap_area_lazy at ffffffff9b23a1e0 mm/vmalloc.c:701:0 #11 [ffffb80edbf37ce0] try_purge_vmap_area_lazy at ffffffff9b23a2cc mm/vmalloc.c:722:0 #12 [ffffb80edbf37ce0] free_vmap_area_noflush at ffffffff9b23a2cc mm/vmalloc.c:754:0 #13 [ffffb80edbf37cf8] free_unmap_vmap_area at ffffffff9b23bb3b mm/vmalloc.c:764:0 #14 [ffffb80edbf37cf8] remove_vm_area at ffffffff9b23bb3b mm/vmalloc.c:1509:0 #15 [ffffb80edbf37d18] __vunmap at ffffffff9b23bb8a mm/vmalloc.c:1537:0 #16 [ffffb80edbf37d40] vfree at ffffffff9b23bc85 mm/vmalloc.c:1612:0 #17 [ffffb80edbf37d58] megasas_free_host_crash_buffer [megaraid_sas] at ffffffffc020b7f2 drivers/scsi/megaraid/megaraid_sas_fusion.c:3932:0 #18 [ffffb80edbf37d80] fw_crash_state_store [megaraid_sas] at ffffffffc01f804d drivers/scsi/megaraid/megaraid_sas_base.c:3291:0 #19 [ffffb80edbf37dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 #20 [ffffb80edbf37dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 #21 [ffffb80edbf37de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 #22 [ffffb80edbf37e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 #23 [ffffb80edbf37ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 #24 [ffffb80edbf37ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 #25 [ffffb80edbf37ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 #26 [ffffb80edbf37f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 #27 [ffffb80edbf37f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 PID: 17355 TASK: ffff95c1090c3d80 CPU: 29 COMMAND: "mrdiagd" !# 0 [ffffb80f2d3c7d30] __read_once_size at ffffffff9b0f2ab0 include/linux/compiler.h:185:0 !# 1 [ffffb80f2d3c7d30] native_queued_spin_lock_slowpath at ffffffff9b0f2ab0 kernel/locking/qspinlock.c:368:0 # 2 [ffffb80f2d3c7d58] pv_queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/paravirt.h:674:0 # 3 [ffffb80f2d3c7d58] queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/qspinlock.h:53:0 # 4 [ffffb80f2d3c7d68] queued_spin_lock at ffffffff9b8961a6 include/asm-generic/qspinlock.h:90:0 # 5 [ffffb80f2d3c7d68] do_raw_spin_lock_flags at ffffffff9b8961a6 include/linux/spinlock.h:173:0 # 6 [ffffb80f2d3c7d68] __raw_spin_lock_irqsave at ffffffff9b8961a6 include/linux/spinlock_api_smp.h:122:0 # 7 [ffffb80f2d3c7d68] _raw_spin_lock_irqsave at ffffffff9b8961a6 kernel/locking/spinlock.c:160:0 # 8 [ffffb80f2d3c7d88] fw_crash_buffer_store [megaraid_sas] at ffffffffc01f8129 drivers/scsi/megaraid/megaraid_sas_base.c:3205:0 # 9 [ffffb80f2d3c7dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 #10 [ffffb80f2d3c7dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 #11 [ffffb80f2d3c7de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 #12 [ffffb80f2d3c7e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 #13 [ffffb80f2d3c7ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 #14 [ffffb80f2d3c7ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 #15 [ffffb80f2d3c7ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 #16 [ffffb80f2d3c7f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 #17 [ffffb80f2d3c7f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 The lock is used to synchronize different sysfs operations, it doesn't protect any resource that will be touched by an interrupt. Consequently it's not required to disable IRQs. Replace the spinlock with a mutex to fix the deadlock. Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Link: https://lore.kernel.org/r/20230828221018.19471-1-junxiao.bi@oracle.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
matrix-wsk
pushed a commit
that referenced
this pull request
Oct 16, 2023
[ Upstream commit a154f5f ] The following call trace shows a deadlock issue due to recursive locking of mutex "device_mutex". First lock acquire is in target_for_each_device() and second in target_free_device(). PID: 148266 TASK: ffff8be21ffb5d00 CPU: 10 COMMAND: "iscsi_ttx" #0 [ffffa2bfc9ec3b18] __schedule at ffffffffa8060e7f #1 [ffffa2bfc9ec3ba0] schedule at ffffffffa8061224 #2 [ffffa2bfc9ec3bb8] schedule_preempt_disabled at ffffffffa80615ee #3 [ffffa2bfc9ec3bc8] __mutex_lock at ffffffffa8062fd7 #4 [ffffa2bfc9ec3c40] __mutex_lock_slowpath at ffffffffa80631d3 #5 [ffffa2bfc9ec3c50] mutex_lock at ffffffffa806320c #6 [ffffa2bfc9ec3c68] target_free_device at ffffffffc0935998 [target_core_mod] #7 [ffffa2bfc9ec3c90] target_core_dev_release at ffffffffc092f975 [target_core_mod] #8 [ffffa2bfc9ec3ca0] config_item_put at ffffffffa79d250f #9 [ffffa2bfc9ec3cd0] config_item_put at ffffffffa79d2583 #10 [ffffa2bfc9ec3ce0] target_devices_idr_iter at ffffffffc0933f3a [target_core_mod] #11 [ffffa2bfc9ec3d00] idr_for_each at ffffffffa803f6fc #12 [ffffa2bfc9ec3d60] target_for_each_device at ffffffffc0935670 [target_core_mod] #13 [ffffa2bfc9ec3d98] transport_deregister_session at ffffffffc0946408 [target_core_mod] #14 [ffffa2bfc9ec3dc8] iscsit_close_session at ffffffffc09a44a6 [iscsi_target_mod] #15 [ffffa2bfc9ec3df0] iscsit_close_connection at ffffffffc09a4a88 [iscsi_target_mod] #16 [ffffa2bfc9ec3df8] finish_task_switch at ffffffffa76e5d07 #17 [ffffa2bfc9ec3e78] iscsit_take_action_for_connection_exit at ffffffffc0991c23 [iscsi_target_mod] #18 [ffffa2bfc9ec3ea0] iscsi_target_tx_thread at ffffffffc09a403b [iscsi_target_mod] #19 [ffffa2bfc9ec3f08] kthread at ffffffffa76d8080 #20 [ffffa2bfc9ec3f50] ret_from_fork at ffffffffa8200364 Fixes: 36d4cb4 ("scsi: target: Avoid that EXTENDED COPY commands trigger lock inversion") Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Link: https://lore.kernel.org/r/20230918225848.66463-1-junxiao.bi@oracle.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
matrix-wsk
pushed a commit
that referenced
this pull request
Dec 14, 2023
[ Upstream commit e3e82fc ] When creating ceq_0 during probing irdma, cqp.sc_cqp will be sent as a cqp_request to cqp->sc_cqp.sq_ring. If the request is pending when removing the irdma driver or unplugging its aux device, cqp.sc_cqp will be dereferenced as wrong struct in irdma_free_pending_cqp_request(). PID: 3669 TASK: ffff88aef892c000 CPU: 28 COMMAND: "kworker/28:0" #0 [fffffe0000549e38] crash_nmi_callback at ffffffff810e3a34 #1 [fffffe0000549e40] nmi_handle at ffffffff810788b2 #2 [fffffe0000549ea0] default_do_nmi at ffffffff8107938f #3 [fffffe0000549eb8] do_nmi at ffffffff81079582 #4 [fffffe0000549ef0] end_repeat_nmi at ffffffff82e016b4 [exception RIP: native_queued_spin_lock_slowpath+1291] RIP: ffffffff8127e72b RSP: ffff88aa841ef778 RFLAGS: 00000046 RAX: 0000000000000000 RBX: ffff88b01f849700 RCX: ffffffff8127e47e RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff83857ec0 RBP: ffff88afe3e4efc8 R8: ffffed15fc7c9dfa R9: ffffed15fc7c9dfa R10: 0000000000000001 R11: ffffed15fc7c9df9 R12: 0000000000740000 R13: ffff88b01f849708 R14: 0000000000000003 R15: ffffed1603f092e1 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000 -- <NMI exception stack> -- #5 [ffff88aa841ef778] native_queued_spin_lock_slowpath at ffffffff8127e72b #6 [ffff88aa841ef7b0] _raw_spin_lock_irqsave at ffffffff82c22aa4 #7 [ffff88aa841ef7c8] __wake_up_common_lock at ffffffff81257363 #8 [ffff88aa841ef888] irdma_free_pending_cqp_request at ffffffffa0ba12cc [irdma] #9 [ffff88aa841ef958] irdma_cleanup_pending_cqp_op at ffffffffa0ba1469 [irdma] #10 [ffff88aa841ef9c0] irdma_ctrl_deinit_hw at ffffffffa0b2989f [irdma] #11 [ffff88aa841efa28] irdma_remove at ffffffffa0b252df [irdma] #12 [ffff88aa841efae8] auxiliary_bus_remove at ffffffff8219afdb #13 [ffff88aa841efb00] device_release_driver_internal at ffffffff821882e6 #14 [ffff88aa841efb38] bus_remove_device at ffffffff82184278 #15 [ffff88aa841efb88] device_del at ffffffff82179d23 #16 [ffff88aa841efc48] ice_unplug_aux_dev at ffffffffa0eb1c14 [ice] #17 [ffff88aa841efc68] ice_service_task at ffffffffa0d88201 [ice] #18 [ffff88aa841efde8] process_one_work at ffffffff811c589a #19 [ffff88aa841efe60] worker_thread at ffffffff811c71ff #20 [ffff88aa841eff10] kthread at ffffffff811d87a0 #21 [ffff88aa841eff50] ret_from_fork at ffffffff82e0022f Fixes: 44d9e52 ("RDMA/irdma: Implement device initialization definitions") Link: https://lore.kernel.org/r/20231130081415.891006-1-lishifeng@sangfor.com.cn Suggested-by: "Ismail, Mustafa" <mustafa.ismail@intel.com> Signed-off-by: Shifeng Li <lishifeng@sangfor.com.cn> Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Avenger-285714
pushed a commit
to Avenger-285714/DeepinKernel
that referenced
this pull request
May 5, 2024
[ Upstream commit f8bbc07 ] vhost_worker will call tun call backs to receive packets. If too many illegal packets arrives, tun_do_read will keep dumping packet contents. When console is enabled, it will costs much more cpu time to dump packet and soft lockup will be detected. net_ratelimit mechanism can be used to limit the dumping rate. PID: 33036 TASK: ffff949da6f20000 CPU: 23 COMMAND: "vhost-32980" #0 [fffffe00003fce50] crash_nmi_callback at ffffffff89249253 #1 [fffffe00003fce58] nmi_handle at ffffffff89225fa3 #2 [fffffe00003fceb0] default_do_nmi at ffffffff8922642e deepin-community#3 [fffffe00003fced0] do_nmi at ffffffff8922660d deepin-community#4 [fffffe00003fcef0] end_repeat_nmi at ffffffff89c01663 [exception RIP: io_serial_in+20] RIP: ffffffff89792594 RSP: ffffa655314979e8 RFLAGS: 00000002 RAX: ffffffff89792500 RBX: ffffffff8af428a0 RCX: 0000000000000000 RDX: 00000000000003fd RSI: 0000000000000005 RDI: ffffffff8af428a0 RBP: 0000000000002710 R8: 0000000000000004 R9: 000000000000000f R10: 0000000000000000 R11: ffffffff8acbf64f R12: 0000000000000020 R13: ffffffff8acbf698 R14: 0000000000000058 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 deepin-community#5 [ffffa655314979e8] io_serial_in at ffffffff89792594 deepin-community#6 [ffffa655314979e8] wait_for_xmitr at ffffffff89793470 deepin-community#7 [ffffa65531497a08] serial8250_console_putchar at ffffffff897934f6 deepin-community#8 [ffffa65531497a20] uart_console_write at ffffffff8978b605 deepin-community#9 [ffffa65531497a48] serial8250_console_write at ffffffff89796558 deepin-community#10 [ffffa65531497ac8] console_unlock at ffffffff89316124 deepin-community#11 [ffffa65531497b10] vprintk_emit at ffffffff89317c07 deepin-community#12 [ffffa65531497b68] printk at ffffffff89318306 deepin-community#13 [ffffa65531497bc8] print_hex_dump at ffffffff89650765 deepin-community#14 [ffffa65531497ca8] tun_do_read at ffffffffc0b06c27 [tun] deepin-community#15 [ffffa65531497d38] tun_recvmsg at ffffffffc0b06e34 [tun] deepin-community#16 [ffffa65531497d68] handle_rx at ffffffffc0c5d682 [vhost_net] deepin-community#17 [ffffa65531497ed0] vhost_worker at ffffffffc0c644dc [vhost] deepin-community#18 [ffffa65531497f10] kthread at ffffffff892d2e72 deepin-community#19 [ffffa65531497f50] ret_from_fork at ffffffff89c0022f Fixes: ef3db4a ("tun: avoid BUG, dump packet on GSO errors") Signed-off-by: Lei Chen <lei.chen@smartx.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20240415020247.2207781-1-lei.chen@smartx.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Avenger-285714
pushed a commit
that referenced
this pull request
May 5, 2024
[ Upstream commit f8bbc07 ] vhost_worker will call tun call backs to receive packets. If too many illegal packets arrives, tun_do_read will keep dumping packet contents. When console is enabled, it will costs much more cpu time to dump packet and soft lockup will be detected. net_ratelimit mechanism can be used to limit the dumping rate. PID: 33036 TASK: ffff949da6f20000 CPU: 23 COMMAND: "vhost-32980" #0 [fffffe00003fce50] crash_nmi_callback at ffffffff89249253 #1 [fffffe00003fce58] nmi_handle at ffffffff89225fa3 #2 [fffffe00003fceb0] default_do_nmi at ffffffff8922642e #3 [fffffe00003fced0] do_nmi at ffffffff8922660d #4 [fffffe00003fcef0] end_repeat_nmi at ffffffff89c01663 [exception RIP: io_serial_in+20] RIP: ffffffff89792594 RSP: ffffa655314979e8 RFLAGS: 00000002 RAX: ffffffff89792500 RBX: ffffffff8af428a0 RCX: 0000000000000000 RDX: 00000000000003fd RSI: 0000000000000005 RDI: ffffffff8af428a0 RBP: 0000000000002710 R8: 0000000000000004 R9: 000000000000000f R10: 0000000000000000 R11: ffffffff8acbf64f R12: 0000000000000020 R13: ffffffff8acbf698 R14: 0000000000000058 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #5 [ffffa655314979e8] io_serial_in at ffffffff89792594 #6 [ffffa655314979e8] wait_for_xmitr at ffffffff89793470 #7 [ffffa65531497a08] serial8250_console_putchar at ffffffff897934f6 #8 [ffffa65531497a20] uart_console_write at ffffffff8978b605 #9 [ffffa65531497a48] serial8250_console_write at ffffffff89796558 #10 [ffffa65531497ac8] console_unlock at ffffffff89316124 #11 [ffffa65531497b10] vprintk_emit at ffffffff89317c07 #12 [ffffa65531497b68] printk at ffffffff89318306 #13 [ffffa65531497bc8] print_hex_dump at ffffffff89650765 #14 [ffffa65531497ca8] tun_do_read at ffffffffc0b06c27 [tun] #15 [ffffa65531497d38] tun_recvmsg at ffffffffc0b06e34 [tun] #16 [ffffa65531497d68] handle_rx at ffffffffc0c5d682 [vhost_net] #17 [ffffa65531497ed0] vhost_worker at ffffffffc0c644dc [vhost] #18 [ffffa65531497f10] kthread at ffffffff892d2e72 #19 [ffffa65531497f50] ret_from_fork at ffffffff89c0022f Fixes: ef3db4a ("tun: avoid BUG, dump packet on GSO errors") Signed-off-by: Lei Chen <lei.chen@smartx.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20240415020247.2207781-1-lei.chen@smartx.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
MingcongBai
pushed a commit
to MingcongBai/deepin-kernel
that referenced
this pull request
Jun 17, 2024
[ Upstream commit 769e6a1 ] ui_browser__show() is capturing the input title that is stack allocated memory in hist_browser__run(). Avoid a use after return by strdup-ing the string. Committer notes: Further explanation from Ian Rogers: My command line using tui is: $ sudo bash -c 'rm /tmp/asan.log*; export ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a sleep 1; /tmp/perf/perf mem report' I then go to the perf annotate view and quit. This triggers the asan error (from the log file): ``` ==1254591==ERROR: AddressSanitizer: stack-use-after-return on address 0x7f2813331920 at pc 0x7f28180 65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10 READ of size 80 at 0x7f2813331920 thread T0 #0 0x7f2818065990 in __interceptor_strlen ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461 deepin-community#1 0x7f2817698251 in SLsmg_write_wrapped_string (/lib/x86_64-linux-gnu/libslang.so.2+0x98251) deepin-community#2 0x7f28176984b9 in SLsmg_write_nstring (/lib/x86_64-linux-gnu/libslang.so.2+0x984b9) deepin-community#3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60 deepin-community#4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266 deepin-community#5 0x55c94045c776 in ui_browser__show ui/browser.c:288 deepin-community#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206 deepin-community#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458 deepin-community#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412 deepin-community#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527 deepin-community#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613 deepin-community#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661 deepin-community#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671 deepin-community#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141 deepin-community#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805 deepin-community#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374 deepin-community#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516 deepin-community#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350 deepin-community#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403 deepin-community#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447 deepin-community#20 0x55c9400e53ad in main tools/perf/perf.c:561 deepin-community#21 0x7f28170456c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 deepin-community#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360 deepin-community#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId: 84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93) Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746 This frame has 1 object(s): [32, 192) 'title' (line 747) <== Memory access at offset 32 is inside this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork ``` hist_browser__run isn't on the stack so the asan error looks legit. There's no clean init/exit on struct ui_browser so I may be trading a use-after-return for a memory leak, but that seems look a good trade anyway. Fixes: 05e8b08 ("perf ui browser: Stop using 'self'") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240507183545.1236093-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Avenger-285714
pushed a commit
to Avenger-285714/DeepinKernel
that referenced
this pull request
Jun 21, 2024
[ Upstream commit 769e6a1 ] ui_browser__show() is capturing the input title that is stack allocated memory in hist_browser__run(). Avoid a use after return by strdup-ing the string. Committer notes: Further explanation from Ian Rogers: My command line using tui is: $ sudo bash -c 'rm /tmp/asan.log*; export ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a sleep 1; /tmp/perf/perf mem report' I then go to the perf annotate view and quit. This triggers the asan error (from the log file): ``` ==1254591==ERROR: AddressSanitizer: stack-use-after-return on address 0x7f2813331920 at pc 0x7f28180 65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10 READ of size 80 at 0x7f2813331920 thread T0 #0 0x7f2818065990 in __interceptor_strlen ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461 #1 0x7f2817698251 in SLsmg_write_wrapped_string (/lib/x86_64-linux-gnu/libslang.so.2+0x98251) #2 0x7f28176984b9 in SLsmg_write_nstring (/lib/x86_64-linux-gnu/libslang.so.2+0x984b9) deepin-community#3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60 deepin-community#4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266 deepin-community#5 0x55c94045c776 in ui_browser__show ui/browser.c:288 deepin-community#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206 deepin-community#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458 deepin-community#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412 deepin-community#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527 deepin-community#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613 deepin-community#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661 deepin-community#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671 deepin-community#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141 deepin-community#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805 deepin-community#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374 deepin-community#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516 deepin-community#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350 deepin-community#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403 deepin-community#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447 deepin-community#20 0x55c9400e53ad in main tools/perf/perf.c:561 deepin-community#21 0x7f28170456c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 deepin-community#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360 deepin-community#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId: 84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93) Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746 This frame has 1 object(s): [32, 192) 'title' (line 747) <== Memory access at offset 32 is inside this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork ``` hist_browser__run isn't on the stack so the asan error looks legit. There's no clean init/exit on struct ui_browser so I may be trading a use-after-return for a memory leak, but that seems look a good trade anyway. Fixes: 05e8b08 ("perf ui browser: Stop using 'self'") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240507183545.1236093-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Avenger-285714
pushed a commit
to Avenger-285714/DeepinKernel
that referenced
this pull request
Jul 5, 2024
commit be346c1a6eeb49d8fda827d2a9522124c2f72f36 upstream. The code in ocfs2_dio_end_io_write() estimates number of necessary transaction credits using ocfs2_calc_extend_credits(). This however does not take into account that the IO could be arbitrarily large and can contain arbitrary number of extents. Extent tree manipulations do often extend the current transaction but not in all of the cases. For example if we have only single block extents in the tree, ocfs2_mark_extent_written() will end up calling ocfs2_replace_extent_rec() all the time and we will never extend the current transaction and eventually exhaust all the transaction credits if the IO contains many single block extents. Once that happens a WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to this error. This was actually triggered by one of our customers on a heavily fragmented OCFS2 filesystem. To fix the issue make sure the transaction always has enough credits for one extent insert before each call of ocfs2_mark_extent_written(). Heming Zhao said: ------ PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error" PID: xxx TASK: xxxx CPU: 5 COMMAND: "SubmitThread-CA" #0 machine_kexec at ffffffff8c069932 #1 __crash_kexec at ffffffff8c1338fa #2 panic at ffffffff8c1d69b9 deepin-community#3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2] deepin-community#4 __ocfs2_abort at ffffffffc0c88387 [ocfs2] deepin-community#5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2] deepin-community#6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2] deepin-community#7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2] deepin-community#8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2] deepin-community#9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2] deepin-community#10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2] deepin-community#11 dio_complete at ffffffff8c2b9fa7 deepin-community#12 do_blockdev_direct_IO at ffffffff8c2bc09f deepin-community#13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2] deepin-community#14 generic_file_direct_write at ffffffff8c1dcf14 deepin-community#15 __generic_file_write_iter at ffffffff8c1dd07b deepin-community#16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2] deepin-community#17 aio_write at ffffffff8c2cc72e deepin-community#18 kmem_cache_alloc at ffffffff8c248dde deepin-community#19 do_io_submit at ffffffff8c2ccada deepin-community#20 do_syscall_64 at ffffffff8c004984 deepin-community#21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba Link: https://lkml.kernel.org/r/20240617095543.6971-1-jack@suse.cz Link: https://lkml.kernel.org/r/20240614145243.8837-1-jack@suse.cz Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io") Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Heming Zhao <heming.zhao@suse.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
opsiff
pushed a commit
to opsiff/UOS-kernel
that referenced
this pull request
Jul 29, 2024
[ Upstream commit f0c18025693707ec344a70b6887f7450bf4c826b ] When running BPF selftests (./test_progs -t sockmap_basic) on a Loongarch platform, the following kernel panic occurs: [...] Oops[#1]: CPU: 22 PID: 2824 Comm: test_progs Tainted: G OE 6.10.0-rc2+ deepin-community#18 Hardware name: LOONGSON Dabieshan/Loongson-TC542F0, BIOS Loongson-UDK2018 ... ... ra: 90000000048bf6c0 sk_msg_recvmsg+0x120/0x560 ERA: 9000000004162774 copy_page_to_iter+0x74/0x1c0 CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) PRMD: 0000000c (PPLV0 +PIE +PWE) EUEN: 00000007 (+FPE +SXE +ASXE -BTE) ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7) ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0) BADV: 0000000000000040 PRID: 0014c011 (Loongson-64bit, Loongson-3C5000) Modules linked in: bpf_testmod(OE) xt_CHECKSUM xt_MASQUERADE xt_conntrack Process test_progs (pid: 2824, threadinfo=0000000000863a31, task=...) Stack : ... Call Trace: [<9000000004162774>] copy_page_to_iter+0x74/0x1c0 [<90000000048bf6c0>] sk_msg_recvmsg+0x120/0x560 [<90000000049f2b90>] tcp_bpf_recvmsg_parser+0x170/0x4e0 [<90000000049aae34>] inet_recvmsg+0x54/0x100 [<900000000481ad5c>] sock_recvmsg+0x7c/0xe0 [<900000000481e1a8>] __sys_recvfrom+0x108/0x1c0 [<900000000481e27c>] sys_recvfrom+0x1c/0x40 [<9000000004c076ec>] do_syscall+0x8c/0xc0 [<9000000003731da4>] handle_syscall+0xc4/0x160 Code: ... ---[ end trace 0000000000000000 ]--- Kernel panic - not syncing: Fatal exception Kernel relocated by 0x3510000 .text @ 0x9000000003710000 .data @ 0x9000000004d70000 .bss @ 0x9000000006469400 ---[ end Kernel panic - not syncing: Fatal exception ]--- [...] This crash happens every time when running sockmap_skb_verdict_shutdown subtest in sockmap_basic. This crash is because a NULL pointer is passed to page_address() in the sk_msg_recvmsg(). Due to the different implementations depending on the architecture, page_address(NULL) will trigger a panic on Loongarch platform but not on x86 platform. So this bug was hidden on x86 platform for a while, but now it is exposed on Loongarch platform. The root cause is that a zero length skb (skb->len == 0) was put on the queue. This zero length skb is a TCP FIN packet, which was sent by shutdown(), invoked in test_sockmap_skb_verdict_shutdown(): shutdown(p1, SHUT_WR); In this case, in sk_psock_skb_ingress_enqueue(), num_sge is zero, and no page is put to this sge (see sg_set_page in sg_set_page), but this empty sge is queued into ingress_msg list. And in sk_msg_recvmsg(), this empty sge is used, and a NULL page is got by sg_page(sge). Pass this NULL page to copy_page_to_iter(), which passes it to kmap_local_page() and to page_address(), then kernel panics. To solve this, we should skip this zero length skb. So in sk_msg_recvmsg(), if copy is zero, that means it's a zero length skb, skip invoking copy_page_to_iter(). We are using the EFAULT return triggered by copy_page_to_iter to check for is_fin in tcp_bpf.c. Fixes: 604326b ("bpf, sockmap: convert to generic sk_msg interface") Suggested-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/e3a16eacdc6740658ee02a33489b1b9d4912f378.1719992715.git.tanggeliang@kylinos.cn Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit b180739b45a38b4caa88fe16bb5273072e6613dc)
opsiff
pushed a commit
that referenced
this pull request
Aug 7, 2024
It also casue FT-D3000 board m70f request_irq register a handler 'phytium_spi_irq', which calls spi_master_get_devdata. so it should be called after spi_master_set_devdata, where the handler can really get the fts structure. Log: [ 1.535532] irq 18: nobody cared (try booting with the "irqpoll" option) [ 1.542258] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-arm64-desktop #5300 [ 1.542260] Hardware name: N/A N/A/L21K-2041K, BIOS KL4.27.BD.D.062.D 03/10/2022 [ 1.542261] Call trace: [ 1.542267] dump_backtrace+0x0/0x190 [ 1.542269] show_stack+0x14/0x20 [ 1.542273] dump_stack+0xa8/0xcc [ 1.542277] __report_bad_irq+0x48/0x100 [ 1.542279] note_interrupt+0x280/0x2e4 [ 1.542281] handle_irq_event_percpu+0x54/0x68 [ 1.542283] handle_irq_event+0x40/0x98 [ 1.542285] handle_fasteoi_irq+0xd4/0x1a0 [ 1.542287] generic_handle_irq+0x2c/0x40 [ 1.542289] __handle_domain_irq+0x60/0xb8 [ 1.542291] gic_handle_irq+0x7c/0x178 [ 1.542292] el1_irq+0xb0/0x140 [ 1.542294] __do_softirq+0x84/0x2e8 [ 1.542297] irq_exit+0x9c/0xb8 [ 1.542298] __handle_domain_irq+0x64/0xb8 [ 1.542300] gic_handle_irq+0x7c/0x178 [ 1.542301] el1_irq+0xb0/0x140 [ 1.542303] __setup_irq+0x478/0x6d8 [ 1.542305] request_threaded_irq+0xdc/0x198 [ 1.542308] phytium_spi_add_host+0x70/0x198 [ 1.542310] phytium_spi_probe+0x2ec/0x328 [ 1.542313] platform_drv_probe+0x50/0xa0 [ 1.542315] really_probe+0x23c/0x3c8 [ 1.542317] driver_probe_device+0xdc/0x130 [ 1.542318] __driver_attach+0x128/0x150 [ 1.542320] bus_for_each_dev+0x60/0x98 [ 1.542322] driver_attach+0x20/0x28 [ 1.542323] bus_add_driver+0x1a0/0x280 [ 1.542325] driver_register+0x60/0x110 [ 1.542327] __platform_driver_register+0x44/0x50 [ 1.542330] phytium_spi_driver_init+0x18/0x20 [ 1.542332] do_one_initcall+0x30/0x19c [ 1.542335] kernel_init_freeable+0x27c/0x320 [ 1.542338] kernel_init+0x10/0x100 [ 1.542340] ret_from_fork+0x10/0x18 [ 1.542341] handlers: [ 1.544611] [<0000000057fd3891>] phytium_spi_irq [ 1.549235] Disabling IRQ #18 same problem link: https://gitee.com/openeuler/kernel/commit/7e9c59f90e0d0b78aa3ad27fc2baf37b9333b41f Signed-off-by: wenlunpeng <wenlunpeng@uniontech.com> Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
Avenger-285714
pushed a commit
that referenced
this pull request
Aug 7, 2024
It also casue FT-D3000 board m70f request_irq register a handler 'phytium_spi_irq', which calls spi_master_get_devdata. so it should be called after spi_master_set_devdata, where the handler can really get the fts structure. Log: [ 1.535532] irq 18: nobody cared (try booting with the "irqpoll" option) [ 1.542258] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-arm64-desktop #5300 [ 1.542260] Hardware name: N/A N/A/L21K-2041K, BIOS KL4.27.BD.D.062.D 03/10/2022 [ 1.542261] Call trace: [ 1.542267] dump_backtrace+0x0/0x190 [ 1.542269] show_stack+0x14/0x20 [ 1.542273] dump_stack+0xa8/0xcc [ 1.542277] __report_bad_irq+0x48/0x100 [ 1.542279] note_interrupt+0x280/0x2e4 [ 1.542281] handle_irq_event_percpu+0x54/0x68 [ 1.542283] handle_irq_event+0x40/0x98 [ 1.542285] handle_fasteoi_irq+0xd4/0x1a0 [ 1.542287] generic_handle_irq+0x2c/0x40 [ 1.542289] __handle_domain_irq+0x60/0xb8 [ 1.542291] gic_handle_irq+0x7c/0x178 [ 1.542292] el1_irq+0xb0/0x140 [ 1.542294] __do_softirq+0x84/0x2e8 [ 1.542297] irq_exit+0x9c/0xb8 [ 1.542298] __handle_domain_irq+0x64/0xb8 [ 1.542300] gic_handle_irq+0x7c/0x178 [ 1.542301] el1_irq+0xb0/0x140 [ 1.542303] __setup_irq+0x478/0x6d8 [ 1.542305] request_threaded_irq+0xdc/0x198 [ 1.542308] phytium_spi_add_host+0x70/0x198 [ 1.542310] phytium_spi_probe+0x2ec/0x328 [ 1.542313] platform_drv_probe+0x50/0xa0 [ 1.542315] really_probe+0x23c/0x3c8 [ 1.542317] driver_probe_device+0xdc/0x130 [ 1.542318] __driver_attach+0x128/0x150 [ 1.542320] bus_for_each_dev+0x60/0x98 [ 1.542322] driver_attach+0x20/0x28 [ 1.542323] bus_add_driver+0x1a0/0x280 [ 1.542325] driver_register+0x60/0x110 [ 1.542327] __platform_driver_register+0x44/0x50 [ 1.542330] phytium_spi_driver_init+0x18/0x20 [ 1.542332] do_one_initcall+0x30/0x19c [ 1.542335] kernel_init_freeable+0x27c/0x320 [ 1.542338] kernel_init+0x10/0x100 [ 1.542340] ret_from_fork+0x10/0x18 [ 1.542341] handlers: [ 1.544611] [<0000000057fd3891>] phytium_spi_irq [ 1.549235] Disabling IRQ #18 same problem link: https://gitee.com/openeuler/kernel/commit/7e9c59f90e0d0b78aa3ad27fc2baf37b9333b41f Signed-off-by: wenlunpeng <wenlunpeng@uniontech.com> Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
Avenger-285714
pushed a commit
that referenced
this pull request
Aug 12, 2024
[ Upstream commit f0c18025693707ec344a70b6887f7450bf4c826b ] When running BPF selftests (./test_progs -t sockmap_basic) on a Loongarch platform, the following kernel panic occurs: [...] Oops[#1]: CPU: 22 PID: 2824 Comm: test_progs Tainted: G OE 6.10.0-rc2+ #18 Hardware name: LOONGSON Dabieshan/Loongson-TC542F0, BIOS Loongson-UDK2018 ... ... ra: 90000000048bf6c0 sk_msg_recvmsg+0x120/0x560 ERA: 9000000004162774 copy_page_to_iter+0x74/0x1c0 CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) PRMD: 0000000c (PPLV0 +PIE +PWE) EUEN: 00000007 (+FPE +SXE +ASXE -BTE) ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7) ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0) BADV: 0000000000000040 PRID: 0014c011 (Loongson-64bit, Loongson-3C5000) Modules linked in: bpf_testmod(OE) xt_CHECKSUM xt_MASQUERADE xt_conntrack Process test_progs (pid: 2824, threadinfo=0000000000863a31, task=...) Stack : ... Call Trace: [<9000000004162774>] copy_page_to_iter+0x74/0x1c0 [<90000000048bf6c0>] sk_msg_recvmsg+0x120/0x560 [<90000000049f2b90>] tcp_bpf_recvmsg_parser+0x170/0x4e0 [<90000000049aae34>] inet_recvmsg+0x54/0x100 [<900000000481ad5c>] sock_recvmsg+0x7c/0xe0 [<900000000481e1a8>] __sys_recvfrom+0x108/0x1c0 [<900000000481e27c>] sys_recvfrom+0x1c/0x40 [<9000000004c076ec>] do_syscall+0x8c/0xc0 [<9000000003731da4>] handle_syscall+0xc4/0x160 Code: ... ---[ end trace 0000000000000000 ]--- Kernel panic - not syncing: Fatal exception Kernel relocated by 0x3510000 .text @ 0x9000000003710000 .data @ 0x9000000004d70000 .bss @ 0x9000000006469400 ---[ end Kernel panic - not syncing: Fatal exception ]--- [...] This crash happens every time when running sockmap_skb_verdict_shutdown subtest in sockmap_basic. This crash is because a NULL pointer is passed to page_address() in the sk_msg_recvmsg(). Due to the different implementations depending on the architecture, page_address(NULL) will trigger a panic on Loongarch platform but not on x86 platform. So this bug was hidden on x86 platform for a while, but now it is exposed on Loongarch platform. The root cause is that a zero length skb (skb->len == 0) was put on the queue. This zero length skb is a TCP FIN packet, which was sent by shutdown(), invoked in test_sockmap_skb_verdict_shutdown(): shutdown(p1, SHUT_WR); In this case, in sk_psock_skb_ingress_enqueue(), num_sge is zero, and no page is put to this sge (see sg_set_page in sg_set_page), but this empty sge is queued into ingress_msg list. And in sk_msg_recvmsg(), this empty sge is used, and a NULL page is got by sg_page(sge). Pass this NULL page to copy_page_to_iter(), which passes it to kmap_local_page() and to page_address(), then kernel panics. To solve this, we should skip this zero length skb. So in sk_msg_recvmsg(), if copy is zero, that means it's a zero length skb, skip invoking copy_page_to_iter(). We are using the EFAULT return triggered by copy_page_to_iter to check for is_fin in tcp_bpf.c. Fixes: 604326b ("bpf, sockmap: convert to generic sk_msg interface") Suggested-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/e3a16eacdc6740658ee02a33489b1b9d4912f378.1719992715.git.tanggeliang@kylinos.cn Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit b180739b45a38b4caa88fe16bb5273072e6613dc)
matrix-wsk
pushed a commit
to matrix-wsk/kernel-6.6
that referenced
this pull request
Sep 4, 2024
[ Upstream commit f8bbc07 ] vhost_worker will call tun call backs to receive packets. If too many illegal packets arrives, tun_do_read will keep dumping packet contents. When console is enabled, it will costs much more cpu time to dump packet and soft lockup will be detected. net_ratelimit mechanism can be used to limit the dumping rate. PID: 33036 TASK: ffff949da6f20000 CPU: 23 COMMAND: "vhost-32980" #0 [fffffe00003fce50] crash_nmi_callback at ffffffff89249253 deepin-community#1 [fffffe00003fce58] nmi_handle at ffffffff89225fa3 deepin-community#2 [fffffe00003fceb0] default_do_nmi at ffffffff8922642e deepin-community#3 [fffffe00003fced0] do_nmi at ffffffff8922660d deepin-community#4 [fffffe00003fcef0] end_repeat_nmi at ffffffff89c01663 [exception RIP: io_serial_in+20] RIP: ffffffff89792594 RSP: ffffa655314979e8 RFLAGS: 00000002 RAX: ffffffff89792500 RBX: ffffffff8af428a0 RCX: 0000000000000000 RDX: 00000000000003fd RSI: 0000000000000005 RDI: ffffffff8af428a0 RBP: 0000000000002710 R8: 0000000000000004 R9: 000000000000000f R10: 0000000000000000 R11: ffffffff8acbf64f R12: 0000000000000020 R13: ffffffff8acbf698 R14: 0000000000000058 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 deepin-community#5 [ffffa655314979e8] io_serial_in at ffffffff89792594 deepin-community#6 [ffffa655314979e8] wait_for_xmitr at ffffffff89793470 deepin-community#7 [ffffa65531497a08] serial8250_console_putchar at ffffffff897934f6 deepin-community#8 [ffffa65531497a20] uart_console_write at ffffffff8978b605 deepin-community#9 [ffffa65531497a48] serial8250_console_write at ffffffff89796558 deepin-community#10 [ffffa65531497ac8] console_unlock at ffffffff89316124 deepin-community#11 [ffffa65531497b10] vprintk_emit at ffffffff89317c07 deepin-community#12 [ffffa65531497b68] printk at ffffffff89318306 deepin-community#13 [ffffa65531497bc8] print_hex_dump at ffffffff89650765 deepin-community#14 [ffffa65531497ca8] tun_do_read at ffffffffc0b06c27 [tun] deepin-community#15 [ffffa65531497d38] tun_recvmsg at ffffffffc0b06e34 [tun] deepin-community#16 [ffffa65531497d68] handle_rx at ffffffffc0c5d682 [vhost_net] deepin-community#17 [ffffa65531497ed0] vhost_worker at ffffffffc0c644dc [vhost] deepin-community#18 [ffffa65531497f10] kthread at ffffffff892d2e72 deepin-community#19 [ffffa65531497f50] ret_from_fork at ffffffff89c0022f Fixes: ef3db4a ("tun: avoid BUG, dump packet on GSO errors") Signed-off-by: Lei Chen <lei.chen@smartx.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20240415020247.2207781-1-lei.chen@smartx.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Avenger-285714
pushed a commit
to Avenger-285714/DeepinKernel
that referenced
this pull request
Sep 12, 2024
[ Upstream commit c145eea2f75ff7949392aebecf7ef0a81c1f6c14 ] mwifiex_get_priv_by_id() returns the priv pointer corresponding to the bss_num and bss_type, but without checking if the priv is actually currently in use. Unused priv pointers do not have a wiphy attached to them which can lead to NULL pointer dereferences further down the callstack. Fix this by returning only used priv pointers which have priv->bss_mode set to something else than NL80211_IFTYPE_UNSPECIFIED. Said NULL pointer dereference happened when an Accesspoint was started with wpa_supplicant -i mlan0 with this config: network={ ssid="somessid" mode=2 frequency=2412 key_mgmt=WPA-PSK WPA-PSK-SHA256 proto=RSN group=CCMP pairwise=CCMP psk="12345678" } When waiting for the AP to be established, interrupting wpa_supplicant with <ctrl-c> and starting it again this happens: | Unable to handle kernel NULL pointer dereference at virtual address 0000000000000140 | Mem abort info: | ESR = 0x0000000096000004 | EC = 0x25: DABT (current EL), IL = 32 bits | SET = 0, FnV = 0 | EA = 0, S1PTW = 0 | FSC = 0x04: level 0 translation fault | Data abort info: | ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 | CM = 0, WnR = 0, TnD = 0, TagAccess = 0 | GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 | user pgtable: 4k pages, 48-bit VAs, pgdp=0000000046d96000 | [0000000000000140] pgd=0000000000000000, p4d=0000000000000000 | Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP | Modules linked in: caam_jr caamhash_desc spidev caamalg_desc crypto_engine authenc libdes mwifiex_sdio +mwifiex crct10dif_ce cdc_acm onboard_usb_hub fsl_imx8_ddr_perf imx8m_ddrc rtc_ds1307 lm75 rtc_snvs +imx_sdma caam imx8mm_thermal spi_imx error imx_cpufreq_dt fuse ip_tables x_tables ipv6 | CPU: 0 PID: 8 Comm: kworker/0:1 Not tainted 6.9.0-00007-g937242013fce-dirty deepin-community#18 | Hardware name: somemachine (DT) | Workqueue: events sdio_irq_work | pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) | pc : mwifiex_get_cfp+0xd8/0x15c [mwifiex] | lr : mwifiex_get_cfp+0x34/0x15c [mwifiex] | sp : ffff8000818b3a70 | x29: ffff8000818b3a70 x28: ffff000006bfd8a5 x27: 0000000000000004 | x26: 000000000000002c x25: 0000000000001511 x24: 0000000002e86bc9 | x23: ffff000006bfd996 x22: 0000000000000004 x21: ffff000007bec000 | x20: 000000000000002c x19: 0000000000000000 x18: 0000000000000000 | x17: 000000040044ffff x16: 00500072b5503510 x15: ccc283740681e517 | x14: 0201000101006d15 x13: 0000000002e8ff43 x12: 002c01000000ffb1 | x11: 0100000000000000 x10: 02e8ff43002c0100 x9 : 0000ffb100100157 | x8 : ffff000003d20000 x7 : 00000000000002f1 x6 : 00000000ffffe124 | x5 : 0000000000000001 x4 : 0000000000000003 x3 : 0000000000000000 | x2 : 0000000000000000 x1 : 0001000000011001 x0 : 0000000000000000 | Call trace: | mwifiex_get_cfp+0xd8/0x15c [mwifiex] | mwifiex_parse_single_response_buf+0x1d0/0x504 [mwifiex] | mwifiex_handle_event_ext_scan_report+0x19c/0x2f8 [mwifiex] | mwifiex_process_sta_event+0x298/0xf0c [mwifiex] | mwifiex_process_event+0x110/0x238 [mwifiex] | mwifiex_main_process+0x428/0xa44 [mwifiex] | mwifiex_sdio_interrupt+0x64/0x12c [mwifiex_sdio] | process_sdio_pending_irqs+0x64/0x1b8 | sdio_irq_work+0x4c/0x7c | process_one_work+0x148/0x2a0 | worker_thread+0x2fc/0x40c | kthread+0x110/0x114 | ret_from_fork+0x10/0x20 | Code: a94153f3 a8c37bfd d50323bf d65f03c0 (f940a000) | ---[ end trace 0000000000000000 ]--- Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Acked-by: Brian Norris <briannorris@chromium.org> Reviewed-by: Francesco Dolcini <francesco.dolcini@toradex.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/20240703072409.556618-1-s.hauer@pengutronix.de Signed-off-by: Sasha Levin <sashal@kernel.org>
opsiff
pushed a commit
that referenced
this pull request
Sep 13, 2024
[ Upstream commit c145eea2f75ff7949392aebecf7ef0a81c1f6c14 ] mwifiex_get_priv_by_id() returns the priv pointer corresponding to the bss_num and bss_type, but without checking if the priv is actually currently in use. Unused priv pointers do not have a wiphy attached to them which can lead to NULL pointer dereferences further down the callstack. Fix this by returning only used priv pointers which have priv->bss_mode set to something else than NL80211_IFTYPE_UNSPECIFIED. Said NULL pointer dereference happened when an Accesspoint was started with wpa_supplicant -i mlan0 with this config: network={ ssid="somessid" mode=2 frequency=2412 key_mgmt=WPA-PSK WPA-PSK-SHA256 proto=RSN group=CCMP pairwise=CCMP psk="12345678" } When waiting for the AP to be established, interrupting wpa_supplicant with <ctrl-c> and starting it again this happens: | Unable to handle kernel NULL pointer dereference at virtual address 0000000000000140 | Mem abort info: | ESR = 0x0000000096000004 | EC = 0x25: DABT (current EL), IL = 32 bits | SET = 0, FnV = 0 | EA = 0, S1PTW = 0 | FSC = 0x04: level 0 translation fault | Data abort info: | ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 | CM = 0, WnR = 0, TnD = 0, TagAccess = 0 | GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 | user pgtable: 4k pages, 48-bit VAs, pgdp=0000000046d96000 | [0000000000000140] pgd=0000000000000000, p4d=0000000000000000 | Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP | Modules linked in: caam_jr caamhash_desc spidev caamalg_desc crypto_engine authenc libdes mwifiex_sdio +mwifiex crct10dif_ce cdc_acm onboard_usb_hub fsl_imx8_ddr_perf imx8m_ddrc rtc_ds1307 lm75 rtc_snvs +imx_sdma caam imx8mm_thermal spi_imx error imx_cpufreq_dt fuse ip_tables x_tables ipv6 | CPU: 0 PID: 8 Comm: kworker/0:1 Not tainted 6.9.0-00007-g937242013fce-dirty #18 | Hardware name: somemachine (DT) | Workqueue: events sdio_irq_work | pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) | pc : mwifiex_get_cfp+0xd8/0x15c [mwifiex] | lr : mwifiex_get_cfp+0x34/0x15c [mwifiex] | sp : ffff8000818b3a70 | x29: ffff8000818b3a70 x28: ffff000006bfd8a5 x27: 0000000000000004 | x26: 000000000000002c x25: 0000000000001511 x24: 0000000002e86bc9 | x23: ffff000006bfd996 x22: 0000000000000004 x21: ffff000007bec000 | x20: 000000000000002c x19: 0000000000000000 x18: 0000000000000000 | x17: 000000040044ffff x16: 00500072b5503510 x15: ccc283740681e517 | x14: 0201000101006d15 x13: 0000000002e8ff43 x12: 002c01000000ffb1 | x11: 0100000000000000 x10: 02e8ff43002c0100 x9 : 0000ffb100100157 | x8 : ffff000003d20000 x7 : 00000000000002f1 x6 : 00000000ffffe124 | x5 : 0000000000000001 x4 : 0000000000000003 x3 : 0000000000000000 | x2 : 0000000000000000 x1 : 0001000000011001 x0 : 0000000000000000 | Call trace: | mwifiex_get_cfp+0xd8/0x15c [mwifiex] | mwifiex_parse_single_response_buf+0x1d0/0x504 [mwifiex] | mwifiex_handle_event_ext_scan_report+0x19c/0x2f8 [mwifiex] | mwifiex_process_sta_event+0x298/0xf0c [mwifiex] | mwifiex_process_event+0x110/0x238 [mwifiex] | mwifiex_main_process+0x428/0xa44 [mwifiex] | mwifiex_sdio_interrupt+0x64/0x12c [mwifiex_sdio] | process_sdio_pending_irqs+0x64/0x1b8 | sdio_irq_work+0x4c/0x7c | process_one_work+0x148/0x2a0 | worker_thread+0x2fc/0x40c | kthread+0x110/0x114 | ret_from_fork+0x10/0x20 | Code: a94153f3 a8c37bfd d50323bf d65f03c0 (f940a000) | ---[ end trace 0000000000000000 ]--- Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Acked-by: Brian Norris <briannorris@chromium.org> Reviewed-by: Francesco Dolcini <francesco.dolcini@toradex.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/20240703072409.556618-1-s.hauer@pengutronix.de Signed-off-by: Sasha Levin <sashal@kernel.org>
opsiff
pushed a commit
that referenced
this pull request
Oct 17, 2024
…the Crashkernel Scenario [ Upstream commit 4058c39bd176daf11a826802d940d86292a6b02b ] The issue is that before entering the crash kernel, the DWC USB controller did not perform operations such as resetting the interrupt mask bits. After entering the crash kernel,before the USB interrupt handler registration was completed while loading the DWC USB driver,an GINTSTS_SOF interrupt was received.This triggered the misroute_irq process within the GIC handling framework,ultimately leading to the misrouting of the interrupt,causing it to be handled by the wrong interrupt handler and resulting in the issue. Summary:In a scenario where the kernel triggers a panic and enters the crash kernel,it is necessary to ensure that the interrupt mask bit is not enabled before the interrupt registration is complete. If an interrupt reaches the CPU at this moment,it will certainly not be handled correctly,especially in cases where this interrupt is reported frequently. Please refer to the Crashkernel dmesg information as follows (the message on line 3 was added before devm_request_irq is called by the dwc2_driver_probe function): [ 5.866837][ T1] dwc2 JMIC0010:01: supply vusb_d not found, using dummy regulator [ 5.874588][ T1] dwc2 JMIC0010:01: supply vusb_a not found, using dummy regulator [ 5.882335][ T1] dwc2 JMIC0010:01: before devm_request_irq irq: [71], gintmsk[0xf300080e], gintsts[0x04200009] [ 5.892686][ C0] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.0-jmnd1.2_RC #18 [ 5.900327][ C0] Hardware name: CMSS HyperCard4-25G/HyperCard4-25G, BIOS 1.6.4 Jul 8 2024 [ 5.908836][ C0] Call trace: [ 5.911965][ C0] dump_backtrace+0x0/0x1f0 [ 5.916308][ C0] show_stack+0x20/0x30 [ 5.920304][ C0] dump_stack+0xd8/0x140 [ 5.924387][ C0] pcie_xxx_handler+0x3c/0x1d8 [ 5.930121][ C0] __handle_irq_event_percpu+0x64/0x1e0 [ 5.935506][ C0] handle_irq_event+0x80/0x1d0 [ 5.940109][ C0] try_one_irq+0x138/0x174 [ 5.944365][ C0] misrouted_irq+0x134/0x140 [ 5.948795][ C0] note_interrupt+0x1d0/0x30c [ 5.953311][ C0] handle_irq_event+0x13c/0x1d0 [ 5.958001][ C0] handle_fasteoi_irq+0xd4/0x260 [ 5.962779][ C0] __handle_domain_irq+0x88/0xf0 [ 5.967555][ C0] gic_handle_irq+0x9c/0x2f0 [ 5.971985][ C0] el1_irq+0xb8/0x140 [ 5.975807][ C0] __setup_irq+0x3dc/0x7cc [ 5.980064][ C0] request_threaded_irq+0xf4/0x1b4 [ 5.985015][ C0] devm_request_threaded_irq+0x80/0x100 [ 5.990400][ C0] dwc2_driver_probe+0x1b8/0x6b0 [ 5.995178][ C0] platform_drv_probe+0x5c/0xb0 [ 5.999868][ C0] really_probe+0xf8/0x51c [ 6.004125][ C0] driver_probe_device+0xfc/0x170 [ 6.008989][ C0] device_driver_attach+0xc8/0xd0 [ 6.013853][ C0] __driver_attach+0xe8/0x1b0 [ 6.018369][ C0] bus_for_each_dev+0x7c/0xdc [ 6.022886][ C0] driver_attach+0x2c/0x3c [ 6.027143][ C0] bus_add_driver+0xdc/0x240 [ 6.031573][ C0] driver_register+0x80/0x13c [ 6.036090][ C0] __platform_driver_register+0x50/0x5c [ 6.041476][ C0] dwc2_platform_driver_init+0x24/0x30 [ 6.046774][ C0] do_one_initcall+0x50/0x25c [ 6.051291][ C0] do_initcall_level+0xe4/0xfc [ 6.055894][ C0] do_initcalls+0x80/0xa4 [ 6.060064][ C0] kernel_init_freeable+0x198/0x240 [ 6.065102][ C0] kernel_init+0x1c/0x12c Signed-off-by: Shawn Shao <shawn.shao@jaguarmicro.com> Link: https://lore.kernel.org/r/20240830031709.134-1-shawn.shao@jaguarmicro.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 0be52823e51c2eece65a341fba721196cf17e1c0)
Avenger-285714
pushed a commit
that referenced
this pull request
Oct 18, 2024
…the Crashkernel Scenario [ Upstream commit 4058c39bd176daf11a826802d940d86292a6b02b ] The issue is that before entering the crash kernel, the DWC USB controller did not perform operations such as resetting the interrupt mask bits. After entering the crash kernel,before the USB interrupt handler registration was completed while loading the DWC USB driver,an GINTSTS_SOF interrupt was received.This triggered the misroute_irq process within the GIC handling framework,ultimately leading to the misrouting of the interrupt,causing it to be handled by the wrong interrupt handler and resulting in the issue. Summary:In a scenario where the kernel triggers a panic and enters the crash kernel,it is necessary to ensure that the interrupt mask bit is not enabled before the interrupt registration is complete. If an interrupt reaches the CPU at this moment,it will certainly not be handled correctly,especially in cases where this interrupt is reported frequently. Please refer to the Crashkernel dmesg information as follows (the message on line 3 was added before devm_request_irq is called by the dwc2_driver_probe function): [ 5.866837][ T1] dwc2 JMIC0010:01: supply vusb_d not found, using dummy regulator [ 5.874588][ T1] dwc2 JMIC0010:01: supply vusb_a not found, using dummy regulator [ 5.882335][ T1] dwc2 JMIC0010:01: before devm_request_irq irq: [71], gintmsk[0xf300080e], gintsts[0x04200009] [ 5.892686][ C0] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.0-jmnd1.2_RC #18 [ 5.900327][ C0] Hardware name: CMSS HyperCard4-25G/HyperCard4-25G, BIOS 1.6.4 Jul 8 2024 [ 5.908836][ C0] Call trace: [ 5.911965][ C0] dump_backtrace+0x0/0x1f0 [ 5.916308][ C0] show_stack+0x20/0x30 [ 5.920304][ C0] dump_stack+0xd8/0x140 [ 5.924387][ C0] pcie_xxx_handler+0x3c/0x1d8 [ 5.930121][ C0] __handle_irq_event_percpu+0x64/0x1e0 [ 5.935506][ C0] handle_irq_event+0x80/0x1d0 [ 5.940109][ C0] try_one_irq+0x138/0x174 [ 5.944365][ C0] misrouted_irq+0x134/0x140 [ 5.948795][ C0] note_interrupt+0x1d0/0x30c [ 5.953311][ C0] handle_irq_event+0x13c/0x1d0 [ 5.958001][ C0] handle_fasteoi_irq+0xd4/0x260 [ 5.962779][ C0] __handle_domain_irq+0x88/0xf0 [ 5.967555][ C0] gic_handle_irq+0x9c/0x2f0 [ 5.971985][ C0] el1_irq+0xb8/0x140 [ 5.975807][ C0] __setup_irq+0x3dc/0x7cc [ 5.980064][ C0] request_threaded_irq+0xf4/0x1b4 [ 5.985015][ C0] devm_request_threaded_irq+0x80/0x100 [ 5.990400][ C0] dwc2_driver_probe+0x1b8/0x6b0 [ 5.995178][ C0] platform_drv_probe+0x5c/0xb0 [ 5.999868][ C0] really_probe+0xf8/0x51c [ 6.004125][ C0] driver_probe_device+0xfc/0x170 [ 6.008989][ C0] device_driver_attach+0xc8/0xd0 [ 6.013853][ C0] __driver_attach+0xe8/0x1b0 [ 6.018369][ C0] bus_for_each_dev+0x7c/0xdc [ 6.022886][ C0] driver_attach+0x2c/0x3c [ 6.027143][ C0] bus_add_driver+0xdc/0x240 [ 6.031573][ C0] driver_register+0x80/0x13c [ 6.036090][ C0] __platform_driver_register+0x50/0x5c [ 6.041476][ C0] dwc2_platform_driver_init+0x24/0x30 [ 6.046774][ C0] do_one_initcall+0x50/0x25c [ 6.051291][ C0] do_initcall_level+0xe4/0xfc [ 6.055894][ C0] do_initcalls+0x80/0xa4 [ 6.060064][ C0] kernel_init_freeable+0x198/0x240 [ 6.065102][ C0] kernel_init+0x1c/0x12c Signed-off-by: Shawn Shao <shawn.shao@jaguarmicro.com> Link: https://lore.kernel.org/r/20240830031709.134-1-shawn.shao@jaguarmicro.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit 0be52823e51c2eece65a341fba721196cf17e1c0)
Avenger-285714
pushed a commit
to Avenger-285714/DeepinKernel
that referenced
this pull request
Nov 1, 2024
[ Upstream commit 373b9338c9722a368925d83bc622c596896b328e ] Uprobe needs to fetch args into a percpu buffer, and then copy to ring buffer to avoid non-atomic context problem. Sometimes user-space strings, arrays can be very large, but the size of percpu buffer is only page size. And store_trace_args() won't check whether these data exceeds a single page or not, caused out-of-bounds memory access. It could be reproduced by following steps: 1. build kernel with CONFIG_KASAN enabled 2. save follow program as test.c ``` \#include <stdio.h> \#include <stdlib.h> \#include <string.h> // If string length large than MAX_STRING_SIZE, the fetch_store_strlen() // will return 0, cause __get_data_size() return shorter size, and // store_trace_args() will not trigger out-of-bounds access. // So make string length less than 4096. \#define STRLEN 4093 void generate_string(char *str, int n) { int i; for (i = 0; i < n; ++i) { char c = i % 26 + 'a'; str[i] = c; } str[n-1] = '\0'; } void print_string(char *str) { printf("%s\n", str); } int main() { char tmp[STRLEN]; generate_string(tmp, STRLEN); print_string(tmp); return 0; } ``` 3. compile program `gcc -o test test.c` 4. get the offset of `print_string()` ``` objdump -t test | grep -w print_string 0000000000401199 g F .text 000000000000001b print_string ``` 5. configure uprobe with offset 0x1199 ``` off=0x1199 cd /sys/kernel/debug/tracing/ echo "p /root/test:${off} arg1=+0(%di):ustring arg2=\$comm arg3=+0(%di):ustring" > uprobe_events echo 1 > events/uprobes/enable echo 1 > tracing_on ``` 6. run `test`, and kasan will report error. ================================================================== BUG: KASAN: use-after-free in strncpy_from_user+0x1d6/0x1f0 Write of size 8 at addr ffff88812311c004 by task test/499CPU: 0 UID: 0 PID: 499 Comm: test Not tainted 6.12.0-rc3+ deepin-community#18 Hardware name: Red Hat KVM, BIOS 1.16.0-4.al8 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x55/0x70 print_address_description.constprop.0+0x27/0x310 kasan_report+0x10f/0x120 ? strncpy_from_user+0x1d6/0x1f0 strncpy_from_user+0x1d6/0x1f0 ? rmqueue.constprop.0+0x70d/0x2ad0 process_fetch_insn+0xb26/0x1470 ? __pfx_process_fetch_insn+0x10/0x10 ? _raw_spin_lock+0x85/0xe0 ? __pfx__raw_spin_lock+0x10/0x10 ? __pte_offset_map+0x1f/0x2d0 ? unwind_next_frame+0xc5f/0x1f80 ? arch_stack_walk+0x68/0xf0 ? is_bpf_text_address+0x23/0x30 ? kernel_text_address.part.0+0xbb/0xd0 ? __kernel_text_address+0x66/0xb0 ? unwind_get_return_address+0x5e/0xa0 ? __pfx_stack_trace_consume_entry+0x10/0x10 ? arch_stack_walk+0xa2/0xf0 ? _raw_spin_lock_irqsave+0x8b/0xf0 ? __pfx__raw_spin_lock_irqsave+0x10/0x10 ? depot_alloc_stack+0x4c/0x1f0 ? _raw_spin_unlock_irqrestore+0xe/0x30 ? stack_depot_save_flags+0x35d/0x4f0 ? kasan_save_stack+0x34/0x50 ? kasan_save_stack+0x24/0x50 ? mutex_lock+0x91/0xe0 ? __pfx_mutex_lock+0x10/0x10 prepare_uprobe_buffer.part.0+0x2cd/0x500 uprobe_dispatcher+0x2c3/0x6a0 ? __pfx_uprobe_dispatcher+0x10/0x10 ? __kasan_slab_alloc+0x4d/0x90 handler_chain+0xdd/0x3e0 handle_swbp+0x26e/0x3d0 ? __pfx_handle_swbp+0x10/0x10 ? uprobe_pre_sstep_notifier+0x151/0x1b0 irqentry_exit_to_user_mode+0xe2/0x1b0 asm_exc_int3+0x39/0x40 RIP: 0033:0x401199 Code: 01 c2 0f b6 45 fb 88 02 83 45 fc 01 8b 45 fc 3b 45 e4 7c b7 8b 45 e4 48 98 48 8d 50 ff 48 8b 45 e8 48 01 d0 ce RSP: 002b:00007ffdf00576a8 EFLAGS: 00000206 RAX: 00007ffdf00576b0 RBX: 0000000000000000 RCX: 0000000000000ff2 RDX: 0000000000000ffc RSI: 0000000000000ffd RDI: 00007ffdf00576b0 RBP: 00007ffdf00586b0 R08: 00007feb2f9c0d20 R09: 00007feb2f9c0d20 R10: 0000000000000001 R11: 0000000000000202 R12: 0000000000401040 R13: 00007ffdf0058780 R14: 0000000000000000 R15: 0000000000000000 </TASK> This commit enforces the buffer's maxlen less than a page-size to avoid store_trace_args() out-of-memory access. Link: https://lore.kernel.org/all/20241015060148.1108331-1-mqaio@linux.alibaba.com/ Fixes: dcad1a2 ("tracing/uprobes: Fetch args before reserving a ring buffer") Signed-off-by: Qiao Ma <mqaio@linux.alibaba.com> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
opsiff
pushed a commit
that referenced
this pull request
Nov 1, 2024
[ Upstream commit 373b9338c9722a368925d83bc622c596896b328e ] Uprobe needs to fetch args into a percpu buffer, and then copy to ring buffer to avoid non-atomic context problem. Sometimes user-space strings, arrays can be very large, but the size of percpu buffer is only page size. And store_trace_args() won't check whether these data exceeds a single page or not, caused out-of-bounds memory access. It could be reproduced by following steps: 1. build kernel with CONFIG_KASAN enabled 2. save follow program as test.c ``` \#include <stdio.h> \#include <stdlib.h> \#include <string.h> // If string length large than MAX_STRING_SIZE, the fetch_store_strlen() // will return 0, cause __get_data_size() return shorter size, and // store_trace_args() will not trigger out-of-bounds access. // So make string length less than 4096. \#define STRLEN 4093 void generate_string(char *str, int n) { int i; for (i = 0; i < n; ++i) { char c = i % 26 + 'a'; str[i] = c; } str[n-1] = '\0'; } void print_string(char *str) { printf("%s\n", str); } int main() { char tmp[STRLEN]; generate_string(tmp, STRLEN); print_string(tmp); return 0; } ``` 3. compile program `gcc -o test test.c` 4. get the offset of `print_string()` ``` objdump -t test | grep -w print_string 0000000000401199 g F .text 000000000000001b print_string ``` 5. configure uprobe with offset 0x1199 ``` off=0x1199 cd /sys/kernel/debug/tracing/ echo "p /root/test:${off} arg1=+0(%di):ustring arg2=\$comm arg3=+0(%di):ustring" > uprobe_events echo 1 > events/uprobes/enable echo 1 > tracing_on ``` 6. run `test`, and kasan will report error. ================================================================== BUG: KASAN: use-after-free in strncpy_from_user+0x1d6/0x1f0 Write of size 8 at addr ffff88812311c004 by task test/499CPU: 0 UID: 0 PID: 499 Comm: test Not tainted 6.12.0-rc3+ #18 Hardware name: Red Hat KVM, BIOS 1.16.0-4.al8 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x55/0x70 print_address_description.constprop.0+0x27/0x310 kasan_report+0x10f/0x120 ? strncpy_from_user+0x1d6/0x1f0 strncpy_from_user+0x1d6/0x1f0 ? rmqueue.constprop.0+0x70d/0x2ad0 process_fetch_insn+0xb26/0x1470 ? __pfx_process_fetch_insn+0x10/0x10 ? _raw_spin_lock+0x85/0xe0 ? __pfx__raw_spin_lock+0x10/0x10 ? __pte_offset_map+0x1f/0x2d0 ? unwind_next_frame+0xc5f/0x1f80 ? arch_stack_walk+0x68/0xf0 ? is_bpf_text_address+0x23/0x30 ? kernel_text_address.part.0+0xbb/0xd0 ? __kernel_text_address+0x66/0xb0 ? unwind_get_return_address+0x5e/0xa0 ? __pfx_stack_trace_consume_entry+0x10/0x10 ? arch_stack_walk+0xa2/0xf0 ? _raw_spin_lock_irqsave+0x8b/0xf0 ? __pfx__raw_spin_lock_irqsave+0x10/0x10 ? depot_alloc_stack+0x4c/0x1f0 ? _raw_spin_unlock_irqrestore+0xe/0x30 ? stack_depot_save_flags+0x35d/0x4f0 ? kasan_save_stack+0x34/0x50 ? kasan_save_stack+0x24/0x50 ? mutex_lock+0x91/0xe0 ? __pfx_mutex_lock+0x10/0x10 prepare_uprobe_buffer.part.0+0x2cd/0x500 uprobe_dispatcher+0x2c3/0x6a0 ? __pfx_uprobe_dispatcher+0x10/0x10 ? __kasan_slab_alloc+0x4d/0x90 handler_chain+0xdd/0x3e0 handle_swbp+0x26e/0x3d0 ? __pfx_handle_swbp+0x10/0x10 ? uprobe_pre_sstep_notifier+0x151/0x1b0 irqentry_exit_to_user_mode+0xe2/0x1b0 asm_exc_int3+0x39/0x40 RIP: 0033:0x401199 Code: 01 c2 0f b6 45 fb 88 02 83 45 fc 01 8b 45 fc 3b 45 e4 7c b7 8b 45 e4 48 98 48 8d 50 ff 48 8b 45 e8 48 01 d0 ce RSP: 002b:00007ffdf00576a8 EFLAGS: 00000206 RAX: 00007ffdf00576b0 RBX: 0000000000000000 RCX: 0000000000000ff2 RDX: 0000000000000ffc RSI: 0000000000000ffd RDI: 00007ffdf00576b0 RBP: 00007ffdf00586b0 R08: 00007feb2f9c0d20 R09: 00007feb2f9c0d20 R10: 0000000000000001 R11: 0000000000000202 R12: 0000000000401040 R13: 00007ffdf0058780 R14: 0000000000000000 R15: 0000000000000000 </TASK> This commit enforces the buffer's maxlen less than a page-size to avoid store_trace_args() out-of-memory access. Link: https://lore.kernel.org/all/20241015060148.1108331-1-mqaio@linux.alibaba.com/ Fixes: dcad1a2 ("tracing/uprobes: Fetch args before reserving a ring buffer") Signed-off-by: Qiao Ma <mqaio@linux.alibaba.com> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.