Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update SCU88 register and SCU90 init values #69

Closed
wants to merge 1 commit into from
Closed

Update SCU88 register and SCU90 init values #69

wants to merge 1 commit into from

Conversation

agangidi53
Copy link

Before This change:
a) SCU90[0]=1, function pin Incorrectly defined. It must be pull down internally. Because of this networking has an intermittent failure
b) For SCU88, bits 7:0 were set to 1. That is: We were reading : PWMx or VPIGx instead of GPIONx (GPIONx gives us the PCIe inventory status, where x is bit number). Because of this PCIe inventory was showing up wrong in Barreleye server.

After This Change:
a) SCU90[0]=0 corrected
b) Bits (7:0) of SCU 88 are set to 0 . (According to Page 111 of data sheet these have to be set to 0 for us to to read GPION0 to GPIO N7 which indicate if PCIe device is present ). This is specific to Barreleye. Since SCU88 seems to be specific to how board is wired, I added a step to reinitialize this register in device specific setup.

Description of pins 0 of SCU 90:
Enable SD1 Function Pin

Description of pins 7:0 of SCU 88:
7 RW Enable PWM7 or VPIG7 function pin (SCU90[5:4]=0x2 select Video pin)
6 RW Enable PWM6 or VPIG6 function pin (SCU90[5:4]=0x2 select Video pin)
5 RW Enable PWM5 or VPIG5 function pin (SCU90[5:4]!=0 select Video pin)
4 RW Enable PWM4 or VPIG4 function pin (SCU90[5:4]!=0 select Video pin)
3 RW Enable PWM3 or VPIG3 function pin (SCU90[5:4]!=0 select Video pin)
2 RW Enable PWM2 or VPIG2 function pin (SCU90[5:4]!=0 select Video pin)
1 RW Enable PWM1 or VPIG1 function pin (SCU90[5:4]=0x3 select Video pin)
0 RW Enable PWM0 or VPIG0 function pin (SCU90[5:4]=0x3 select Video pin)

Signed-off-by: Adi Gangidi adi.gangidi@rackspace.com


This change is Reviewable

Before This change:
a) SCU90[0]=1, function pin Incorrectly defined. It must be pull down internally. Because of this networking has an intermittent failure
b) For SCU88, bits 7:0 were set to 1. That is: We were reading : PWMx or VPIGx instead of GPIONx (GPIONx gives us the PCIe inventory status, where x is bit number). Because of this PCIe inventory was showing up wrong in Barreleye server.

After This Change:
a) SCU90[0]=0 corrected
b) Bits (7:0) of SCU 88 are set to 0 . (According to Page 111 of data sheet these have to be set to 0 for us to to read GPION0 to GPIO N7 which indicate if PCIe device is present ). This is specific to Barreleye. Since SCU88 seems to be specific to how board is wired, I added a step to reinitialize this register in device specific setup.

Description of pins 0 of SCU 90:
Enable SD1 Function Pin

Description of pins 7:0 of SCU 88:
7 RW Enable PWM7 or VPIG7 function pin (SCU90[5:4]=0x2 select Video pin)
6 RW Enable PWM6 or VPIG6 function pin (SCU90[5:4]=0x2 select Video pin)
5 RW Enable PWM5 or VPIG5 function pin (SCU90[5:4]!=0 select Video pin)
4 RW Enable PWM4 or VPIG4 function pin (SCU90[5:4]!=0 select Video pin)
3 RW Enable PWM3 or VPIG3 function pin (SCU90[5:4]!=0 select Video pin)
2 RW Enable PWM2 or VPIG2 function pin (SCU90[5:4]!=0 select Video pin)
1 RW Enable PWM1 or VPIG1 function pin (SCU90[5:4]=0x3 select Video pin)
0 RW Enable PWM0 or VPIG0 function pin (SCU90[5:4]=0x3 select Video pin)

Signed-off-by: Adi Gangidi adi.gangidi@rackspace.com
@shenki
Copy link
Member

shenki commented Mar 29, 2016

Please update #63 with these changes.

@shenki shenki closed this Mar 29, 2016
amboar pushed a commit to amboar/linux that referenced this pull request Sep 28, 2016
The caller expects %rdi to remain intact, push+pop it make that happen.

Fixes the following kind of explosions on my core2duo machine when
trying to reboot or shut down:

  general protection fault: 0000 [openbmc#1] PREEMPT SMP
  Modules linked in: i915 i2c_algo_bit drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea drm netconsole configfs binfmt_misc iTCO_wdt psmouse pcspkr snd_hda_codec_idt e100 coretemp hwmon snd_hda_codec_generic i2c_i801 mii i2c_smbus lpc_ich mfd_core snd_hda_intel uhci_hcd snd_hda_codec snd_hwdep snd_hda_core ehci_pci 8250 ehci_hcd snd_pcm 8250_base usbcore evdev serial_core usb_common parport_pc parport snd_timer snd soundcore
  CPU: 0 PID: 3070 Comm: reboot Not tainted 4.8.0-rc1-perf-dirty openbmc#69
  Hardware name:                  /D946GZIS, BIOS TS94610J.86A.0087.2007.1107.1049 11/07/2007
  task: ffff88012a0b4080 task.stack: ffff880123850000
  RIP: 0010:[<ffffffff81003c92>]  [<ffffffff81003c92>] x86_perf_event_update+0x52/0xc0
  RSP: 0018:ffff880123853b60  EFLAGS: 00010087
  RAX: 0000000000000001 RBX: ffff88012fc0a3c0 RCX: 000000000000001e
  RDX: 0000000000000000 RSI: 0000000040000000 RDI: ffff88012b014800
  RBP: ffff880123853b88 R08: ffffffffffffffff R09: 0000000000000000
  R10: ffffea0004a012c0 R11: ffffea0004acedc0 R12: ffffffff80000001
  R13: ffff88012b0149c0 R14: ffff88012b014800 R15: 0000000000000018
  FS:  00007f8b155cd700(0000) GS:ffff88012fc00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f8b155f5000 CR3: 000000012a2d7000 CR4: 00000000000006f0
  Stack:
   ffff88012fc0a3c0 ffff88012b014800 0000000000000004 0000000000000001
   ffff88012fc1b750 ffff880123853bb0 ffffffff81003d59 ffff88012b014800
   ffff88012fc0a3c0 ffff88012b014800 ffff880123853bd8 ffffffff81003e13
  Call Trace:
   [<ffffffff81003d59>] x86_pmu_stop+0x59/0xd0
   [<ffffffff81003e13>] x86_pmu_del+0x43/0x140
   [<ffffffff8111705d>] event_sched_out.isra.105+0xbd/0x260
   [<ffffffff8111738d>] __perf_remove_from_context+0x2d/0xb0
   [<ffffffff8111745d>] __perf_event_exit_context+0x4d/0x70
   [<ffffffff810c8826>] generic_exec_single+0xb6/0x140
   [<ffffffff81117410>] ? __perf_remove_from_context+0xb0/0xb0
   [<ffffffff81117410>] ? __perf_remove_from_context+0xb0/0xb0
   [<ffffffff810c898f>] smp_call_function_single+0xdf/0x140
   [<ffffffff81113d27>] perf_event_exit_cpu_context+0x87/0xc0
   [<ffffffff81113d73>] perf_reboot+0x13/0x40
   [<ffffffff8107578a>] notifier_call_chain+0x4a/0x70
   [<ffffffff81075ad7>] __blocking_notifier_call_chain+0x47/0x60
   [<ffffffff81075b06>] blocking_notifier_call_chain+0x16/0x20
   [<ffffffff81076a1d>] kernel_restart_prepare+0x1d/0x40
   [<ffffffff81076ae2>] kernel_restart+0x12/0x60
   [<ffffffff81076d56>] SYSC_reboot+0xf6/0x1b0
   [<ffffffff811a823c>] ? mntput_no_expire+0x2c/0x1b0
   [<ffffffff811a83e4>] ? mntput+0x24/0x40
   [<ffffffff811894fc>] ? __fput+0x16c/0x1e0
   [<ffffffff811895ae>] ? ____fput+0xe/0x10
   [<ffffffff81072fc3>] ? task_work_run+0x83/0xa0
   [<ffffffff81001623>] ? exit_to_usermode_loop+0x53/0xc0
   [<ffffffff8100105a>] ? trace_hardirqs_on_thunk+0x1a/0x1c
   [<ffffffff81076e6e>] SyS_reboot+0xe/0x10
   [<ffffffff814c4ba5>] entry_SYSCALL_64_fastpath+0x18/0xa3
  Code: 7c 4c 8d af c0 01 00 00 49 89 fe eb 10 48 09 c2 4c 89 e0 49 0f b1 55 00 4c 39 e0 74 35 4d 8b a6 c0 01 00 00 41 8b 8e 60 01 00 00 <0f> 33 8b 35 6e 02 8c 00 48 c1 e2 20 85 f6 7e d2 48 89 d3 89 cf
  RIP  [<ffffffff81003c92>] x86_perf_event_update+0x52/0xc0
   RSP <ffff880123853b60>
  ---[ end trace 7ec95181faf211be ]---
  note: reboot[3070] exited with preempt_count 2

Cc: Borislav Petkov <bp@suse.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Fixes: f596710 ("x86/hweight: Get rid of the special calling convention")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
amboar pushed a commit to amboar/linux that referenced this pull request Nov 16, 2016
Commit 63f53de ("mm: warn about allocations which stall for too
long") by error embedded "\n" in the format string, resulting in strange
output.

  [  722.876655] kworker/0:1: page alloction stalls for 160001ms, order:0
  [  722.876656] , mode:0x2400000(GFP_NOIO)
  [  722.876657] CPU: 0 PID: 6966 Comm: kworker/0:1 Not tainted 4.8.0+ openbmc#69

Link: http://lkml.kernel.org/r/1476026219-7974-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
shenki pushed a commit that referenced this pull request Aug 22, 2018
commit 9ea0a46 upstream.

mntput_no_expire() does the calculation of total refcount under mount_lock;
unfortunately, the decrement (as well as all increments) are done outside
of it, leading to false positives in the "are we dropping the last reference"
test.  Consider the following situation:
	* mnt is a lazy-umounted mount, kept alive by two opened files.  One
of those files gets closed.  Total refcount of mnt is 2.  On CPU 42
mntput(mnt) (called from __fput()) drops one reference, decrementing component
	* After it has looked at component #0, the process on CPU 0 does
mntget(), incrementing component #0, gets preempted and gets to run again -
on CPU 69.  There it does mntput(), which drops the reference (component #69)
and proceeds to spin on mount_lock.
	* On CPU 42 our first mntput() finishes counting.  It observes the
decrement of component #69, but not the increment of component #0.  As the
result, the total it gets is not 1 as it should've been - it's 0.  At which
point we decide that vfsmount needs to be killed and proceed to free it and
shut the filesystem down.  However, there's still another opened file
on that filesystem, with reference to (now freed) vfsmount, etc. and we are
screwed.

It's not a wide race, but it can be reproduced with artificial slowdown of
the mnt_get_count() loop, and it should be easier to hit on SMP KVM setups.

Fix consists of moving the refcount decrement under mount_lock; the tricky
part is that we want (and can) keep the fast case (i.e. mount that still
has non-NULL ->mnt_ns) entirely out of mount_lock.  All places that zero
mnt->mnt_ns are dropping some reference to mnt and they call synchronize_rcu()
before that mntput().  IOW, if mntput() observes (under rcu_read_lock())
a non-NULL ->mnt_ns, it is guaranteed that there is another reference yet to
be dropped.

Reported-by: Jann Horn <jannh@google.com>
Tested-by: Jann Horn <jannh@google.com>
Fixes: 48a066e ("RCU'd vsfmounts")
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
shenki pushed a commit that referenced this pull request Aug 9, 2019
commit f474c28 upstream.

powerpc hardware triggers watchpoint before executing the instruction.
To make trigger-after-execute behavior, kernel emulates the
instruction. If the instruction is 'load something into non-volatile
register', exception handler should restore emulated register state
while returning back, otherwise there will be register state
corruption. eg, adding a watchpoint on a list can corrput the list:

  # cat /proc/kallsyms | grep kthread_create_list
  c00000000121c8b8 d kthread_create_list

Add watchpoint on kthread_create_list->prev:

  # perf record -e mem:0xc00000000121c8c0

Run some workload such that new kthread gets invoked. eg, I just
logged out from console:

  list_add corruption. next->prev should be prev (c000000001214e00), \
	but was c00000000121c8b8. (next=c00000000121c8b8).
  WARNING: CPU: 59 PID: 309 at lib/list_debug.c:25 __list_add_valid+0xb4/0xc0
  CPU: 59 PID: 309 Comm: kworker/59:0 Kdump: loaded Not tainted 5.1.0-rc7+ #69
  ...
  NIP __list_add_valid+0xb4/0xc0
  LR __list_add_valid+0xb0/0xc0
  Call Trace:
  __list_add_valid+0xb0/0xc0 (unreliable)
  __kthread_create_on_node+0xe0/0x260
  kthread_create_on_node+0x34/0x50
  create_worker+0xe8/0x260
  worker_thread+0x444/0x560
  kthread+0x160/0x1a0
  ret_from_kernel_thread+0x5c/0x70

List corruption happened because it uses 'load into non-volatile
register' instruction:

Snippet from __kthread_create_on_node:

  c000000000136be8:     addis   r29,r2,-19
  c000000000136bec:     ld      r29,31424(r29)
        if (!__list_add_valid(new, prev, next))
  c000000000136bf0:     mr      r3,r30
  c000000000136bf4:     mr      r5,r28
  c000000000136bf8:     mr      r4,r29
  c000000000136bfc:     bl      c00000000059a2f8 <__list_add_valid+0x8>

Register state from WARN_ON():

  GPR00: c00000000059a3a0 c000007ff23afb50 c000000001344e00 0000000000000075
  GPR04: 0000000000000000 0000000000000000 0000001852af8bc1 0000000000000000
  GPR08: 0000000000000001 0000000000000007 0000000000000006 00000000000004aa
  GPR12: 0000000000000000 c000007ffffeb080 c000000000137038 c000005ff62aaa00
  GPR16: 0000000000000000 0000000000000000 c000007fffbe7600 c000007fffbe7370
  GPR20: c000007fffbe7320 c000007fffbe7300 c000000001373a00 0000000000000000
  GPR24: fffffffffffffef7 c00000000012e320 c000007ff23afcb0 c000000000cb8628
  GPR28: c00000000121c8b8 c000000001214e00 c000007fef5b17e8 c000007fef5b17c0

Watchpoint hit at 0xc000000000136bec.

  addis   r29,r2,-19
   => r29 = 0xc000000001344e00 + (-19 << 16)
   => r29 = 0xc000000001214e00

  ld      r29,31424(r29)
   => r29 = *(0xc000000001214e00 + 31424)
   => r29 = *(0xc00000000121c8c0)

0xc00000000121c8c0 is where we placed a watchpoint and thus this
instruction was emulated by emulate_step. But because handle_dabr_fault
did not restore emulated register state, r29 still contains stale
value in above register state.

Fixes: 5aae8a5 ("powerpc, hw_breakpoints: Implement hw_breakpoints for 64-bit server processors")
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: stable@vger.kernel.org # 2.6.36+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
shenki pushed a commit that referenced this pull request Aug 9, 2019
commit f16d80b upstream.

On systems like P9 powernv where we have no TM (or P8 booted with
ppc_tm=off), userspace can construct a signal context which still has
the MSR TS bits set. The kernel tries to restore this context which
results in the following crash:

  Unexpected TM Bad Thing exception at c0000000000022fc (msr 0x8000000102a03031) tm_scratch=800000020280f033
  Oops: Unrecoverable exception, sig: 6 [#1]
  LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in:
  CPU: 0 PID: 1636 Comm: sigfuz Not tainted 5.2.0-11043-g0a8ad0ffa4 #69
  NIP:  c0000000000022fc LR: 00007fffb2d67e48 CTR: 0000000000000000
  REGS: c00000003fffbd70 TRAP: 0700   Not tainted  (5.2.0-11045-g7142b497d8)
  MSR:  8000000102a03031 <SF,VEC,VSX,FP,ME,IR,DR,LE,TM[E]>  CR: 42004242  XER: 00000000
  CFAR: c0000000000022e0 IRQMASK: 0
  GPR00: 0000000000000072 00007fffb2b6e560 00007fffb2d87f00 0000000000000669
  GPR04: 00007fffb2b6e728 0000000000000000 0000000000000000 00007fffb2b6f2a8
  GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR12: 0000000000000000 00007fffb2b76900 0000000000000000 0000000000000000
  GPR16: 00007fffb2370000 00007fffb2d84390 00007fffea3a15ac 000001000a250420
  GPR20: 00007fffb2b6f260 0000000010001770 0000000000000000 0000000000000000
  GPR24: 00007fffb2d843a0 00007fffea3a14a0 0000000000010000 0000000000800000
  GPR28: 00007fffea3a14d8 00000000003d0f00 0000000000000000 00007fffb2b6e728
  NIP [c0000000000022fc] rfi_flush_fallback+0x7c/0x80
  LR [00007fffb2d67e48] 0x7fffb2d67e48
  Call Trace:
  Instruction dump:
  e96a0220 e96a02a8 e96a0330 e96a03b8 394a0400 4200ffdc 7d2903a6 e92d0c00
  e94d0c08 e96d0c10 e82d0c18 7db242a6 <4c000024> 7db243a6 7db142a6 f82d0c18

The problem is the signal code assumes TM is enabled when
CONFIG_PPC_TRANSACTIONAL_MEM is enabled. This may not be the case as
with P9 powernv or if `ppc_tm=off` is used on P8.

This means any local user can crash the system.

Fix the problem by returning a bad stack frame to the user if they try
to set the MSR TS bits with sigreturn() on systems where TM is not
supported.

Found with sigfuz kernel selftest on P9.

This fixes CVE-2019-13648.

Fixes: 2b0a576 ("powerpc: Add new transactional memory state to the signal context")
Cc: stable@vger.kernel.org # v3.9
Reported-by: Praveen Pandey <Praveen.Pandey@in.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190719050502.405-1-mikey@neuling.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
shenki pushed a commit that referenced this pull request Oct 30, 2020
[ Upstream commit bad60b8 ]

The idx in __ath10k_htt_rx_ring_fill_n function lives in
consistent dma region writable by the device. Malfunctional
or malicious device could manipulate such idx to have a OOB
write. Either by
    htt->rx_ring.netbufs_ring[idx] = skb;
or by
    ath10k_htt_set_paddrs_ring(htt, paddr, idx);

The idx can also be negative as it's signed, giving a large
memory space to write to.

It's possibly exploitable by corruptting a legit pointer with
a skb pointer. And then fill skb with payload as rougue object.

Part of the log here. Sometimes it appears as UAF when writing
to a freed memory by chance.

 [   15.594376] BUG: unable to handle page fault for address: ffff887f5c1804f0
 [   15.595483] #PF: supervisor write access in kernel mode
 [   15.596250] #PF: error_code(0x0002) - not-present page
 [   15.597013] PGD 0 P4D 0
 [   15.597395] Oops: 0002 [#1] SMP KASAN PTI
 [   15.597967] CPU: 0 PID: 82 Comm: kworker/u2:2 Not tainted 5.6.0 #69
 [   15.598843] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
 BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
 [   15.600438] Workqueue: ath10k_wq ath10k_core_register_work [ath10k_core]
 [   15.601389] RIP: 0010:__ath10k_htt_rx_ring_fill_n
 (linux/drivers/net/wireless/ath/ath10k/htt_rx.c:173) ath10k_core

Signed-off-by: Zekun Shen <bruceshenzk@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200623221105.3486-1-bruceshenzk@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
amboar pushed a commit to amboar/linux that referenced this pull request Nov 26, 2020
[ Upstream commit 0ca9454 ]

In list_add, the first variable is the new node and the second
is the list head. The function is called with a wrong order causing
NULL dereference:

[   15.527030] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
[   15.542317] Mem abort info:
[   15.545152]   ESR = 0x96000044
[   15.548248]   EC = 0x25: DABT (current EL), IL = 32 bits
[   15.553624]   SET = 0, FnV = 0
[   15.556715]   EA = 0, S1PTW = 0
[   15.559892] Data abort info:
[   15.562799]   ISV = 0, ISS = 0x00000044
[   15.566678]   CM = 0, WnR = 1
[   15.569683] user pgtable: 4k pages, 48-bit VAs, pgdp=00000001373f0000
[   15.576196] [0000000000000008] pgd=0000000000000000, p4d=0000000000000000
[   15.583101] Internal error: Oops: 96000044 [openbmc#1] PREEMPT SMP
[   15.588747] Modules linked in: mtk_mdp(+) cfg80211 v4l2_mem2mem videobuf2_vmalloc videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 videobuf2_common vide
odev mt8173_rt5650 smsc95xx usbnet ecdh_generic ecc snd_soc_rt5645 mc mt8173_afe_pcm rfkill cros_ec_sensors snd_soc_mtk_common elan_i2c crct10dif_ce cros_ec_se
nsors_core snd_soc_rl6231 elants_i2c industrialio_triggered_buffer kfifo_buf mtk_vpu cros_ec_chardev cros_usbpd_charger cros_usbpd_logger sbs_battery display_c
onnector pwm_bl ip_tables x_tables ipv6
[   15.634295] CPU: 0 PID: 188 Comm: systemd-udevd Not tainted 5.9.0-rc2+ openbmc#69
[   15.641242] Hardware name: Google Elm (DT)
[   15.645381] pstate: 20000005 (nzCv daif -PAN -UAO BTYPE=--)
[   15.651022] pc : mtk_mdp_probe+0x134/0x3a8 [mtk_mdp]
[   15.656041] lr : mtk_mdp_probe+0x128/0x3a8 [mtk_mdp]
[   15.661055] sp : ffff80001255b910
[   15.669548] x29: ffff80001255b910 x28: 0000000000000000
[   15.679973] x27: ffff800009089bf8 x26: ffff0000fafde800
[   15.690347] x25: ffff0000ff7d2768 x24: ffff800009089010
[   15.700670] x23: ffff0000f01a7cd8 x22: ffff0000fafde810
[   15.710940] x21: ffff0000f01a7c80 x20: ffff0000f0c3c180
[   15.721148] x19: ffff0000ff7f1618 x18: 0000000000000010
[   15.731289] x17: 0000000000000000 x16: 0000000000000000
[   15.741375] x15: 0000000000aaaaaa x14: 0000000000000020
[   15.751399] x13: 00000000ffffffff x12: 0000000000000020
[   15.761363] x11: 0000000000000028 x10: 0101010101010101
[   15.771279] x9 : 0000000000000004 x8 : 7f7f7f7f7f7f7f7f
[   15.781148] x7 : 646bff6171606b2b x6 : 0000000000806d65
[   15.790981] x5 : ffff0000ff7f8360 x4 : 0000000000000000
[   15.800767] x3 : 0000000000000004 x2 : 0000000000000001
[   15.810501] x1 : 0000000000000005 x0 : 0000000000000000
[   15.820171] Call trace:
[   15.826944]  mtk_mdp_probe+0x134/0x3a8 [mtk_mdp]
[   15.835908]  platform_drv_probe+0x54/0xa8
[   15.844247]  really_probe+0xe4/0x3b0
[   15.852104]  driver_probe_device+0x58/0xb8
[   15.860457]  device_driver_attach+0x74/0x80
[   15.868854]  __driver_attach+0x58/0xe0
[   15.876770]  bus_for_each_dev+0x70/0xc0
[   15.884726]  driver_attach+0x24/0x30
[   15.892374]  bus_add_driver+0x14c/0x1f0
[   15.900295]  driver_register+0x64/0x120
[   15.908168]  __platform_driver_register+0x48/0x58
[   15.916864]  mtk_mdp_driver_init+0x20/0x1000 [mtk_mdp]
[   15.925943]  do_one_initcall+0x54/0x1b4
[   15.933662]  do_init_module+0x54/0x200
[   15.941246]  load_module+0x1cf8/0x22d0
[   15.948798]  __do_sys_finit_module+0xd8/0xf0
[   15.956829]  __arm64_sys_finit_module+0x20/0x30
[   15.965082]  el0_svc_common.constprop.0+0x6c/0x168
[   15.973527]  do_el0_svc+0x24/0x90
[   15.980403]  el0_sync_handler+0x90/0x198
[   15.987867]  el0_sync+0x158/0x180
[   15.994653] Code: 9400014b 2a0003fc 35000920 f9400280 (f9000417)
[   16.004299] ---[ end trace 76fee0203f9898e5 ]---

Fixes: 86698b9 ("media: mtk-mdp: convert mtk_mdp_dev.comp array to list")
Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Tested-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
amboar pushed a commit to amboar/linux that referenced this pull request Nov 26, 2020
[ Upstream commit bad60b8 ]

The idx in __ath10k_htt_rx_ring_fill_n function lives in
consistent dma region writable by the device. Malfunctional
or malicious device could manipulate such idx to have a OOB
write. Either by
    htt->rx_ring.netbufs_ring[idx] = skb;
or by
    ath10k_htt_set_paddrs_ring(htt, paddr, idx);

The idx can also be negative as it's signed, giving a large
memory space to write to.

It's possibly exploitable by corruptting a legit pointer with
a skb pointer. And then fill skb with payload as rougue object.

Part of the log here. Sometimes it appears as UAF when writing
to a freed memory by chance.

 [   15.594376] BUG: unable to handle page fault for address: ffff887f5c1804f0
 [   15.595483] #PF: supervisor write access in kernel mode
 [   15.596250] #PF: error_code(0x0002) - not-present page
 [   15.597013] PGD 0 P4D 0
 [   15.597395] Oops: 0002 [openbmc#1] SMP KASAN PTI
 [   15.597967] CPU: 0 PID: 82 Comm: kworker/u2:2 Not tainted 5.6.0 openbmc#69
 [   15.598843] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
 BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
 [   15.600438] Workqueue: ath10k_wq ath10k_core_register_work [ath10k_core]
 [   15.601389] RIP: 0010:__ath10k_htt_rx_ring_fill_n
 (linux/drivers/net/wireless/ath/ath10k/htt_rx.c:173) ath10k_core

Signed-off-by: Zekun Shen <bruceshenzk@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200623221105.3486-1-bruceshenzk@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
shenki pushed a commit that referenced this pull request Sep 16, 2021
[ Upstream commit fe4d557 ]

lockdep complains that in omap-aes, the list_lock is taken both with
softirqs enabled at probe time, and also in softirq context, which
could lead to a deadlock:

    ================================
    WARNING: inconsistent lock state
    5.14.0-rc1-00035-gc836005b01c5-dirty #69 Not tainted
    --------------------------------
    inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
    ksoftirqd/0/7 [HC0[0]:SC1[3]:HE1:SE0] takes:
    bf00e014 (list_lock){+.?.}-{2:2}, at: omap_aes_find_dev+0x18/0x54 [omap_aes_driver]
    {SOFTIRQ-ON-W} state was registered at:
      _raw_spin_lock+0x40/0x50
      omap_aes_probe+0x1d4/0x664 [omap_aes_driver]
      platform_probe+0x58/0xb8
      really_probe+0xbc/0x314
      __driver_probe_device+0x80/0xe4
      driver_probe_device+0x30/0xc8
      __driver_attach+0x70/0xf4
      bus_for_each_dev+0x70/0xb4
      bus_add_driver+0xf0/0x1d4
      driver_register+0x74/0x108
      do_one_initcall+0x84/0x2e4
      do_init_module+0x5c/0x240
      load_module+0x221c/0x2584
      sys_finit_module+0xb0/0xec
      ret_fast_syscall+0x0/0x2c
      0xbed90b30
    irq event stamp: 111800
    hardirqs last  enabled at (111800): [<c02a21e4>] __kmalloc+0x484/0x5ec
    hardirqs last disabled at (111799): [<c02a21f0>] __kmalloc+0x490/0x5ec
    softirqs last  enabled at (111776): [<c01015f0>] __do_softirq+0x2b8/0x4d0
    softirqs last disabled at (111781): [<c0135948>] run_ksoftirqd+0x34/0x50

    other info that might help us debug this:
     Possible unsafe locking scenario:

           CPU0
           ----
      lock(list_lock);
      <Interrupt>
        lock(list_lock);

     *** DEADLOCK ***

    2 locks held by ksoftirqd/0/7:
     #0: c0f5e8c8 (rcu_read_lock){....}-{1:2}, at: netif_receive_skb+0x6c/0x260
     #1: c0f5e8c8 (rcu_read_lock){....}-{1:2}, at: ip_local_deliver_finish+0x2c/0xdc

    stack backtrace:
    CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 5.14.0-rc1-00035-gc836005b01c5-dirty #69
    Hardware name: Generic AM43 (Flattened Device Tree)
    [<c010e6e0>] (unwind_backtrace) from [<c010b9d0>] (show_stack+0x10/0x14)
    [<c010b9d0>] (show_stack) from [<c017c640>] (mark_lock.part.17+0x5bc/0xd04)
    [<c017c640>] (mark_lock.part.17) from [<c017d9e4>] (__lock_acquire+0x960/0x2fa4)
    [<c017d9e4>] (__lock_acquire) from [<c0180980>] (lock_acquire+0x10c/0x358)
    [<c0180980>] (lock_acquire) from [<c093d324>] (_raw_spin_lock_bh+0x44/0x58)
    [<c093d324>] (_raw_spin_lock_bh) from [<bf00b258>] (omap_aes_find_dev+0x18/0x54 [omap_aes_driver])
    [<bf00b258>] (omap_aes_find_dev [omap_aes_driver]) from [<bf00b328>] (omap_aes_crypt+0x94/0xd4 [omap_aes_driver])
    [<bf00b328>] (omap_aes_crypt [omap_aes_driver]) from [<c08ac6d0>] (esp_input+0x1b0/0x2c8)
    [<c08ac6d0>] (esp_input) from [<c08c9e90>] (xfrm_input+0x410/0x1290)
    [<c08c9e90>] (xfrm_input) from [<c08b6374>] (xfrm4_esp_rcv+0x54/0x11c)
    [<c08b6374>] (xfrm4_esp_rcv) from [<c0838840>] (ip_protocol_deliver_rcu+0x48/0x3bc)
    [<c0838840>] (ip_protocol_deliver_rcu) from [<c0838c50>] (ip_local_deliver_finish+0x9c/0xdc)
    [<c0838c50>] (ip_local_deliver_finish) from [<c0838dd8>] (ip_local_deliver+0x148/0x1b0)
    [<c0838dd8>] (ip_local_deliver) from [<c0838f5c>] (ip_rcv+0x11c/0x180)
    [<c0838f5c>] (ip_rcv) from [<c077e3a4>] (__netif_receive_skb_one_core+0x54/0x74)
    [<c077e3a4>] (__netif_receive_skb_one_core) from [<c077e588>] (netif_receive_skb+0xa8/0x260)
    [<c077e588>] (netif_receive_skb) from [<c068d6d4>] (cpsw_rx_handler+0x224/0x2fc)
    [<c068d6d4>] (cpsw_rx_handler) from [<c0688ccc>] (__cpdma_chan_process+0xf4/0x188)
    [<c0688ccc>] (__cpdma_chan_process) from [<c068a0c0>] (cpdma_chan_process+0x3c/0x5c)
    [<c068a0c0>] (cpdma_chan_process) from [<c0690e14>] (cpsw_rx_mq_poll+0x44/0x98)
    [<c0690e14>] (cpsw_rx_mq_poll) from [<c0780810>] (__napi_poll+0x28/0x268)
    [<c0780810>] (__napi_poll) from [<c0780c64>] (net_rx_action+0xcc/0x204)
    [<c0780c64>] (net_rx_action) from [<c0101478>] (__do_softirq+0x140/0x4d0)
    [<c0101478>] (__do_softirq) from [<c0135948>] (run_ksoftirqd+0x34/0x50)
    [<c0135948>] (run_ksoftirqd) from [<c01583b8>] (smpboot_thread_fn+0xf4/0x1d8)
    [<c01583b8>] (smpboot_thread_fn) from [<c01546dc>] (kthread+0x14c/0x174)
    [<c01546dc>] (kthread) from [<c010013c>] (ret_from_fork+0x14/0x38)
    ...

The omap-des and omap-sham drivers appear to have a similar issue.

Fix this by using spin_{,un}lock_bh() around device list access in all
the probe and remove functions.

Signed-off-by: Ben Hutchings <ben.hutchings@mind.be>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
shenki pushed a commit that referenced this pull request Sep 20, 2021
[ Upstream commit 9de71ed ]

xfstest generic/587 reports a deadlock issue as below:

======================================================
WARNING: possible circular locking dependency detected
5.14.0-rc1 #69 Not tainted
------------------------------------------------------
repquota/8606 is trying to acquire lock:
ffff888022ac9320 (&sb->s_type->i_mutex_key#18){+.+.}-{3:3}, at: f2fs_quota_sync+0x207/0x300 [f2fs]

but task is already holding lock:
ffff8880084bcde8 (&sbi->quota_sem){.+.+}-{3:3}, at: f2fs_quota_sync+0x59/0x300 [f2fs]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&sbi->quota_sem){.+.+}-{3:3}:
       __lock_acquire+0x648/0x10b0
       lock_acquire+0x128/0x470
       down_read+0x3b/0x2a0
       f2fs_quota_sync+0x59/0x300 [f2fs]
       f2fs_quota_on+0x48/0x100 [f2fs]
       do_quotactl+0x5e3/0xb30
       __x64_sys_quotactl+0x23a/0x4e0
       do_syscall_64+0x3b/0x90
       entry_SYSCALL_64_after_hwframe+0x44/0xae

-> #1 (&sbi->cp_rwsem){++++}-{3:3}:
       __lock_acquire+0x648/0x10b0
       lock_acquire+0x128/0x470
       down_read+0x3b/0x2a0
       f2fs_unlink+0x353/0x670 [f2fs]
       vfs_unlink+0x1c7/0x380
       do_unlinkat+0x413/0x4b0
       __x64_sys_unlinkat+0x50/0xb0
       do_syscall_64+0x3b/0x90
       entry_SYSCALL_64_after_hwframe+0x44/0xae

-> #0 (&sb->s_type->i_mutex_key#18){+.+.}-{3:3}:
       check_prev_add+0xdc/0xb30
       validate_chain+0xa67/0xb20
       __lock_acquire+0x648/0x10b0
       lock_acquire+0x128/0x470
       down_write+0x39/0xc0
       f2fs_quota_sync+0x207/0x300 [f2fs]
       do_quotactl+0xaff/0xb30
       __x64_sys_quotactl+0x23a/0x4e0
       do_syscall_64+0x3b/0x90
       entry_SYSCALL_64_after_hwframe+0x44/0xae

other info that might help us debug this:

Chain exists of:
  &sb->s_type->i_mutex_key#18 --> &sbi->cp_rwsem --> &sbi->quota_sem

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&sbi->quota_sem);
                               lock(&sbi->cp_rwsem);
                               lock(&sbi->quota_sem);
  lock(&sb->s_type->i_mutex_key#18);

 *** DEADLOCK ***

3 locks held by repquota/8606:
 #0: ffff88801efac0e0 (&type->s_umount_key#53){++++}-{3:3}, at: user_get_super+0xd9/0x190
 #1: ffff8880084bc380 (&sbi->cp_rwsem){++++}-{3:3}, at: f2fs_quota_sync+0x3e/0x300 [f2fs]
 #2: ffff8880084bcde8 (&sbi->quota_sem){.+.+}-{3:3}, at: f2fs_quota_sync+0x59/0x300 [f2fs]

stack backtrace:
CPU: 6 PID: 8606 Comm: repquota Not tainted 5.14.0-rc1 #69
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
Call Trace:
 dump_stack_lvl+0xce/0x134
 dump_stack+0x17/0x20
 print_circular_bug.isra.0.cold+0x239/0x253
 check_noncircular+0x1be/0x1f0
 check_prev_add+0xdc/0xb30
 validate_chain+0xa67/0xb20
 __lock_acquire+0x648/0x10b0
 lock_acquire+0x128/0x470
 down_write+0x39/0xc0
 f2fs_quota_sync+0x207/0x300 [f2fs]
 do_quotactl+0xaff/0xb30
 __x64_sys_quotactl+0x23a/0x4e0
 do_syscall_64+0x3b/0x90
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f883b0b4efe

The root cause is ABBA deadlock of inode lock and cp_rwsem,
reorder locks in f2fs_quota_sync() as below to fix this issue:
- lock inode
- lock cp_rwsem
- lock quota_sem

Fixes: db6ec53 ("f2fs: add a rw_sem to cover quota flag changes")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
shenki pushed a commit that referenced this pull request Sep 22, 2021
[ Upstream commit fe4d557 ]

lockdep complains that in omap-aes, the list_lock is taken both with
softirqs enabled at probe time, and also in softirq context, which
could lead to a deadlock:

    ================================
    WARNING: inconsistent lock state
    5.14.0-rc1-00035-gc836005b01c5-dirty #69 Not tainted
    --------------------------------
    inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
    ksoftirqd/0/7 [HC0[0]:SC1[3]:HE1:SE0] takes:
    bf00e014 (list_lock){+.?.}-{2:2}, at: omap_aes_find_dev+0x18/0x54 [omap_aes_driver]
    {SOFTIRQ-ON-W} state was registered at:
      _raw_spin_lock+0x40/0x50
      omap_aes_probe+0x1d4/0x664 [omap_aes_driver]
      platform_probe+0x58/0xb8
      really_probe+0xbc/0x314
      __driver_probe_device+0x80/0xe4
      driver_probe_device+0x30/0xc8
      __driver_attach+0x70/0xf4
      bus_for_each_dev+0x70/0xb4
      bus_add_driver+0xf0/0x1d4
      driver_register+0x74/0x108
      do_one_initcall+0x84/0x2e4
      do_init_module+0x5c/0x240
      load_module+0x221c/0x2584
      sys_finit_module+0xb0/0xec
      ret_fast_syscall+0x0/0x2c
      0xbed90b30
    irq event stamp: 111800
    hardirqs last  enabled at (111800): [<c02a21e4>] __kmalloc+0x484/0x5ec
    hardirqs last disabled at (111799): [<c02a21f0>] __kmalloc+0x490/0x5ec
    softirqs last  enabled at (111776): [<c01015f0>] __do_softirq+0x2b8/0x4d0
    softirqs last disabled at (111781): [<c0135948>] run_ksoftirqd+0x34/0x50

    other info that might help us debug this:
     Possible unsafe locking scenario:

           CPU0
           ----
      lock(list_lock);
      <Interrupt>
        lock(list_lock);

     *** DEADLOCK ***

    2 locks held by ksoftirqd/0/7:
     #0: c0f5e8c8 (rcu_read_lock){....}-{1:2}, at: netif_receive_skb+0x6c/0x260
     #1: c0f5e8c8 (rcu_read_lock){....}-{1:2}, at: ip_local_deliver_finish+0x2c/0xdc

    stack backtrace:
    CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 5.14.0-rc1-00035-gc836005b01c5-dirty #69
    Hardware name: Generic AM43 (Flattened Device Tree)
    [<c010e6e0>] (unwind_backtrace) from [<c010b9d0>] (show_stack+0x10/0x14)
    [<c010b9d0>] (show_stack) from [<c017c640>] (mark_lock.part.17+0x5bc/0xd04)
    [<c017c640>] (mark_lock.part.17) from [<c017d9e4>] (__lock_acquire+0x960/0x2fa4)
    [<c017d9e4>] (__lock_acquire) from [<c0180980>] (lock_acquire+0x10c/0x358)
    [<c0180980>] (lock_acquire) from [<c093d324>] (_raw_spin_lock_bh+0x44/0x58)
    [<c093d324>] (_raw_spin_lock_bh) from [<bf00b258>] (omap_aes_find_dev+0x18/0x54 [omap_aes_driver])
    [<bf00b258>] (omap_aes_find_dev [omap_aes_driver]) from [<bf00b328>] (omap_aes_crypt+0x94/0xd4 [omap_aes_driver])
    [<bf00b328>] (omap_aes_crypt [omap_aes_driver]) from [<c08ac6d0>] (esp_input+0x1b0/0x2c8)
    [<c08ac6d0>] (esp_input) from [<c08c9e90>] (xfrm_input+0x410/0x1290)
    [<c08c9e90>] (xfrm_input) from [<c08b6374>] (xfrm4_esp_rcv+0x54/0x11c)
    [<c08b6374>] (xfrm4_esp_rcv) from [<c0838840>] (ip_protocol_deliver_rcu+0x48/0x3bc)
    [<c0838840>] (ip_protocol_deliver_rcu) from [<c0838c50>] (ip_local_deliver_finish+0x9c/0xdc)
    [<c0838c50>] (ip_local_deliver_finish) from [<c0838dd8>] (ip_local_deliver+0x148/0x1b0)
    [<c0838dd8>] (ip_local_deliver) from [<c0838f5c>] (ip_rcv+0x11c/0x180)
    [<c0838f5c>] (ip_rcv) from [<c077e3a4>] (__netif_receive_skb_one_core+0x54/0x74)
    [<c077e3a4>] (__netif_receive_skb_one_core) from [<c077e588>] (netif_receive_skb+0xa8/0x260)
    [<c077e588>] (netif_receive_skb) from [<c068d6d4>] (cpsw_rx_handler+0x224/0x2fc)
    [<c068d6d4>] (cpsw_rx_handler) from [<c0688ccc>] (__cpdma_chan_process+0xf4/0x188)
    [<c0688ccc>] (__cpdma_chan_process) from [<c068a0c0>] (cpdma_chan_process+0x3c/0x5c)
    [<c068a0c0>] (cpdma_chan_process) from [<c0690e14>] (cpsw_rx_mq_poll+0x44/0x98)
    [<c0690e14>] (cpsw_rx_mq_poll) from [<c0780810>] (__napi_poll+0x28/0x268)
    [<c0780810>] (__napi_poll) from [<c0780c64>] (net_rx_action+0xcc/0x204)
    [<c0780c64>] (net_rx_action) from [<c0101478>] (__do_softirq+0x140/0x4d0)
    [<c0101478>] (__do_softirq) from [<c0135948>] (run_ksoftirqd+0x34/0x50)
    [<c0135948>] (run_ksoftirqd) from [<c01583b8>] (smpboot_thread_fn+0xf4/0x1d8)
    [<c01583b8>] (smpboot_thread_fn) from [<c01546dc>] (kthread+0x14c/0x174)
    [<c01546dc>] (kthread) from [<c010013c>] (ret_from_fork+0x14/0x38)
    ...

The omap-des and omap-sham drivers appear to have a similar issue.

Fix this by using spin_{,un}lock_bh() around device list access in all
the probe and remove functions.

Signed-off-by: Ben Hutchings <ben.hutchings@mind.be>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
shenki pushed a commit that referenced this pull request Sep 22, 2021
[ Upstream commit 9de71ed ]

xfstest generic/587 reports a deadlock issue as below:

======================================================
WARNING: possible circular locking dependency detected
5.14.0-rc1 #69 Not tainted
------------------------------------------------------
repquota/8606 is trying to acquire lock:
ffff888022ac9320 (&sb->s_type->i_mutex_key#18){+.+.}-{3:3}, at: f2fs_quota_sync+0x207/0x300 [f2fs]

but task is already holding lock:
ffff8880084bcde8 (&sbi->quota_sem){.+.+}-{3:3}, at: f2fs_quota_sync+0x59/0x300 [f2fs]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&sbi->quota_sem){.+.+}-{3:3}:
       __lock_acquire+0x648/0x10b0
       lock_acquire+0x128/0x470
       down_read+0x3b/0x2a0
       f2fs_quota_sync+0x59/0x300 [f2fs]
       f2fs_quota_on+0x48/0x100 [f2fs]
       do_quotactl+0x5e3/0xb30
       __x64_sys_quotactl+0x23a/0x4e0
       do_syscall_64+0x3b/0x90
       entry_SYSCALL_64_after_hwframe+0x44/0xae

-> #1 (&sbi->cp_rwsem){++++}-{3:3}:
       __lock_acquire+0x648/0x10b0
       lock_acquire+0x128/0x470
       down_read+0x3b/0x2a0
       f2fs_unlink+0x353/0x670 [f2fs]
       vfs_unlink+0x1c7/0x380
       do_unlinkat+0x413/0x4b0
       __x64_sys_unlinkat+0x50/0xb0
       do_syscall_64+0x3b/0x90
       entry_SYSCALL_64_after_hwframe+0x44/0xae

-> #0 (&sb->s_type->i_mutex_key#18){+.+.}-{3:3}:
       check_prev_add+0xdc/0xb30
       validate_chain+0xa67/0xb20
       __lock_acquire+0x648/0x10b0
       lock_acquire+0x128/0x470
       down_write+0x39/0xc0
       f2fs_quota_sync+0x207/0x300 [f2fs]
       do_quotactl+0xaff/0xb30
       __x64_sys_quotactl+0x23a/0x4e0
       do_syscall_64+0x3b/0x90
       entry_SYSCALL_64_after_hwframe+0x44/0xae

other info that might help us debug this:

Chain exists of:
  &sb->s_type->i_mutex_key#18 --> &sbi->cp_rwsem --> &sbi->quota_sem

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&sbi->quota_sem);
                               lock(&sbi->cp_rwsem);
                               lock(&sbi->quota_sem);
  lock(&sb->s_type->i_mutex_key#18);

 *** DEADLOCK ***

3 locks held by repquota/8606:
 #0: ffff88801efac0e0 (&type->s_umount_key#53){++++}-{3:3}, at: user_get_super+0xd9/0x190
 #1: ffff8880084bc380 (&sbi->cp_rwsem){++++}-{3:3}, at: f2fs_quota_sync+0x3e/0x300 [f2fs]
 #2: ffff8880084bcde8 (&sbi->quota_sem){.+.+}-{3:3}, at: f2fs_quota_sync+0x59/0x300 [f2fs]

stack backtrace:
CPU: 6 PID: 8606 Comm: repquota Not tainted 5.14.0-rc1 #69
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
Call Trace:
 dump_stack_lvl+0xce/0x134
 dump_stack+0x17/0x20
 print_circular_bug.isra.0.cold+0x239/0x253
 check_noncircular+0x1be/0x1f0
 check_prev_add+0xdc/0xb30
 validate_chain+0xa67/0xb20
 __lock_acquire+0x648/0x10b0
 lock_acquire+0x128/0x470
 down_write+0x39/0xc0
 f2fs_quota_sync+0x207/0x300 [f2fs]
 do_quotactl+0xaff/0xb30
 __x64_sys_quotactl+0x23a/0x4e0
 do_syscall_64+0x3b/0x90
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f883b0b4efe

The root cause is ABBA deadlock of inode lock and cp_rwsem,
reorder locks in f2fs_quota_sync() as below to fix this issue:
- lock inode
- lock cp_rwsem
- lock quota_sem

Fixes: db6ec53 ("f2fs: add a rw_sem to cover quota flag changes")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
shenki pushed a commit that referenced this pull request Oct 24, 2022
…te()

[ Upstream commit f020d95 ]

When peer delete failed in a disconnect operation, use-after-free
detected by KFENCE in below log. It is because for each vdev_id and
address, it has only one struct ath10k_peer, it is allocated in
ath10k_peer_map_event(). When connected to an AP, it has more than
one HTT_T2H_MSG_TYPE_PEER_MAP reported from firmware, then the
array peer_map of struct ath10k will be set muti-elements to the
same ath10k_peer in ath10k_peer_map_event(). When peer delete failed
in ath10k_sta_state(), the ath10k_peer will be free for the 1st peer
id in array peer_map of struct ath10k, and then use-after-free happened
for the 2nd peer id because they map to the same ath10k_peer.

And clean up all peers in array peer_map for the ath10k_peer, then
user-after-free disappeared

peer map event log:
[  306.911021] wlan0: authenticate with b0:2a:43:e6:75:0e
[  306.957187] ath10k_pci 0000:01:00.0: mac vdev 0 peer create b0:2a:43:e6:75:0e (new sta) sta 1 / 32 peer 1 / 33
[  306.957395] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 246
[  306.957404] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 198
[  306.986924] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 166

peer unmap event log:
[  435.715691] wlan0: deauthenticating from b0:2a:43:e6:75:0e by local choice (Reason: 3=DEAUTH_LEAVING)
[  435.716802] ath10k_pci 0000:01:00.0: mac vdev 0 peer delete b0:2a:43:e6:75:0e sta ffff990e0e9c2b50 (sta gone)
[  435.717177] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 246
[  435.717186] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 198
[  435.717193] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 166

use-after-free log:
[21705.888627] wlan0: deauthenticating from d0:76:8f:82:be:75 by local choice (Reason: 3=DEAUTH_LEAVING)
[21713.799910] ath10k_pci 0000:01:00.0: failed to delete peer d0:76:8f:82:be:75 for vdev 0: -110
[21713.799925] ath10k_pci 0000:01:00.0: found sta peer d0:76:8f:82:be:75 (ptr 0000000000000000 id 102) entry on vdev 0 after it was supposedly removed
[21713.799968] ==================================================================
[21713.799991] BUG: KFENCE: use-after-free read in ath10k_sta_state+0x265/0xb8a [ath10k_core]
[21713.799991]
[21713.799997] Use-after-free read at 0x00000000abe1c75e (in kfence-#69):
[21713.800010]  ath10k_sta_state+0x265/0xb8a [ath10k_core]
[21713.800041]  drv_sta_state+0x115/0x677 [mac80211]
[21713.800059]  __sta_info_destroy_part2+0xb1/0x133 [mac80211]
[21713.800076]  __sta_info_flush+0x11d/0x162 [mac80211]
[21713.800093]  ieee80211_set_disassoc+0x12d/0x2f4 [mac80211]
[21713.800110]  ieee80211_mgd_deauth+0x26c/0x29b [mac80211]
[21713.800137]  cfg80211_mlme_deauth+0x13f/0x1bb [cfg80211]
[21713.800153]  nl80211_deauthenticate+0xf8/0x121 [cfg80211]
[21713.800161]  genl_rcv_msg+0x38e/0x3be
[21713.800166]  netlink_rcv_skb+0x89/0xf7
[21713.800171]  genl_rcv+0x28/0x36
[21713.800176]  netlink_unicast+0x179/0x24b
[21713.800181]  netlink_sendmsg+0x3a0/0x40e
[21713.800187]  sock_sendmsg+0x72/0x76
[21713.800192]  ____sys_sendmsg+0x16d/0x1e3
[21713.800196]  ___sys_sendmsg+0x95/0xd1
[21713.800200]  __sys_sendmsg+0x85/0xbf
[21713.800205]  do_syscall_64+0x43/0x55
[21713.800210]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[21713.800213]
[21713.800219] kfence-#69: 0x000000009149b0d5-0x000000004c0697fb, size=1064, cache=kmalloc-2k
[21713.800219]
[21713.800224] allocated by task 13 on cpu 0 at 21705.501373s:
[21713.800241]  ath10k_peer_map_event+0x7e/0x154 [ath10k_core]
[21713.800254]  ath10k_htt_t2h_msg_handler+0x586/0x1039 [ath10k_core]
[21713.800265]  ath10k_htt_htc_t2h_msg_handler+0x12/0x28 [ath10k_core]
[21713.800277]  ath10k_htc_rx_completion_handler+0x14c/0x1b5 [ath10k_core]
[21713.800283]  ath10k_pci_process_rx_cb+0x195/0x1df [ath10k_pci]
[21713.800294]  ath10k_ce_per_engine_service+0x55/0x74 [ath10k_core]
[21713.800305]  ath10k_ce_per_engine_service_any+0x76/0x84 [ath10k_core]
[21713.800310]  ath10k_pci_napi_poll+0x49/0x144 [ath10k_pci]
[21713.800316]  net_rx_action+0xdc/0x361
[21713.800320]  __do_softirq+0x163/0x29a
[21713.800325]  asm_call_irq_on_stack+0x12/0x20
[21713.800331]  do_softirq_own_stack+0x3c/0x48
[21713.800337]  __irq_exit_rcu+0x9b/0x9d
[21713.800342]  common_interrupt+0xc9/0x14d
[21713.800346]  asm_common_interrupt+0x1e/0x40
[21713.800351]  ksoftirqd_should_run+0x5/0x16
[21713.800357]  smpboot_thread_fn+0x148/0x211
[21713.800362]  kthread+0x150/0x15f
[21713.800367]  ret_from_fork+0x22/0x30
[21713.800370]
[21713.800374] freed by task 708 on cpu 1 at 21713.799953s:
[21713.800498]  ath10k_sta_state+0x2c6/0xb8a [ath10k_core]
[21713.800515]  drv_sta_state+0x115/0x677 [mac80211]
[21713.800532]  __sta_info_destroy_part2+0xb1/0x133 [mac80211]
[21713.800548]  __sta_info_flush+0x11d/0x162 [mac80211]
[21713.800565]  ieee80211_set_disassoc+0x12d/0x2f4 [mac80211]
[21713.800581]  ieee80211_mgd_deauth+0x26c/0x29b [mac80211]
[21713.800598]  cfg80211_mlme_deauth+0x13f/0x1bb [cfg80211]
[21713.800614]  nl80211_deauthenticate+0xf8/0x121 [cfg80211]
[21713.800619]  genl_rcv_msg+0x38e/0x3be
[21713.800623]  netlink_rcv_skb+0x89/0xf7
[21713.800628]  genl_rcv+0x28/0x36
[21713.800632]  netlink_unicast+0x179/0x24b
[21713.800637]  netlink_sendmsg+0x3a0/0x40e
[21713.800642]  sock_sendmsg+0x72/0x76
[21713.800646]  ____sys_sendmsg+0x16d/0x1e3
[21713.800651]  ___sys_sendmsg+0x95/0xd1
[21713.800655]  __sys_sendmsg+0x85/0xbf
[21713.800659]  do_syscall_64+0x43/0x55
[21713.800663]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Tested-on: QCA6174 hw3.2 PCI WLAN.RM.4.4.1-00288-QCARMSWPZ-1

Fixes: d0eeafa ("ath10k: Clean up peer when sta goes away.")
Signed-off-by: Wen Gong <quic_wgong@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220801141930.16794-1-quic_wgong@quicinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
shenki pushed a commit that referenced this pull request Nov 2, 2022
…te()

[ Upstream commit f020d95 ]

When peer delete failed in a disconnect operation, use-after-free
detected by KFENCE in below log. It is because for each vdev_id and
address, it has only one struct ath10k_peer, it is allocated in
ath10k_peer_map_event(). When connected to an AP, it has more than
one HTT_T2H_MSG_TYPE_PEER_MAP reported from firmware, then the
array peer_map of struct ath10k will be set muti-elements to the
same ath10k_peer in ath10k_peer_map_event(). When peer delete failed
in ath10k_sta_state(), the ath10k_peer will be free for the 1st peer
id in array peer_map of struct ath10k, and then use-after-free happened
for the 2nd peer id because they map to the same ath10k_peer.

And clean up all peers in array peer_map for the ath10k_peer, then
user-after-free disappeared

peer map event log:
[  306.911021] wlan0: authenticate with b0:2a:43:e6:75:0e
[  306.957187] ath10k_pci 0000:01:00.0: mac vdev 0 peer create b0:2a:43:e6:75:0e (new sta) sta 1 / 32 peer 1 / 33
[  306.957395] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 246
[  306.957404] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 198
[  306.986924] ath10k_pci 0000:01:00.0: htt peer map vdev 0 peer b0:2a:43:e6:75:0e id 166

peer unmap event log:
[  435.715691] wlan0: deauthenticating from b0:2a:43:e6:75:0e by local choice (Reason: 3=DEAUTH_LEAVING)
[  435.716802] ath10k_pci 0000:01:00.0: mac vdev 0 peer delete b0:2a:43:e6:75:0e sta ffff990e0e9c2b50 (sta gone)
[  435.717177] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 246
[  435.717186] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 198
[  435.717193] ath10k_pci 0000:01:00.0: htt peer unmap vdev 0 peer b0:2a:43:e6:75:0e id 166

use-after-free log:
[21705.888627] wlan0: deauthenticating from d0:76:8f:82:be:75 by local choice (Reason: 3=DEAUTH_LEAVING)
[21713.799910] ath10k_pci 0000:01:00.0: failed to delete peer d0:76:8f:82:be:75 for vdev 0: -110
[21713.799925] ath10k_pci 0000:01:00.0: found sta peer d0:76:8f:82:be:75 (ptr 0000000000000000 id 102) entry on vdev 0 after it was supposedly removed
[21713.799968] ==================================================================
[21713.799991] BUG: KFENCE: use-after-free read in ath10k_sta_state+0x265/0xb8a [ath10k_core]
[21713.799991]
[21713.799997] Use-after-free read at 0x00000000abe1c75e (in kfence-#69):
[21713.800010]  ath10k_sta_state+0x265/0xb8a [ath10k_core]
[21713.800041]  drv_sta_state+0x115/0x677 [mac80211]
[21713.800059]  __sta_info_destroy_part2+0xb1/0x133 [mac80211]
[21713.800076]  __sta_info_flush+0x11d/0x162 [mac80211]
[21713.800093]  ieee80211_set_disassoc+0x12d/0x2f4 [mac80211]
[21713.800110]  ieee80211_mgd_deauth+0x26c/0x29b [mac80211]
[21713.800137]  cfg80211_mlme_deauth+0x13f/0x1bb [cfg80211]
[21713.800153]  nl80211_deauthenticate+0xf8/0x121 [cfg80211]
[21713.800161]  genl_rcv_msg+0x38e/0x3be
[21713.800166]  netlink_rcv_skb+0x89/0xf7
[21713.800171]  genl_rcv+0x28/0x36
[21713.800176]  netlink_unicast+0x179/0x24b
[21713.800181]  netlink_sendmsg+0x3a0/0x40e
[21713.800187]  sock_sendmsg+0x72/0x76
[21713.800192]  ____sys_sendmsg+0x16d/0x1e3
[21713.800196]  ___sys_sendmsg+0x95/0xd1
[21713.800200]  __sys_sendmsg+0x85/0xbf
[21713.800205]  do_syscall_64+0x43/0x55
[21713.800210]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[21713.800213]
[21713.800219] kfence-#69: 0x000000009149b0d5-0x000000004c0697fb, size=1064, cache=kmalloc-2k
[21713.800219]
[21713.800224] allocated by task 13 on cpu 0 at 21705.501373s:
[21713.800241]  ath10k_peer_map_event+0x7e/0x154 [ath10k_core]
[21713.800254]  ath10k_htt_t2h_msg_handler+0x586/0x1039 [ath10k_core]
[21713.800265]  ath10k_htt_htc_t2h_msg_handler+0x12/0x28 [ath10k_core]
[21713.800277]  ath10k_htc_rx_completion_handler+0x14c/0x1b5 [ath10k_core]
[21713.800283]  ath10k_pci_process_rx_cb+0x195/0x1df [ath10k_pci]
[21713.800294]  ath10k_ce_per_engine_service+0x55/0x74 [ath10k_core]
[21713.800305]  ath10k_ce_per_engine_service_any+0x76/0x84 [ath10k_core]
[21713.800310]  ath10k_pci_napi_poll+0x49/0x144 [ath10k_pci]
[21713.800316]  net_rx_action+0xdc/0x361
[21713.800320]  __do_softirq+0x163/0x29a
[21713.800325]  asm_call_irq_on_stack+0x12/0x20
[21713.800331]  do_softirq_own_stack+0x3c/0x48
[21713.800337]  __irq_exit_rcu+0x9b/0x9d
[21713.800342]  common_interrupt+0xc9/0x14d
[21713.800346]  asm_common_interrupt+0x1e/0x40
[21713.800351]  ksoftirqd_should_run+0x5/0x16
[21713.800357]  smpboot_thread_fn+0x148/0x211
[21713.800362]  kthread+0x150/0x15f
[21713.800367]  ret_from_fork+0x22/0x30
[21713.800370]
[21713.800374] freed by task 708 on cpu 1 at 21713.799953s:
[21713.800498]  ath10k_sta_state+0x2c6/0xb8a [ath10k_core]
[21713.800515]  drv_sta_state+0x115/0x677 [mac80211]
[21713.800532]  __sta_info_destroy_part2+0xb1/0x133 [mac80211]
[21713.800548]  __sta_info_flush+0x11d/0x162 [mac80211]
[21713.800565]  ieee80211_set_disassoc+0x12d/0x2f4 [mac80211]
[21713.800581]  ieee80211_mgd_deauth+0x26c/0x29b [mac80211]
[21713.800598]  cfg80211_mlme_deauth+0x13f/0x1bb [cfg80211]
[21713.800614]  nl80211_deauthenticate+0xf8/0x121 [cfg80211]
[21713.800619]  genl_rcv_msg+0x38e/0x3be
[21713.800623]  netlink_rcv_skb+0x89/0xf7
[21713.800628]  genl_rcv+0x28/0x36
[21713.800632]  netlink_unicast+0x179/0x24b
[21713.800637]  netlink_sendmsg+0x3a0/0x40e
[21713.800642]  sock_sendmsg+0x72/0x76
[21713.800646]  ____sys_sendmsg+0x16d/0x1e3
[21713.800651]  ___sys_sendmsg+0x95/0xd1
[21713.800655]  __sys_sendmsg+0x85/0xbf
[21713.800659]  do_syscall_64+0x43/0x55
[21713.800663]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Tested-on: QCA6174 hw3.2 PCI WLAN.RM.4.4.1-00288-QCARMSWPZ-1

Fixes: d0eeafa ("ath10k: Clean up peer when sta goes away.")
Signed-off-by: Wen Gong <quic_wgong@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220801141930.16794-1-quic_wgong@quicinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
amboar pushed a commit that referenced this pull request Aug 15, 2024
[ Upstream commit 7015843 ]

Alexei reported that send_signal test may fail with nested CONFIG_PARAVIRT
configs. In this particular case, the base VM is AMD with 166 cpus, and I
run selftests with regular qemu on top of that and indeed send_signal test
failed. I also tried with an Intel box with 80 cpus and there is no issue.

The main qemu command line includes:

  -enable-kvm -smp 16 -cpu host

The failure log looks like:

  $ ./test_progs -t send_signal
  [   48.501588] watchdog: BUG: soft lockup - CPU#9 stuck for 26s! [test_progs:2225]
  [   48.503622] Modules linked in: bpf_testmod(O)
  [   48.503622] CPU: 9 PID: 2225 Comm: test_progs Tainted: G           O       6.9.0-08561-g2c1713a8f1c9-dirty #69
  [   48.507629] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
  [   48.511635] RIP: 0010:handle_softirqs+0x71/0x290
  [   48.511635] Code: [...] 10 0a 00 00 00 31 c0 65 66 89 05 d5 f4 fa 7e fb bb ff ff ff ff <49> c7 c2 cb
  [   48.518527] RSP: 0018:ffffc90000310fa0 EFLAGS: 00000246
  [   48.519579] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 00000000000006e0
  [   48.522526] RDX: 0000000000000006 RSI: ffff88810791ae80 RDI: 0000000000000000
  [   48.523587] RBP: ffffc90000fabc88 R08: 00000005a0af4f7f R09: 0000000000000000
  [   48.525525] R10: 0000000561d2f29c R11: 0000000000006534 R12: 0000000000000280
  [   48.528525] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
  [   48.528525] FS:  00007f2f2885cd00(0000) GS:ffff888237c40000(0000) knlGS:0000000000000000
  [   48.531600] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [   48.535520] CR2: 00007f2f287059f0 CR3: 0000000106a28002 CR4: 00000000003706f0
  [   48.537538] Call Trace:
  [   48.537538]  <IRQ>
  [   48.537538]  ? watchdog_timer_fn+0x1cd/0x250
  [   48.539590]  ? lockup_detector_update_enable+0x50/0x50
  [   48.539590]  ? __hrtimer_run_queues+0xff/0x280
  [   48.542520]  ? hrtimer_interrupt+0x103/0x230
  [   48.544524]  ? __sysvec_apic_timer_interrupt+0x4f/0x140
  [   48.545522]  ? sysvec_apic_timer_interrupt+0x3a/0x90
  [   48.547612]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
  [   48.547612]  ? handle_softirqs+0x71/0x290
  [   48.547612]  irq_exit_rcu+0x63/0x80
  [   48.551585]  sysvec_apic_timer_interrupt+0x75/0x90
  [   48.552521]  </IRQ>
  [   48.553529]  <TASK>
  [   48.553529]  asm_sysvec_apic_timer_interrupt+0x1a/0x20
  [   48.555609] RIP: 0010:finish_task_switch.isra.0+0x90/0x260
  [   48.556526] Code: [...] 9f 58 0a 00 00 48 85 db 0f 85 89 01 00 00 4c 89 ff e8 53 d9 bd 00 fb 66 90 <4d> 85 ed 74
  [   48.562524] RSP: 0018:ffffc90000fabd38 EFLAGS: 00000282
  [   48.563589] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff83385620
  [   48.563589] RDX: ffff888237c73ae4 RSI: 0000000000000000 RDI: ffff888237c6fd00
  [   48.568521] RBP: ffffc90000fabd68 R08: 0000000000000000 R09: 0000000000000000
  [   48.569528] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8881009d0000
  [   48.573525] R13: ffff8881024e5400 R14: ffff88810791ae80 R15: ffff888237c6fd00
  [   48.575614]  ? finish_task_switch.isra.0+0x8d/0x260
  [   48.576523]  __schedule+0x364/0xac0
  [   48.577535]  schedule+0x2e/0x110
  [   48.578555]  pipe_read+0x301/0x400
  [   48.579589]  ? destroy_sched_domains_rcu+0x30/0x30
  [   48.579589]  vfs_read+0x2b3/0x2f0
  [   48.579589]  ksys_read+0x8b/0xc0
  [   48.583590]  do_syscall_64+0x3d/0xc0
  [   48.583590]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
  [   48.586525] RIP: 0033:0x7f2f28703fa1
  [   48.587592] Code: [...] 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 80 3d c5 23 14 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0
  [   48.593534] RSP: 002b:00007ffd90f8cf88 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
  [   48.595589] RAX: ffffffffffffffda RBX: 00007ffd90f8d5e8 RCX: 00007f2f28703fa1
  [   48.595589] RDX: 0000000000000001 RSI: 00007ffd90f8cfb0 RDI: 0000000000000006
  [   48.599592] RBP: 00007ffd90f8d2f0 R08: 0000000000000064 R09: 0000000000000000
  [   48.602527] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
  [   48.603589] R13: 00007ffd90f8d608 R14: 00007f2f288d8000 R15: 0000000000f6bdb0
  [   48.605527]  </TASK>

In the test, two processes are communicating through pipe. Further debugging
with strace found that the above splat is triggered as read() syscall could
not receive the data even if the corresponding write() syscall in another
process successfully wrote data into the pipe.

The failed subtest is "send_signal_perf". The corresponding perf event has
sample_period 1 and config PERF_COUNT_SW_CPU_CLOCK. sample_period 1 means every
overflow event will trigger a call to the BPF program. So I suspect this may
overwhelm the system. So I increased the sample_period to 100,000 and the test
passed. The sample_period 10,000 still has the test failed.

In other parts of selftest, e.g., [1], sample_freq is used instead. So I
decided to use sample_freq = 1,000 since the test can pass as well.

  [1] https://lore.kernel.org/bpf/20240604070700.3032142-1-song@kernel.org/

Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240605201203.2603846-1-yonghong.song@linux.dev
Signed-off-by: Sasha Levin <sashal@kernel.org>
amboar pushed a commit that referenced this pull request Nov 10, 2024
commit 409dc51 upstream.

The flexspi on imx8ulp only has 16 LUTs, and imx8mm flexspi has
32 LUTs, so correct the compatible string here, otherwise will
meet below error:

[    1.119072] ------------[ cut here ]------------
[    1.123926] WARNING: CPU: 0 PID: 1 at drivers/spi/spi-nxp-fspi.c:855 nxp_fspi_exec_op+0xb04/0xb64
[    1.133239] Modules linked in:
[    1.136448] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-rc6-next-20240902-00001-g131bf9439dd9 #69
[    1.146821] Hardware name: NXP i.MX8ULP EVK (DT)
[    1.151647] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    1.158931] pc : nxp_fspi_exec_op+0xb04/0xb64
[    1.163496] lr : nxp_fspi_exec_op+0xa34/0xb64
[    1.168060] sp : ffff80008002b2a0
[    1.171526] x29: ffff80008002b2d0 x28: 0000000000000000 x27: 0000000000000000
[    1.179002] x26: ffff2eb645542580 x25: ffff800080610014 x24: ffff800080610000
[    1.186480] x23: ffff2eb645548080 x22: 0000000000000006 x21: ffff2eb6455425e0
[    1.193956] x20: 0000000000000000 x19: ffff80008002b5e0 x18: ffffffffffffffff
[    1.201432] x17: ffff2eb644467508 x16: 0000000000000138 x15: 0000000000000002
[    1.208907] x14: 0000000000000000 x13: ffff2eb6400d8080 x12: 00000000ffffff00
[    1.216378] x11: 0000000000000000 x10: ffff2eb6400d8080 x9 : ffff2eb697adca80
[    1.223850] x8 : ffff2eb697ad3cc0 x7 : 0000000100000000 x6 : 0000000000000001
[    1.231324] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000000007a6
[    1.238795] x2 : 0000000000000000 x1 : 00000000000001ce x0 : 00000000ffffff92
[    1.246267] Call trace:
[    1.248824]  nxp_fspi_exec_op+0xb04/0xb64
[    1.253031]  spi_mem_exec_op+0x3a0/0x430
[    1.257139]  spi_nor_read_id+0x80/0xcc
[    1.261065]  spi_nor_scan+0x1ec/0xf10
[    1.264901]  spi_nor_probe+0x108/0x2fc
[    1.268828]  spi_mem_probe+0x6c/0xbc
[    1.272574]  spi_probe+0x84/0xe4
[    1.275958]  really_probe+0xbc/0x29c
[    1.279713]  __driver_probe_device+0x78/0x12c
[    1.284277]  driver_probe_device+0xd8/0x15c
[    1.288660]  __device_attach_driver+0xb8/0x134
[    1.293316]  bus_for_each_drv+0x88/0xe8
[    1.297337]  __device_attach+0xa0/0x190
[    1.301353]  device_initial_probe+0x14/0x20
[    1.305734]  bus_probe_device+0xac/0xb0
[    1.309752]  device_add+0x5d0/0x790
[    1.313408]  __spi_add_device+0x134/0x204
[    1.317606]  of_register_spi_device+0x3b4/0x590
[    1.322348]  spi_register_controller+0x47c/0x754
[    1.327181]  devm_spi_register_controller+0x4c/0xa4
[    1.332289]  nxp_fspi_probe+0x1cc/0x2b0
[    1.336307]  platform_probe+0x68/0xc4
[    1.340145]  really_probe+0xbc/0x29c
[    1.343893]  __driver_probe_device+0x78/0x12c
[    1.348457]  driver_probe_device+0xd8/0x15c
[    1.352838]  __driver_attach+0x90/0x19c
[    1.356857]  bus_for_each_dev+0x7c/0xdc
[    1.360877]  driver_attach+0x24/0x30
[    1.364624]  bus_add_driver+0xe4/0x208
[    1.368552]  driver_register+0x5c/0x124
[    1.372573]  __platform_driver_register+0x28/0x34
[    1.377497]  nxp_fspi_driver_init+0x1c/0x28
[    1.381888]  do_one_initcall+0x80/0x1c8
[    1.385908]  kernel_init_freeable+0x1c4/0x28c
[    1.390472]  kernel_init+0x20/0x1d8
[    1.394138]  ret_from_fork+0x10/0x20
[    1.397885] ---[ end trace 0000000000000000 ]---
[    1.407908] ------------[ cut here ]------------

Fixes: ef89fd5 ("arm64: dts: imx8ulp: add flexspi node")
Cc: stable@kernel.org
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
thangqn-ampere pushed a commit to ampere-openbmc/linux that referenced this pull request Dec 26, 2024
commit 409dc51 upstream.

The flexspi on imx8ulp only has 16 LUTs, and imx8mm flexspi has
32 LUTs, so correct the compatible string here, otherwise will
meet below error:

[    1.119072] ------------[ cut here ]------------
[    1.123926] WARNING: CPU: 0 PID: 1 at drivers/spi/spi-nxp-fspi.c:855 nxp_fspi_exec_op+0xb04/0xb64
[    1.133239] Modules linked in:
[    1.136448] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-rc6-next-20240902-00001-g131bf9439dd9 openbmc#69
[    1.146821] Hardware name: NXP i.MX8ULP EVK (DT)
[    1.151647] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    1.158931] pc : nxp_fspi_exec_op+0xb04/0xb64
[    1.163496] lr : nxp_fspi_exec_op+0xa34/0xb64
[    1.168060] sp : ffff80008002b2a0
[    1.171526] x29: ffff80008002b2d0 x28: 0000000000000000 x27: 0000000000000000
[    1.179002] x26: ffff2eb645542580 x25: ffff800080610014 x24: ffff800080610000
[    1.186480] x23: ffff2eb645548080 x22: 0000000000000006 x21: ffff2eb6455425e0
[    1.193956] x20: 0000000000000000 x19: ffff80008002b5e0 x18: ffffffffffffffff
[    1.201432] x17: ffff2eb644467508 x16: 0000000000000138 x15: 0000000000000002
[    1.208907] x14: 0000000000000000 x13: ffff2eb6400d8080 x12: 00000000ffffff00
[    1.216378] x11: 0000000000000000 x10: ffff2eb6400d8080 x9 : ffff2eb697adca80
[    1.223850] x8 : ffff2eb697ad3cc0 x7 : 0000000100000000 x6 : 0000000000000001
[    1.231324] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000000007a6
[    1.238795] x2 : 0000000000000000 x1 : 00000000000001ce x0 : 00000000ffffff92
[    1.246267] Call trace:
[    1.248824]  nxp_fspi_exec_op+0xb04/0xb64
[    1.253031]  spi_mem_exec_op+0x3a0/0x430
[    1.257139]  spi_nor_read_id+0x80/0xcc
[    1.261065]  spi_nor_scan+0x1ec/0xf10
[    1.264901]  spi_nor_probe+0x108/0x2fc
[    1.268828]  spi_mem_probe+0x6c/0xbc
[    1.272574]  spi_probe+0x84/0xe4
[    1.275958]  really_probe+0xbc/0x29c
[    1.279713]  __driver_probe_device+0x78/0x12c
[    1.284277]  driver_probe_device+0xd8/0x15c
[    1.288660]  __device_attach_driver+0xb8/0x134
[    1.293316]  bus_for_each_drv+0x88/0xe8
[    1.297337]  __device_attach+0xa0/0x190
[    1.301353]  device_initial_probe+0x14/0x20
[    1.305734]  bus_probe_device+0xac/0xb0
[    1.309752]  device_add+0x5d0/0x790
[    1.313408]  __spi_add_device+0x134/0x204
[    1.317606]  of_register_spi_device+0x3b4/0x590
[    1.322348]  spi_register_controller+0x47c/0x754
[    1.327181]  devm_spi_register_controller+0x4c/0xa4
[    1.332289]  nxp_fspi_probe+0x1cc/0x2b0
[    1.336307]  platform_probe+0x68/0xc4
[    1.340145]  really_probe+0xbc/0x29c
[    1.343893]  __driver_probe_device+0x78/0x12c
[    1.348457]  driver_probe_device+0xd8/0x15c
[    1.352838]  __driver_attach+0x90/0x19c
[    1.356857]  bus_for_each_dev+0x7c/0xdc
[    1.360877]  driver_attach+0x24/0x30
[    1.364624]  bus_add_driver+0xe4/0x208
[    1.368552]  driver_register+0x5c/0x124
[    1.372573]  __platform_driver_register+0x28/0x34
[    1.377497]  nxp_fspi_driver_init+0x1c/0x28
[    1.381888]  do_one_initcall+0x80/0x1c8
[    1.385908]  kernel_init_freeable+0x1c4/0x28c
[    1.390472]  kernel_init+0x20/0x1d8
[    1.394138]  ret_from_fork+0x10/0x20
[    1.397885] ---[ end trace 0000000000000000 ]---
[    1.407908] ------------[ cut here ]------------

Fixes: ef89fd5 ("arm64: dts: imx8ulp: add flexspi node")
Cc: stable@kernel.org
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
amboar pushed a commit that referenced this pull request Jan 24, 2025
commit 9860370 upstream.

irq_chip functions may be called in raw spinlock context. Therefore, we
must also use a raw spinlock for our own internal locking.

This fixes the following lockdep splat:

[    5.349336] =============================
[    5.353349] [ BUG: Invalid wait context ]
[    5.357361] 6.13.0-rc5+ #69 Tainted: G        W
[    5.363031] -----------------------------
[    5.367045] kworker/u17:1/44 is trying to lock:
[    5.371587] ffffff88018b02c0 (&chip->gpio_lock){....}-{3:3}, at: xgpio_irq_unmask (drivers/gpio/gpio-xilinx.c:433 (discriminator 8))
[    5.380079] other info that might help us debug this:
[    5.385138] context-{5:5}
[    5.387762] 5 locks held by kworker/u17:1/44:
[    5.392123] #0: ffffff8800014958 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work (kernel/workqueue.c:3204)
[    5.402260] #1: ffffffc082fcbdd8 (deferred_probe_work){+.+.}-{0:0}, at: process_one_work (kernel/workqueue.c:3205)
[    5.411528] #2: ffffff880172c900 (&dev->mutex){....}-{4:4}, at: __device_attach (drivers/base/dd.c:1006)
[    5.419929] #3: ffffff88039c8268 (request_class#2){+.+.}-{4:4}, at: __setup_irq (kernel/irq/internals.h:156 kernel/irq/manage.c:1596)
[    5.428331] #4: ffffff88039c80c8 (lock_class#2){....}-{2:2}, at: __setup_irq (kernel/irq/manage.c:1614)
[    5.436472] stack backtrace:
[    5.439359] CPU: 2 UID: 0 PID: 44 Comm: kworker/u17:1 Tainted: G        W          6.13.0-rc5+ #69
[    5.448690] Tainted: [W]=WARN
[    5.451656] Hardware name: xlnx,zynqmp (DT)
[    5.455845] Workqueue: events_unbound deferred_probe_work_func
[    5.461699] Call trace:
[    5.464147] show_stack+0x18/0x24 C
[    5.467821] dump_stack_lvl (lib/dump_stack.c:123)
[    5.471501] dump_stack (lib/dump_stack.c:130)
[    5.474824] __lock_acquire (kernel/locking/lockdep.c:4828 kernel/locking/lockdep.c:4898 kernel/locking/lockdep.c:5176)
[    5.478758] lock_acquire (arch/arm64/include/asm/percpu.h:40 kernel/locking/lockdep.c:467 kernel/locking/lockdep.c:5851 kernel/locking/lockdep.c:5814)
[    5.482429] _raw_spin_lock_irqsave (include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162)
[    5.486797] xgpio_irq_unmask (drivers/gpio/gpio-xilinx.c:433 (discriminator 8))
[    5.490737] irq_enable (kernel/irq/internals.h:236 kernel/irq/chip.c:170 kernel/irq/chip.c:439 kernel/irq/chip.c:432 kernel/irq/chip.c:345)
[    5.494060] __irq_startup (kernel/irq/internals.h:241 kernel/irq/chip.c:180 kernel/irq/chip.c:250)
[    5.497645] irq_startup (kernel/irq/chip.c:270)
[    5.501143] __setup_irq (kernel/irq/manage.c:1807)
[    5.504728] request_threaded_irq (kernel/irq/manage.c:2208)

Fixes: a32c7ca ("gpio: gpio-xilinx: Add interrupt support")
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250110163354.2012654-1-sean.anderson@linux.dev
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants