Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

N910H compatibility #1

Open
Highlander11 opened this issue Dec 20, 2014 · 2 comments
Open

N910H compatibility #1

Highlander11 opened this issue Dec 20, 2014 · 2 comments
Labels

Comments

@Highlander11
Copy link

AndreiLux, will this Perseus kernel be compatible with the Note 4 910H variant as well? If not, what are the main differences from the 910C that cause the incompatibility?

@AndreiLux
Copy link
Owner

I've seen that people have been using it without issues AFAIK. The inherent hardware should be the same.

@Highlander11
Copy link
Author

Thank you, Andrei! I will try it and report back....

@AndreiLux AndreiLux changed the title Compatibility N910H compatibility Dec 21, 2014
AndreiLux pushed a commit that referenced this issue Feb 16, 2015
There are a couple of seq_files which use the single_open() interface.
This interface requires that the whole output must fit into a single
buffer.

E.g.  for /proc/stat allocation failures have been observed because an
order-4 memory allocation failed due to memory fragmentation.  In such
situations reading /proc/stat is not possible anymore.

Therefore change the seq_file code to fallback to vmalloc allocations
which will usually result in a couple of order-0 allocations and hence
also work if memory is fragmented.

For reference a call trace where reading from /proc/stat failed:

  sadc: page allocation failure: order:4, mode:0x1040d0
  CPU: 1 PID: 192063 Comm: sadc Not tainted 3.10.0-123.el7.s390x #1
  [...]
  Call Trace:
    show_stack+0x6c/0xe8
    warn_alloc_failed+0xd6/0x138
    __alloc_pages_nodemask+0x9da/0xb68
    __get_free_pages+0x2e/0x58
    kmalloc_order_trace+0x44/0xc0
    stat_open+0x5a/0xd8
    proc_reg_open+0x8a/0x140
    do_dentry_open+0x1bc/0x2c8
    finish_open+0x46/0x60
    do_last+0x382/0x10d0
    path_openat+0xc8/0x4f8
    do_filp_open+0x46/0xa8
    do_sys_open+0x114/0x1f0
    sysc_tracego+0x14/0x1a

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Tested-by: David Rientjes <rientjes@google.com>
Cc: Ian Kent <raven@themaw.net>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Thorsten Diehl <thorsten.diehl@de.ibm.com>
Cc: Andrea Righi <andrea@betterlinux.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Stefan Bader <stefan.bader@canonical.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Conflicts:
	fs/seq_file.c

Change-Id: I009080dd017b020ffd5e812e5b472bdb8349217a
AndreiLux pushed a commit that referenced this issue Feb 16, 2015
An issue was observed when a userspace task exits.
The page which hits error here is the zero page.
In binder mmap, the whole of vma is not mapped.
On a task crash, when debuggerd reads the binder regions,
the unmapped areas fall to do_anonymous_page in handle_pte_fault,
due to the absence of a vm_fault handler. This results in
zero page being mapped. Later in zap_pte_range, vm_normal_page
returns zero page in the case of VM_MIXEDMAP and it results in the
error.

BUG: Bad page map in process mediaserver  pte:9dff379f pmd:9bfbd831
page:c0ed8e60 count:1 mapcount:-1 mapping:  (null) index:0x0
page flags: 0x404(referenced|reserved)
addr:40c3f000 vm_flags:10220051 anon_vma:  (null) mapping:d9fe0764 index:fd
vma->vm_ops->fault:   (null)
vma->vm_file->f_op->mmap: binder_mmap+0x0/0x274
CPU: 0 PID: 1463 Comm: mediaserver Tainted: G        W    3.10.17+ #1
[<c001549c>] (unwind_backtrace+0x0/0x11c) from [<c001200c>] (show_stack+0x10/0x14)
[<c001200c>] (show_stack+0x10/0x14) from [<c0103d78>] (print_bad_pte+0x158/0x190)
[<c0103d78>] (print_bad_pte+0x158/0x190) from [<c01055f0>] (unmap_single_vma+0x2e4/0x598)
[<c01055f0>] (unmap_single_vma+0x2e4/0x598) from [<c010618c>] (unmap_vmas+0x34/0x50)
[<c010618c>] (unmap_vmas+0x34/0x50) from [<c010a9e4>] (exit_mmap+0xc8/0x1e8)
[<c010a9e4>] (exit_mmap+0xc8/0x1e8) from [<c00520f0>] (mmput+0x54/0xd0)
[<c00520f0>] (mmput+0x54/0xd0) from [<c005972c>] (do_exit+0x360/0x990)
[<c005972c>] (do_exit+0x360/0x990) from [<c0059ef0>] (do_group_exit+0x84/0xc0)
[<c0059ef0>] (do_group_exit+0x84/0xc0) from [<c0066de0>] (get_signal_to_deliver+0x4d4/0x548)
[<c0066de0>] (get_signal_to_deliver+0x4d4/0x548) from [<c0011500>] (do_signal+0xa8/0x3b8)

Add a vm_fault handler which returns VM_FAULT_SIGBUS, and prevents the
wrong fallback to do_anonymous_page.

Change-Id: I43c227e489f74f4907f199caf99f571b61883064
Signed-off-by: Vinayak Menon <vinayakm.list@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Feb 16, 2015
While disabling ConfigFS Android gadget, android_disconnect() calls
kill_all_hid_devices(), if CONFIG_USB_CONFIGFS_F_ACC is enabled, to free
the registered HIDs without checking whether the USB accessory device
really exist or not. If USB accessory device doesn't exist then we run into
following kernel panic:
----8<----
[  136.724761] Unable to handle kernel NULL pointer dereference at virtual address 00000064
[  136.724809] pgd = c0204000
[  136.731924] [00000064] *pgd=00000000
[  136.737830] Internal error: Oops: 5 [#1] SMP ARM
[  136.738108] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc4-00400-gf75300e-dirty #76
[  136.742788] task: c0fb19d8 ti: c0fa4000 task.ti: c0fa4000
[  136.750890] PC is at _raw_spin_lock_irqsave+0x24/0x60
[  136.756246] LR is at kill_all_hid_devices+0x24/0x114
---->8----

This patch adds a test to check if USB Accessory device exists before freeing HIDs.

Change-Id: Ie229feaf0de3f4f7a151fcaa9a994e34e15ff73b
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
(cherry picked from commit 32a71bce154cb89a549b9b7d28e8cf03b889d849)
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
commit 7c17705 upstream.

Nikita Yuschenko reported that booting a kernel with init=/bin/sh and
then nfs mounting without portmap or rpcbind running using a busybox
mount resulted in:

  # mount -t nfs 10.30.130.21:/opt /mnt
  svc: failed to register lockdv1 RPC service (errno 111).
  lockd_up: makesock failed, error=-111
  Unable to handle kernel paging request for data at address 0x00000030
  Faulting instruction address: 0xc055e65c
  Oops: Kernel access of bad area, sig: 11 [#1]
  MPC85xx CDS
  Modules linked in:
  CPU: 0 PID: 1338 Comm: mount Not tainted 3.10.44.cge #117
  task: cf29cea0 ti: cf35c000 task.ti: cf35c000
  NIP: c055e65c LR: c0566490 CTR: c055e648
  REGS: cf35dad0 TRAP: 0300   Not tainted  (3.10.44.cge)
  MSR: 00029000 <CE,EE,ME>  CR: 22442488  XER: 20000000
  DEAR: 00000030, ESR: 00000000

  GPR00: c05606f4 cf35db80 cf29cea0 cf0ded80 cf0dedb8 00000001 1dec3086
  00000000
  GPR08: 00000000 c07b1640 00000007 1dec3086 22442482 100b9758 00000000
  10090ae8
  GPR16: 00000000 000186a5 00000000 00000000 100c3018 bfa46edc 100b0000
  bfa46ef0
  GPR24: cf386ae0 c07834f0 00000000 c0565f88 00000001 cf0dedb8 00000000
  cf0ded80
  NIP [c055e65c] call_start+0x14/0x34
  LR [c0566490] __rpc_execute+0x70/0x250
  Call Trace:
  [cf35db80] [00000080] 0x80 (unreliable)
  [cf35dbb0] [c05606f4] rpc_run_task+0x9c/0xc4
  [cf35dbc0] [c0560840] rpc_call_sync+0x50/0xb8
  [cf35dbf0] [c056ee90] rpcb_register_call+0x54/0x84
  [cf35dc10] [c056f24c] rpcb_register+0xf8/0x10c
  [cf35dc70] [c0569e18] svc_unregister.isra.23+0x100/0x108
  [cf35dc90] [c0569e38] svc_rpcb_cleanup+0x18/0x30
  [cf35dca0] [c0198c5c] lockd_up+0x1dc/0x2e0
  [cf35dcd0] [c0195348] nlmclnt_init+0x2c/0xc8
  [cf35dcf0] [c015bb5c] nfs_start_lockd+0x98/0xec
  [cf35dd20] [c015ce6c] nfs_create_server+0x1e8/0x3f4
  [cf35dd90] [c0171590] nfs3_create_server+0x10/0x44
  [cf35dda0] [c016528c] nfs_try_mount+0x158/0x1e4
  [cf35de20] [c01670d0] nfs_fs_mount+0x434/0x8c8
  [cf35de70] [c00cd3bc] mount_fs+0x20/0xbc
  [cf35de90] [c00e4f88] vfs_kern_mount+0x50/0x104
  [cf35dec0] [c00e6e0c] do_mount+0x1d0/0x8e0
  [cf35df10] [c00e75ac] SyS_mount+0x90/0xd0
  [cf35df40] [c000ccf4] ret_from_syscall+0x0/0x3c

The addition of svc_shutdown_net() resulted in two calls to
svc_rpcb_cleanup(); the second is no longer necessary and crashes when
it calls rpcb_register_call with clnt=NULL.

Reported-by: Nikita Yushchenko <nyushchenko@dev.rtsoft.ru>
Fixes: 679b033 "lockd: ensure we tear down any live sockets when socket creation fails during lockd_up"
Acked-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
…erspace

commit 1aa2b3b upstream.

Running an OABI_COMPAT kernel on an SMP platform can lead to fun and
games with page aging.

If one CPU issues a swi instruction immediately before another CPU
decides to mkold the page containing the swi instruction, then we will
fault attempting to load the instruction during the vector_swi handler
in order to retrieve its immediate field. Since this fault is not
currently dealt with by our exception tables, this results in a panic:

  Unable to handle kernel paging request at virtual address 4020841c
  pgd = c490c000
  [4020841c] *pgd=84451831, *pte=bf05859d, *ppte=00000000
  Internal error: Oops: 17 [#1] PREEMPT SMP ARM
  Modules linked in: hid_sony(O)
  CPU: 1    Tainted: G        W  O  (3.4.0-perf-gf496dca-01162-gcbcc62b #1)
  PC is at vector_swi+0x28/0x88
  LR is at 0x40208420

This patch wraps all of the swi instruction loads with the USER macro
and provides a shared exception table entry which simply rewinds the
saved user PC and returns from the system call (without setting tbl, so
there's no worries with tracing or syscall restarting). Returning to
userspace will re-enter the page fault handler, from where we will
probably send SIGSEGV to the current task.

Reported-by: Wang, Yalin <yalin.wang@sonymobile.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
[ Upstream commit eed4d83 ]

Use dst_entry held by sk_dst_get() to retrieve tunnel's PMTU.

The dst_mtu(__sk_dst_get(tunnel->sock)) call was racy. __sk_dst_get()
could return NULL if tunnel->sock->sk_dst_cache was reset just before the
call, thus making dst_mtu() dereference a NULL pointer:

[ 1937.661598] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[ 1937.664005] IP: [<ffffffffa049db88>] pppol2tp_connect+0x33d/0x41e [l2tp_ppp]
[ 1937.664005] PGD daf0c067 PUD d9f93067 PMD 0
[ 1937.664005] Oops: 0000 [#1] SMP
[ 1937.664005] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables udp_tunnel pppoe pppox ppp_generic slhc deflate ctr twofish_generic twofish_x86_64_3way xts lrw gf128mul glue_helper twofish_x86_64 twofish_common blowfish_generic blowfish_x86_64 blowfish_common des_generic cbc xcbc rmd160 sha512_generic hmac crypto_null af_key xfrm_algo 8021q garp bridge stp llc tun atmtcp clip atm ext3 mbcache jbd iTCO_wdt coretemp kvm_intel iTCO_vendor_support kvm pcspkr evdev ehci_pci lpc_ich mfd_core i5400_edac edac_core i5k_amb shpchp button processor thermal_sys xfs crc32c_generic libcrc32c dm_mod usbhid sg hid sr_mod sd_mod cdrom crc_t10dif crct10dif_common ata_generic ahci ata_piix tg3 libahci libata uhci_hcd ptp ehci_hcd pps_core usbcore scsi_mod libphy usb_common [last unloaded: l2tp_core]
[ 1937.664005] CPU: 0 PID: 10022 Comm: l2tpstress Tainted: G           O   3.17.0-rc1 #1
[ 1937.664005] Hardware name: HP ProLiant DL160 G5, BIOS O12 08/22/2008
[ 1937.664005] task: ffff8800d8fda790 ti: ffff8800c43c4000 task.ti: ffff8800c43c4000
[ 1937.664005] RIP: 0010:[<ffffffffa049db88>]  [<ffffffffa049db88>] pppol2tp_connect+0x33d/0x41e [l2tp_ppp]
[ 1937.664005] RSP: 0018:ffff8800c43c7de8  EFLAGS: 00010282
[ 1937.664005] RAX: ffff8800da8a7240 RBX: ffff8800d8c64600 RCX: 000001c325a137b5
[ 1937.664005] RDX: 8c6318c6318c6320 RSI: 000000000000010c RDI: 0000000000000000
[ 1937.664005] RBP: ffff8800c43c7ea8 R08: 0000000000000000 R09: 0000000000000000
[ 1937.664005] R10: ffffffffa048e2c0 R11: ffff8800d8c64600 R12: ffff8800ca7a5000
[ 1937.664005] R13: ffff8800c439bf40 R14: 000000000000000c R15: 0000000000000009
[ 1937.664005] FS:  00007fd7f610f700(0000) GS:ffff88011a600000(0000) knlGS:0000000000000000
[ 1937.664005] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1937.664005] CR2: 0000000000000020 CR3: 00000000d9d75000 CR4: 00000000000027e0
[ 1937.664005] Stack:
[ 1937.664005]  ffffffffa049da80 ffff8800d8fda790 000000000000005b ffff880000000009
[ 1937.664005]  ffff8800daf3f200 0000000000000003 ffff8800c43c7e48 ffffffff81109b57
[ 1937.664005]  ffffffff81109b0e ffffffff8114c566 0000000000000000 0000000000000000
[ 1937.664005] Call Trace:
[ 1937.664005]  [<ffffffffa049da80>] ? pppol2tp_connect+0x235/0x41e [l2tp_ppp]
[ 1937.664005]  [<ffffffff81109b57>] ? might_fault+0x9e/0xa5
[ 1937.664005]  [<ffffffff81109b0e>] ? might_fault+0x55/0xa5
[ 1937.664005]  [<ffffffff8114c566>] ? rcu_read_unlock+0x1c/0x26
[ 1937.664005]  [<ffffffff81309196>] SYSC_connect+0x87/0xb1
[ 1937.664005]  [<ffffffff813e56f7>] ? sysret_check+0x1b/0x56
[ 1937.664005]  [<ffffffff8107590d>] ? trace_hardirqs_on_caller+0x145/0x1a1
[ 1937.664005]  [<ffffffff81213dee>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 1937.664005]  [<ffffffff8114c262>] ? spin_lock+0x9/0xb
[ 1937.664005]  [<ffffffff813092b4>] SyS_connect+0x9/0xb
[ 1937.664005]  [<ffffffff813e56d2>] system_call_fastpath+0x16/0x1b
[ 1937.664005] Code: 10 2a 84 81 e8 65 76 bd e0 65 ff 0c 25 10 bb 00 00 4d 85 ed 74 37 48 8b 85 60 ff ff ff 48 8b 80 88 01 00 00 48 8b b8 10 02 00 00 <48> 8b 47 20 ff 50 20 85 c0 74 0f 83 e8 28 89 83 10 01 00 00 89
[ 1937.664005] RIP  [<ffffffffa049db88>] pppol2tp_connect+0x33d/0x41e [l2tp_ppp]
[ 1937.664005]  RSP <ffff8800c43c7de8>
[ 1937.664005] CR2: 0000000000000020
[ 1939.559375] ---[ end trace 82d44500f28f8708 ]---

Fixes: f34c4a3 ("l2tp: take PMTU from tunnel UDP socket")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
The following happens when trying to run a kvm guest on a kernel
configured for 64k pages. This doesn't happen with 4k pages:

  BUG: failure at include/linux/mm.h:297/put_page_testzero()!
  Kernel panic - not syncing: BUG!
  CPU: 2 PID: 4228 Comm: qemu-system-aar Tainted: GF            3.13.0-0.rc7.31.sa2.k32v1.aarch64.debug #1
  Call trace:
  [<fffffe0000096034>] dump_backtrace+0x0/0x16c
  [<fffffe00000961b4>] show_stack+0x14/0x1c
  [<fffffe000066e648>] dump_stack+0x84/0xb0
  [<fffffe0000668678>] panic+0xf4/0x220
  [<fffffe000018ec78>] free_reserved_area+0x0/0x110
  [<fffffe000018edd8>] free_pages+0x50/0x88
  [<fffffe00000a759c>] kvm_free_stage2_pgd+0x30/0x40
  [<fffffe00000a5354>] kvm_arch_destroy_vm+0x18/0x44
  [<fffffe00000a1854>] kvm_put_kvm+0xf0/0x184
  [<fffffe00000a1938>] kvm_vm_release+0x10/0x1c
  [<fffffe00001edc1c>] __fput+0xb0/0x288
  [<fffffe00001ede4c>] ____fput+0xc/0x14
  [<fffffe00000d5a2c>] task_work_run+0xa8/0x11c
  [<fffffe0000095c14>] do_notify_resume+0x54/0x58

In arch/arm/kvm/mmu.c:unmap_range(), we end up doing an extra put_page()
on the stage2 pgd which leads to the BUG in put_page_testzero(). This
happens because a pud_huge() test in unmap_range() returns true when it
should always be false with 2-level pages tables used by 64k pages.
This patch removes support for huge puds if 2-level pagetables are
being used.

Signed-off-by: Mark Salter <msalter@redhat.com>
[catalin.marinas@arm.com: removed #ifndef around PUD_SIZE check]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: <stable@vger.kernel.org> # v3.11+

(cherry picked from commit 4797ec2)
Signed-off-by: Mark Brown <broonie@kernel.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
commit 35425ea upstream.

Christopher Head 2014-06-28 05:26:20 UTC described:
"I tried to reproduce this on 3.12.21. Instead, when I do "echo hello > foo"
in an ecryptfs mount with ecryptfs_xattr specified, I get a kernel crash:

BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: [<ffffffff8110eb39>] fsstack_copy_attr_all+0x2/0x61
PGD d7840067 PUD b2c3c067 PMD 0
Oops: 0002 [#1] SMP
Modules linked in: nvidia(PO)
CPU: 3 PID: 3566 Comm: bash Tainted: P           O 3.12.21-gentoo-r1 #2
Hardware name: ASUSTek Computer Inc. G60JX/G60JX, BIOS 206 03/15/2010
task: ffff8801948944c0 ti: ffff8800bad70000 task.ti: ffff8800bad70000
RIP: 0010:[<ffffffff8110eb39>]  [<ffffffff8110eb39>] fsstack_copy_attr_all+0x2/0x61
RSP: 0018:ffff8800bad71c10  EFLAGS: 00010246
RAX: 00000000000181a4 RBX: ffff880198648480 RCX: 0000000000000000
RDX: 0000000000000004 RSI: ffff880172010450 RDI: 0000000000000000
RBP: ffff880198490e40 R08: 0000000000000000 R09: 0000000000000000
R10: ffff880172010450 R11: ffffea0002c51e80 R12: 0000000000002000
R13: 000000000000001a R14: 0000000000000000 R15: ffff880198490e40
FS:  00007ff224caa700(0000) GS:ffff88019fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000000bb07f000 CR4: 00000000000007e0
Stack:
ffffffff811826e8 ffff8800a39d8000 0000000000000000 000000000000001a
ffff8800a01d0000 ffff8800a39d8000 ffffffff81185fd5 ffffffff81082c2c
00000001a39d8000 53d0abbc98490e40 0000000000000037 ffff8800a39d8220
Call Trace:
[<ffffffff811826e8>] ? ecryptfs_setxattr+0x40/0x52
[<ffffffff81185fd5>] ? ecryptfs_write_metadata+0x1b3/0x223
[<ffffffff81082c2c>] ? should_resched+0x5/0x23
[<ffffffff8118322b>] ? ecryptfs_initialize_file+0xaf/0xd4
[<ffffffff81183344>] ? ecryptfs_create+0xf4/0x142
[<ffffffff810f8c0d>] ? vfs_create+0x48/0x71
[<ffffffff810f9c86>] ? do_last.isra.68+0x559/0x952
[<ffffffff810f7ce7>] ? link_path_walk+0xbd/0x458
[<ffffffff810fa2a3>] ? path_openat+0x224/0x472
[<ffffffff810fa7bd>] ? do_filp_open+0x2b/0x6f
[<ffffffff81103606>] ? __alloc_fd+0xd6/0xe7
[<ffffffff810ee6ab>] ? do_sys_open+0x65/0xe9
[<ffffffff8157d022>] ? system_call_fastpath+0x16/0x1b
RIP  [<ffffffff8110eb39>] fsstack_copy_attr_all+0x2/0x61
RSP <ffff8800bad71c10>
CR2: 0000000000000000
---[ end trace df9dba5f1ddb8565 ]---"

If we create a file when we mount with ecryptfs_xattr_metadata option, we will
encounter a crash in this path:
->ecryptfs_create
  ->ecryptfs_initialize_file
    ->ecryptfs_write_metadata
      ->ecryptfs_write_metadata_to_xattr
        ->ecryptfs_setxattr
          ->fsstack_copy_attr_all
It's because our dentry->d_inode used in fsstack_copy_attr_all is NULL, and it
will be initialized when ecryptfs_initialize_file finish.

So we should skip copying attr from lower inode when the value of ->d_inode is
invalid.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
commit 086ba77 upstream.

ARM has some private syscalls (for example, set_tls(2)) which lie
outside the range of NR_syscalls.  If any of these are called while
syscall tracing is being performed, out-of-bounds array access will
occur in the ftrace and perf sys_{enter,exit} handlers.

 # trace-cmd record -e raw_syscalls:* true && trace-cmd report
 ...
 true-653   [000]   384.675777: sys_enter:            NR 192 (0, 1000, 3, 4000022, ffffffff, 0)
 true-653   [000]   384.675812: sys_exit:             NR 192 = 1995915264
 true-653   [000]   384.675971: sys_enter:            NR 983045 (76f74480, 76f74000, 76f74b28, 76f74480, 76f76f74, 1)
 true-653   [000]   384.675988: sys_exit:             NR 983045 = 0
 ...

 # trace-cmd record -e syscalls:* true
 [   17.289329] Unable to handle kernel paging request at virtual address aaaaaace
 [   17.289590] pgd = 9e71c000
 [   17.289696] [aaaaaace] *pgd=00000000
 [   17.289985] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
 [   17.290169] Modules linked in:
 [   17.290391] CPU: 0 PID: 704 Comm: true Not tainted 3.18.0-rc2+ #21
 [   17.290585] task: 9f4dab00 ti: 9e710000 task.ti: 9e710000
 [   17.290747] PC is at ftrace_syscall_enter+0x48/0x1f8
 [   17.290866] LR is at syscall_trace_enter+0x124/0x184

Fix this by ignoring out-of-NR_syscalls-bounds syscall numbers.

Commit cd0980f "tracing: Check invalid syscall nr while tracing syscalls"
added the check for less than zero, but it should have also checked
for greater than NR_syscalls.

Link: http://lkml.kernel.org/p/1414620418-29472-1-git-send-email-rabin@rab.in

Fixes: cd0980f "tracing: Check invalid syscall nr while tracing syscalls"
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
commit 3b1deef upstream.

evm_inode_setxattr() can be called with no value. The function does not
check the length so that following command can be used to produce the
kernel oops: setfattr -n security.evm FOO. This patch fixes it.

Changes in v3:
* there is no reason to return different error codes for EVM_XATTR_HMAC
  and non EVM_XATTR_HMAC. Remove unnecessary test then.

Changes in v2:
* testing for validity of xattr type

[ 1106.396921] BUG: unable to handle kernel NULL pointer dereference at           (null)
[ 1106.398192] IP: [<ffffffff812af7b8>] evm_inode_setxattr+0x2a/0x48
[ 1106.399244] PGD 29048067 PUD 290d7067 PMD 0
[ 1106.399953] Oops: 0000 [#1] SMP
[ 1106.400020] Modules linked in: bridge stp llc evdev serio_raw i2c_piix4 button fuse
[ 1106.400020] CPU: 0 PID: 3635 Comm: setxattr Not tainted 3.16.0-kds+ #2936
[ 1106.400020] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 1106.400020] task: ffff8800291a0000 ti: ffff88002917c000 task.ti: ffff88002917c000
[ 1106.400020] RIP: 0010:[<ffffffff812af7b8>]  [<ffffffff812af7b8>] evm_inode_setxattr+0x2a/0x48
[ 1106.400020] RSP: 0018:ffff88002917fd50  EFLAGS: 00010246
[ 1106.400020] RAX: 0000000000000000 RBX: ffff88002917fdf8 RCX: 0000000000000000
[ 1106.400020] RDX: 0000000000000000 RSI: ffffffff818136d3 RDI: ffff88002917fdf8
[ 1106.400020] RBP: ffff88002917fd68 R08: 0000000000000000 R09: 00000000003ec1df
[ 1106.400020] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800438a0a00
[ 1106.400020] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 1106.400020] FS:  00007f7dfa7d7740(0000) GS:ffff88005da00000(0000) knlGS:0000000000000000
[ 1106.400020] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1106.400020] CR2: 0000000000000000 CR3: 000000003763e000 CR4: 00000000000006f0
[ 1106.400020] Stack:
[ 1106.400020]  ffff8800438a0a00 ffff88002917fdf8 0000000000000000 ffff88002917fd98
[ 1106.400020]  ffffffff812a1030 ffff8800438a0a00 ffff88002917fdf8 0000000000000000
[ 1106.400020]  0000000000000000 ffff88002917fde0 ffffffff8116d08a ffff88002917fdc8
[ 1106.400020] Call Trace:
[ 1106.400020]  [<ffffffff812a1030>] security_inode_setxattr+0x5d/0x6a
[ 1106.400020]  [<ffffffff8116d08a>] vfs_setxattr+0x6b/0x9f
[ 1106.400020]  [<ffffffff8116d1e0>] setxattr+0x122/0x16c
[ 1106.400020]  [<ffffffff811687e8>] ? mnt_want_write+0x21/0x45
[ 1106.400020]  [<ffffffff8114d011>] ? __sb_start_write+0x10f/0x143
[ 1106.400020]  [<ffffffff811687e8>] ? mnt_want_write+0x21/0x45
[ 1106.400020]  [<ffffffff811687c0>] ? __mnt_want_write+0x48/0x4f
[ 1106.400020]  [<ffffffff8116d3e6>] SyS_setxattr+0x6e/0xb0
[ 1106.400020]  [<ffffffff81529da9>] system_call_fastpath+0x16/0x1b
[ 1106.400020] Code: c3 0f 1f 44 00 00 55 48 89 e5 41 55 49 89 d5 41 54 49 89 fc 53 48 89 f3 48 c7 c6 d3 36 81 81 48 89 df e8 18 22 04 00 85 c0 75 07 <41> 80 7d 00 02 74 0d 48 89 de 4c 89 e7 e8 5a fe ff ff eb 03 83
[ 1106.400020] RIP  [<ffffffff812af7b8>] evm_inode_setxattr+0x2a/0x48
[ 1106.400020]  RSP <ffff88002917fd50>
[ 1106.400020] CR2: 0000000000000000
[ 1106.428061] ---[ end trace ae08331628ba3050 ]---

Reported-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
commit 1bf1890 upstream.

I ran into this error after a ubiupdatevol, because I forgot to backport
e911036 UBI: fix the volumes tree sorting criteria.

UBI error: process_pool_aeb: orphaned volume in fastmap pool
UBI error: ubi_scan_fastmap: Attach by fastmap failed, doing a full scan!
kmem_cache_destroy ubi_ainf_peb_slab: Slab cache still has objects
CPU: 0 PID: 1 Comm: swapper Not tainted 3.14.18-00053-gf05cac8dbf85 #1
[<c000d298>] (unwind_backtrace) from [<c000baa8>] (show_stack+0x10/0x14)
[<c000baa8>] (show_stack) from [<c01b7a68>] (destroy_ai+0x230/0x244)
[<c01b7a68>] (destroy_ai) from [<c01b8fd4>] (ubi_attach+0x98/0x1ec)
[<c01b8fd4>] (ubi_attach) from [<c01ade90>] (ubi_attach_mtd_dev+0x2b8/0x868)
[<c01ade90>] (ubi_attach_mtd_dev) from [<c038b510>] (ubi_init+0x1dc/0x2ac)
[<c038b510>] (ubi_init) from [<c0008860>] (do_one_initcall+0x94/0x140)
[<c0008860>] (do_one_initcall) from [<c037aadc>] (kernel_init_freeable+0xe8/0x1b0)
[<c037aadc>] (kernel_init_freeable) from [<c02730ac>] (kernel_init+0x8/0xe4)
[<c02730ac>] (kernel_init) from [<c00093f0>] (ret_from_fork+0x14/0x24)
UBI: scanning is finished

Freeing the cache in the error path fixes the Slab error.

Tested on at91sam9g35 (3.14.18+fastmap backports)

Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
commit d3051b4 upstream.

A panic was seen in the following sitation.

There are two threads running on the system. The first thread is a system
monitoring thread that is reading /proc/modules. The second thread is
loading and unloading a module (in this example I'm using my simple
dummy-module.ko).  Note, in the "real world" this occurred with the qlogic
driver module.

When doing this, the following panic occurred:

 ------------[ cut here ]------------
 kernel BUG at kernel/module.c:3739!
 invalid opcode: 0000 [#1] SMP
 Modules linked in: binfmt_misc sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw igb gf128mul glue_helper iTCO_wdt iTCO_vendor_support ablk_helper ptp sb_edac cryptd pps_core edac_core shpchp i2c_i801 pcspkr wmi lpc_ich ioatdma mfd_core dca ipmi_si nfsd ipmi_msghandler auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sr_mod cdrom sd_mod crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm isci drm libsas ahci libahci scsi_transport_sas libata i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: dummy_module]
 CPU: 37 PID: 186343 Comm: cat Tainted: GF          O--------------   3.10.0+ #7
 Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.00.29.D696.1311111329 11/11/2013
 task: ffff8807fd2d8000 ti: ffff88080fa7c000 task.ti: ffff88080fa7c000
 RIP: 0010:[<ffffffff810d64c5>]  [<ffffffff810d64c5>] module_flags+0xb5/0xc0
 RSP: 0018:ffff88080fa7fe18  EFLAGS: 00010246
 RAX: 0000000000000003 RBX: ffffffffa03b5200 RCX: 0000000000000000
 RDX: 0000000000001000 RSI: ffff88080fa7fe38 RDI: ffffffffa03b5000
 RBP: ffff88080fa7fe28 R08: 0000000000000010 R09: 0000000000000000
 R10: 0000000000000000 R11: 000000000000000f R12: ffffffffa03b5000
 R13: ffffffffa03b5008 R14: ffffffffa03b5200 R15: ffffffffa03b5000
 FS:  00007f6ae57ef740(0000) GS:ffff88101e7a0000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000404f70 CR3: 0000000ffed48000 CR4: 00000000001407e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Stack:
  ffffffffa03b5200 ffff8810101e4800 ffff88080fa7fe70 ffffffff810d666c
  ffff88081e807300 000000002e0f2fbf 0000000000000000 ffff88100f257b00
  ffffffffa03b5008 ffff88080fa7ff48 ffff8810101e4800 ffff88080fa7fee0
 Call Trace:
  [<ffffffff810d666c>] m_show+0x19c/0x1e0
  [<ffffffff811e4d7e>] seq_read+0x16e/0x3b0
  [<ffffffff812281ed>] proc_reg_read+0x3d/0x80
  [<ffffffff811c0f2c>] vfs_read+0x9c/0x170
  [<ffffffff811c1a58>] SyS_read+0x58/0xb0
  [<ffffffff81605829>] system_call_fastpath+0x16/0x1b
 Code: 48 63 c2 83 c2 01 c6 04 03 29 48 63 d2 eb d9 0f 1f 80 00 00 00 00 48 63 d2 c6 04 13 2d 41 8b 0c 24 8d 50 02 83 f9 01 75 b2 eb cb <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
 RIP  [<ffffffff810d64c5>] module_flags+0xb5/0xc0
  RSP <ffff88080fa7fe18>

    Consider the two processes running on the system.

    CPU 0 (/proc/modules reader)
    CPU 1 (loading/unloading module)

    CPU 0 opens /proc/modules, and starts displaying data for each module by
    traversing the modules list via fs/seq_file.c:seq_open() and
    fs/seq_file.c:seq_read().  For each module in the modules list, seq_read
    does

            op->start()  <-- this is a pointer to m_start()
            op->show()   <- this is a pointer to m_show()
            op->stop()   <-- this is a pointer to m_stop()

    The m_start(), m_show(), and m_stop() module functions are defined in
    kernel/module.c. The m_start() and m_stop() functions acquire and release
    the module_mutex respectively.

    ie) When reading /proc/modules, the module_mutex is acquired and released
    for each module.

    m_show() is called with the module_mutex held.  It accesses the module
    struct data and attempts to write out module data.  It is in this code
    path that the above BUG_ON() warning is encountered, specifically m_show()
    calls

    static char *module_flags(struct module *mod, char *buf)
    {
            int bx = 0;

            BUG_ON(mod->state == MODULE_STATE_UNFORMED);
    ...

    The other thread, CPU 1, in unloading the module calls the syscall
    delete_module() defined in kernel/module.c.  The module_mutex is acquired
    for a short time, and then released.  free_module() is called without the
    module_mutex.  free_module() then sets mod->state = MODULE_STATE_UNFORMED,
    also without the module_mutex.  Some additional code is called and then the
    module_mutex is reacquired to remove the module from the modules list:

        /* Now we can delete it from the lists */
        mutex_lock(&module_mutex);
        stop_machine(__unlink_module, mod, NULL);
        mutex_unlock(&module_mutex);

This is the sequence of events that leads to the panic.

CPU 1 is removing dummy_module via delete_module().  It acquires the
module_mutex, and then releases it.  CPU 1 has NOT set dummy_module->state to
MODULE_STATE_UNFORMED yet.

CPU 0, which is reading the /proc/modules, acquires the module_mutex and
acquires a pointer to the dummy_module which is still in the modules list.
CPU 0 calls m_show for dummy_module.  The check in m_show() for
MODULE_STATE_UNFORMED passed for dummy_module even though it is being
torn down.

Meanwhile CPU 1, which has been continuing to remove dummy_module without
holding the module_mutex, now calls free_module() and sets
dummy_module->state to MODULE_STATE_UNFORMED.

CPU 0 now calls module_flags() with dummy_module and ...

static char *module_flags(struct module *mod, char *buf)
{
        int bx = 0;

        BUG_ON(mod->state == MODULE_STATE_UNFORMED);

and BOOM.

Acquire and release the module_mutex lock around the setting of
MODULE_STATE_UNFORMED in the teardown path, which should resolve the
problem.

Testing: In the unpatched kernel I can panic the system within 1 minute by
doing

while (true) do insmod dummy_module.ko; rmmod dummy_module.ko; done

and

while (true) do cat /proc/modules; done

in separate terminals.

In the patched kernel I was able to run just over one hour without seeing
any issues.  I also verified the output of panic via sysrq-c and the output
of /proc/modules looks correct for all three states for the dummy_module.

        dummy_module 12661 0 - Unloading 0xffffffffa03a5000 (OE-)
        dummy_module 12661 0 - Live 0xffffffffa03bb000 (OE)
        dummy_module 14015 1 - Loading 0xffffffffa03a5000 (OE+)

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
commit 824269c upstream.

In older versions of the IIO framework it was possible to pass a
completely different set of channels to iio_buffer_register() as the one
that is assigned to the IIO device. Commit 959d295 ("staging:iio: make
iio_sw_buffer_preenable much more general.") introduced a restriction that
requires that the set of channels that is passed to iio_buffer_register() is
a subset of the channels assigned to the IIO device as the IIO core will use
the list of channels that is assigned to the device to lookup a channel by
scan index in iio_compute_scan_bytes(). If it can not find the channel the
function will crash. This patch fixes the issue by making sure that the same
set of channels is assigned to the IIO device and passed to
iio_buffer_register().

Fixes the follow NULL pointer derefernce kernel crash:
	Unable to handle kernel NULL pointer dereference at virtual address 00000016
	pgd = d53d0000
	[00000016] *pgd=1534e831, *pte=00000000, *ppte=00000000
	Internal error: Oops: 17 [#1] PREEMPT SMP ARM
	Modules linked in:
	CPU: 1 PID: 1626 Comm: bash Not tainted 3.15.0-19969-g2a180eb-dirty #9545
	task: d6c124c0 ti: d539a000 task.ti: d539a000
	PC is at iio_compute_scan_bytes+0x34/0xa8
	LR is at iio_compute_scan_bytes+0x34/0xa8
	pc : [<c03052e4>]    lr : [<c03052e4>]    psr: 60070013
	sp : d539beb8  ip : 00000001  fp : 00000000
	r10: 00000002  r9 : 00000000  r8 : 00000001
	r7 : 00000000  r6 : d6dc8800  r5 : d7571000  r4 : 00000002
	r3 : d7571000  r2 : 00000044  r1 : 00000001  r0 : 00000000
	Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
	Control: 18c5387d  Table: 153d004a  DAC: 00000015
	Process bash (pid: 1626, stack limit = 0xd539a240)
	Stack: (0xd539beb8 to 0xd539c000)
	bea0:                                                       c02fc0e4 d7571000
	bec0: d76c1640 d6dc8800 d757117c 00000000 d757112c c0305b04 d76c1690 d76c1640
	bee0: d7571188 00000002 00000000 d7571000 d539a000 00000000 000dd1c8 c0305d54
	bf00: d7571010 0160b868 00000002 c69d3900 d7573278 d7573308 c69d3900 c01ece90
	bf20: 00000002 c0103fac c0103f6c d539bf88 00000002 c69d3b00 c69d3b0c c0103468
	bf40: 00000000 00000000 d7694a00 00000002 000af408 d539bf88 c000dd84 c00b2f94
	bf60: d7694a00 000af408 00000002 d7694a00 d7694a00 00000002 000af408 c000dd84
	bf80: 00000000 c00b32d0 00000000 00000000 00000002 b6f1aa78 00000002 000af408
	bfa0: 00000004 c000dc00 b6f1aa78 00000002 00000001 000af408 00000002 00000000
	bfc0: b6f1aa78 00000002 000af408 00000004 be806a4c 000a6094 00000000 000dd1c8
	bfe0: 00000000 be8069cc b6e8ab77 b6ec125c 40070010 00000001 22940489 154a5007
	[<c03052e4>] (iio_compute_scan_bytes) from [<c0305b04>] (__iio_update_buffers+0x248/0x438)
	[<c0305b04>] (__iio_update_buffers) from [<c0305d54>] (iio_buffer_store_enable+0x60/0x7c)
	[<c0305d54>] (iio_buffer_store_enable) from [<c01ece90>] (dev_attr_store+0x18/0x24)
	[<c01ece90>] (dev_attr_store) from [<c0103fac>] (sysfs_kf_write+0x40/0x4c)
	[<c0103fac>] (sysfs_kf_write) from [<c0103468>] (kernfs_fop_write+0x110/0x154)
	[<c0103468>] (kernfs_fop_write) from [<c00b2f94>] (vfs_write+0xd0/0x160)
	[<c00b2f94>] (vfs_write) from [<c00b32d0>] (SyS_write+0x40/0x78)
	[<c00b32d0>] (SyS_write) from [<c000dc00>] (ret_fast_syscall+0x0/0x30)
	Code: ea00000e e1a01008 e1a00005 ebfff6fc (e5d0a016)

Fixes: 959d295 ("staging:iio: make iio_sw_buffer_preenable much more general.")
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
commit e105547 upstream.

In older versions of the IIO framework it was possible to pass a completely
different set of channels to iio_buffer_register() as the one that is
assigned to the IIO device. Commit 959d295 ("staging:iio: make
iio_sw_buffer_preenable much more general.") introduced a restriction that
requires that the set of channels that is passed to iio_buffer_register() is
a subset of the channels assigned to the IIO device as the IIO core will use
the list of channels that is assigned to the device to lookup a channel by
scan index in iio_compute_scan_bytes(). If it can not find the channel the
function will crash. This patch fixes the issue by making sure that the same
set of channels is assigned to the IIO device and passed to
iio_buffer_register().

Note that we need to remove the IIO_CHAN_INFO_RAW and IIO_CHAN_INFO_SCALE
info attributes from the channels since we don't actually want those to be
registered.

Fixes the following crash:
	Unable to handle kernel NULL pointer dereference at virtual address 00000016
	pgd = d2094000
	[00000016] *pgd=16e39831, *pte=00000000, *ppte=00000000
	Internal error: Oops: 17 [#1] PREEMPT SMP ARM
	Modules linked in:
	CPU: 1 PID: 1695 Comm: bash Not tainted 3.17.0-06329-g29461ee #9686
	task: d7768040 ti: d5bd4000 task.ti: d5bd4000
	PC is at iio_compute_scan_bytes+0x38/0xc0
	LR is at iio_compute_scan_bytes+0x34/0xc0
	pc : [<c0316de8>]    lr : [<c0316de4>]    psr: 60070013
	sp : d5bd5ec0  ip : 00000000  fp : 00000000
	r10: d769f934  r9 : 00000000  r8 : 00000001
	r7 : 00000000  r6 : c8fc6240  r5 : d769f800  r4 : 00000000
	r3 : d769f800  r2 : 00000000  r1 : ffffffff  r0 : 00000000
	Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
	Control: 18c5387d  Table: 1209404a  DAC: 00000015
	Process bash (pid: 1695, stack limit = 0xd5bd4240)
	Stack: (0xd5bd5ec0 to 0xd5bd6000)
	5ec0: d769f800 d7435640 c8fc6240 d769f984 00000000 c03175a4 d7435690 d7435640
	5ee0: d769f990 00000002 00000000 d769f800 d5bd4000 00000000 000b43a8 c03177f4
	5f00: d769f810 0162b8c8 00000002 c8fc7e00 d77f1d08 d77f1da8 c8fc7e00 c01faf1c
	5f20: 00000002 c010694c c010690c d5bd5f88 00000002 c8fc6840 c8fc684c c0105e08
	5f40: 00000000 00000000 d20d1580 00000002 000af408 d5bd5f88 c000de84 c00b76d4
	5f60: d20d1580 000af408 00000002 d20d1580 d20d1580 00000002 000af408 c000de84
	5f80: 00000000 c00b7a44 00000000 00000000 00000002 b6ebea78 00000002 000af408
	5fa0: 00000004 c000dd00 b6ebea78 00000002 00000001 000af408 00000002 00000000
	5fc0: b6ebea78 00000002 000af408 00000004 bee96a4c 000a6094 00000000 000b43a8
	5fe0: 00000000 bee969cc b6e2eb77 b6e6525c 40070010 00000001 00000000 00000000
	[<c0316de8>] (iio_compute_scan_bytes) from [<c03175a4>] (__iio_update_buffers+0x248/0x438)
	[<c03175a4>] (__iio_update_buffers) from [<c03177f4>] (iio_buffer_store_enable+0x60/0x7c)
	[<c03177f4>] (iio_buffer_store_enable) from [<c01faf1c>] (dev_attr_store+0x18/0x24)
	[<c01faf1c>] (dev_attr_store) from [<c010694c>] (sysfs_kf_write+0x40/0x4c)
	[<c010694c>] (sysfs_kf_write) from [<c0105e08>] (kernfs_fop_write+0x110/0x154)
	[<c0105e08>] (kernfs_fop_write) from [<c00b76d4>] (vfs_write+0xbc/0x170)
	[<c00b76d4>] (vfs_write) from [<c00b7a44>] (SyS_write+0x40/0x78)
	[<c00b7a44>] (SyS_write) from [<c000dd00>] (ret_fast_syscall+0x0/0x30)

Fixes: 959d295 ("staging:iio: make iio_sw_buffer_preenable much more general.")
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
[ Upstream commit bfa6b18 ]

Currently, there's no guarantee that udc->driver
will be valid when using soft_connect sysfs
interface. In fact, we can very easily trigger
a NULL pointer dereference by trying to disconnect
when a gadget driver isn't loaded.

Fix this bug:

~# echo disconnect > soft_connect
[   33.685743] Unable to handle kernel NULL pointer dereference at virtual address 00000014
[   33.694221] pgd = ed0cc000
[   33.697174] [00000014] *pgd=ae351831, *pte=00000000, *ppte=00000000
[   33.703766] Internal error: Oops: 17 [#1] SMP ARM
[   33.708697] Modules linked in: xhci_plat_hcd xhci_hcd snd_soc_davinci_mcasp snd_soc_tlv320aic3x snd_soc_edma snd_soc_omap snd_soc_evm snd_soc_core dwc3 snd_compress snd_pcm_dmaengine snd_pcm snd_timer snd lis3lv02d_i2c matrix_keypad lis3lv02d dwc3_omap input_polldev soundcore
[   33.734372] CPU: 0 PID: 1457 Comm: bash Not tainted 3.17.0-09740-ga93416e-dirty #345
[   33.742457] task: ee71ce00 ti: ee68a000 task.ti: ee68a000
[   33.748116] PC is at usb_udc_softconn_store+0xa4/0xec
[   33.753416] LR is at mark_held_locks+0x78/0x90
[   33.758057] pc : [<c04df128>]    lr : [<c00896a4>]    psr: 20000013
[   33.758057] sp : ee68bec8  ip : c0c00008  fp : ee68bee4
[   33.770050] r10: ee6b394c  r9 : ee68bf80  r8 : ee6062c0
[   33.775508] r7 : 00000000  r6 : ee6062c0  r5 : 0000000b  r4 : ee739408
[   33.782346] r3 : 00000000  r2 : 00000000  r1 : ee71d390  r0 : ee664170
[   33.789168] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[   33.796636] Control: 10c5387d  Table: ad0cc059  DAC: 00000015
[   33.802638] Process bash (pid: 1457, stack limit = 0xee68a248)
[   33.808740] Stack: (0xee68bec8 to 0xee68c000)
[   33.813299] bec0:                   0000000b c0411284 ee6062c0 00000000 ee68bef4 ee68bee8
[   33.821862] bee0: c04112ac c04df090 ee68bf14 ee68bef8 c01c2868 c0411290 0000000b ee6b3940
[   33.830419] bf00: 00000000 00000000 ee68bf4c ee68bf18 c01c1a24 c01c2818 00000000 00000000
[   33.838990] bf20: ee61b940 ee2f47c0 0000000b 000ce408 ee68bf80 c000f304 ee68a000 00000000
[   33.847544] bf40: ee68bf7c ee68bf50 c0152dd8 c01c1960 ee68bf7c c0170af8 ee68bf7c ee2f47c0
[   33.856099] bf60: ee2f47c0 000ce408 0000000b c000f304 ee68bfa4 ee68bf80 c0153330 c0152d34
[   33.864653] bf80: 00000000 00000000 0000000b 000ce408 b6e7fb50 00000004 00000000 ee68bfa8
[   33.873204] bfa0: c000f080 c01532e8 0000000b 000ce408 00000001 000ce408 0000000b 00000000
[   33.881763] bfc0: 0000000b 000ce408 b6e7fb50 00000004 0000000b 00000000 000c5758 00000000
[   33.890319] bfe0: 00000000 bec2c924 b6de422d b6e1d226 40000030 00000001 75716d2f 00657565
[   33.898890] [<c04df128>] (usb_udc_softconn_store) from [<c04112ac>] (dev_attr_store+0x28/0x34)
[   33.907920] [<c04112ac>] (dev_attr_store) from [<c01c2868>] (sysfs_kf_write+0x5c/0x60)
[   33.916200] [<c01c2868>] (sysfs_kf_write) from [<c01c1a24>] (kernfs_fop_write+0xd0/0x194)
[   33.924773] [<c01c1a24>] (kernfs_fop_write) from [<c0152dd8>] (vfs_write+0xb0/0x1bc)
[   33.932874] [<c0152dd8>] (vfs_write) from [<c0153330>] (SyS_write+0x54/0xb0)
[   33.940247] [<c0153330>] (SyS_write) from [<c000f080>] (ret_fast_syscall+0x0/0x48)
[   33.948160] Code: e1a01007 e12fff33 e5140004 e5143008 (e5933014)
[   33.954625] ---[ end trace f849bead94eab7ea ]---

Fixes: 2ccea03 (usb: gadget: introduce UDC Class)
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
commit e4a60d1 upstream.

There is a race condition when removing glue directory.
It can be reproduced in following test:

path 1: Add first child device
device_add()
    get_device_parent()
            /*find parent from glue_dirs.list*/
            list_for_each_entry(k, &dev->class->p->glue_dirs.list, entry)
                    if (k->parent == parent_kobj) {
                            kobj = kobject_get(k);
                            break;
                    }
            ....
            class_dir_create_and_add()

path2: Remove last child device under glue dir
device_del()
    cleanup_device_parent()
            cleanup_glue_dir()
                    kobject_put(glue_dir);

If path2 has been called cleanup_glue_dir(), but not
call kobject_put(glue_dir), the glue dir is still
in parent's kset list. Meanwhile, path1 find the glue
dir from the glue_dirs.list. Path2 may release glue dir
before path1 call kobject_get(). So kernel will report
the warning and bug_on.

This is a "classic" problem we have of a kref in a list
that can be found while the last instance could be removed
at the same time.

This patch reuse gdp_mutex to fix this race condition.

The following calltrace is captured in kernel 3.4, but
the latest kernel still has this bug.

-----------------------------------------------------
<4>[ 3965.441471] WARNING: at ...include/linux/kref.h:41 kobject_get+0x33/0x40()
<4>[ 3965.441474] Hardware name: Romley
<4>[ 3965.441475] Modules linked in: isd_iop(O) isd_xda(O)...
...
<4>[ 3965.441605] Call Trace:
<4>[ 3965.441611]  [<ffffffff8103717a>] warn_slowpath_common+0x7a/0xb0
<4>[ 3965.441615]  [<ffffffff810371c5>] warn_slowpath_null+0x15/0x20
<4>[ 3965.441618]  [<ffffffff81215963>] kobject_get+0x33/0x40
<4>[ 3965.441624]  [<ffffffff812d1e45>] get_device_parent.isra.11+0x135/0x1f0
<4>[ 3965.441627]  [<ffffffff812d22d4>] device_add+0xd4/0x6d0
<4>[ 3965.441631]  [<ffffffff812d0dbc>] ? dev_set_name+0x3c/0x40
....
<2>[ 3965.441912] kernel BUG at ..../fs/sysfs/group.c:65!
<4>[ 3965.441915] invalid opcode: 0000 [#1] SMP
...
<4>[ 3965.686743]  [<ffffffff811a677e>] sysfs_create_group+0xe/0x10
<4>[ 3965.686748]  [<ffffffff810cfb04>] blk_trace_init_sysfs+0x14/0x20
<4>[ 3965.686753]  [<ffffffff811fcabb>] blk_register_queue+0x3b/0x120
<4>[ 3965.686756]  [<ffffffff812030bc>] add_disk+0x1cc/0x490
....
-------------------------------------------------------

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Weng Meiling <wengmeiling.weng@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
…formed packet

[ Upstream commit e40607c ]

An SCTP server doing ASCONF will panic on malformed INIT ping-of-death
in the form of:

  ------------ INIT[PARAM: SET_PRIMARY_IP] ------------>

While the INIT chunk parameter verification dissects through many things
in order to detect malformed input, it misses to actually check parameters
inside of parameters. E.g. RFC5061, section 4.2.4 proposes a 'set primary
IP address' parameter in ASCONF, which has as a subparameter an address
parameter.

So an attacker may send a parameter type other than SCTP_PARAM_IPV4_ADDRESS
or SCTP_PARAM_IPV6_ADDRESS, param_type2af() will subsequently return 0
and thus sctp_get_af_specific() returns NULL, too, which we then happily
dereference unconditionally through af->from_addr_param().

The trace for the log:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
IP: [<ffffffffa01e9c62>] sctp_process_init+0x492/0x990 [sctp]
PGD 0
Oops: 0000 [#1] SMP
[...]
Pid: 0, comm: swapper Not tainted 2.6.32-504.el6.x86_64 #1 Bochs Bochs
RIP: 0010:[<ffffffffa01e9c62>]  [<ffffffffa01e9c62>] sctp_process_init+0x492/0x990 [sctp]
[...]
Call Trace:
 <IRQ>
 [<ffffffffa01f2add>] ? sctp_bind_addr_copy+0x5d/0xe0 [sctp]
 [<ffffffffa01e1fcb>] sctp_sf_do_5_1B_init+0x21b/0x340 [sctp]
 [<ffffffffa01e3751>] sctp_do_sm+0x71/0x1210 [sctp]
 [<ffffffffa01e5c09>] ? sctp_endpoint_lookup_assoc+0xc9/0xf0 [sctp]
 [<ffffffffa01e61f6>] sctp_endpoint_bh_rcv+0x116/0x230 [sctp]
 [<ffffffffa01ee986>] sctp_inq_push+0x56/0x80 [sctp]
 [<ffffffffa01fcc42>] sctp_rcv+0x982/0xa10 [sctp]
 [<ffffffffa01d5123>] ? ipt_local_in_hook+0x23/0x28 [iptable_filter]
 [<ffffffff8148bdc9>] ? nf_iterate+0x69/0xb0
 [<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0
 [<ffffffff8148bf86>] ? nf_hook_slow+0x76/0x120
 [<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0
[...]

A minimal way to address this is to check for NULL as we do on all
other such occasions where we know sctp_get_af_specific() could
possibly return with NULL.

Fixes: d6de309 ("[SCTP]: Add the handling of "Set Primary IP Address" parameter to INIT")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
[ Upstream commit 7da89a2 ]

Meelis Roos reports crashes during bootup on a V480 that look like
this:

====================
[   61.300577] PCI: Scanning PBM /pci@9,600000
[   61.304867] schizo f009b070: PCI host bridge to bus 0003:00
[   61.310385] pci_bus 0003:00: root bus resource [io  0x7ffe9000000-0x7ffe9ffffff] (bus address [0x0000-0xffffff])
[   61.320515] pci_bus 0003:00: root bus resource [mem 0x7fb00000000-0x7fbffffffff] (bus address [0x00000000-0xffffffff])
[   61.331173] pci_bus 0003:00: root bus resource [bus 00]
[   61.385344] Unable to handle kernel NULL pointer dereference
[   61.390970] tsk->{mm,active_mm}->context = 0000000000000000
[   61.396515] tsk->{mm,active_mm}->pgd = fff000b000002000
[   61.401716]               \|/ ____ \|/
[   61.401716]               "@'/ .. \`@"
[   61.401716]               /_| \__/ |_\
[   61.401716]                  \__U_/
[   61.416362] swapper/0(0): Oops [#1]
[   61.419837] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc1-00422-g2cc9188-dirty #24
[   61.427975] task: fff000b0fd8e9c40 ti: fff000b0fd928000 task.ti: fff000b0fd928000
[   61.435426] TSTATE: 0000004480e01602 TPC: 00000000004455e4 TNPC: 00000000004455e8 Y: 00000000    Not tainted
[   61.445230] TPC: <schizo_pcierr_intr+0x104/0x560>
[   61.449897] g0: 0000000000000000 g1: 0000000000000000 g2: 0000000000a10f78 g3: 000000000000000a
[   61.458563] g4: fff000b0fd8e9c40 g5: fff000b0fdd82000 g6: fff000b0fd928000 g7: 000000000000000a
[   61.467229] o0: 000000000000003d o1: 0000000000000000 o2: 0000000000000006 o3: fff000b0ffa5fc7e
[   61.475894] o4: 0000000000060000 o5: c000000000000000 sp: fff000b0ffa5f3c1 ret_pc: 00000000004455cc
[   61.484909] RPC: <schizo_pcierr_intr+0xec/0x560>
[   61.489500] l0: fff000b0fd8e9c40 l1: 0000000000a20800 l2: 0000000000000000 l3: 000000000119a430
[   61.498164] l4: 0000000001742400 l5: 00000000011cfbe0 l6: 00000000011319c0 l7: fff000b0fd8ea348
[   61.506830] i0: 0000000000000000 i1: fff000b0fdb34000 i2: 0000000320000000 i3: 0000000000000000
[   61.515497] i4: 00060002010b003f i5: 0000040004e02000 i6: fff000b0ffa5f481 i7: 00000000004a9920
[   61.524175] I7: <handle_irq_event_percpu+0x40/0x140>
[   61.529099] Call Trace:
[   61.531531]  [00000000004a9920] handle_irq_event_percpu+0x40/0x140
[   61.537681]  [00000000004a9a58] handle_irq_event+0x38/0x80
[   61.543145]  [00000000004ac77c] handle_fasteoi_irq+0xbc/0x200
[   61.548860]  [00000000004a9084] generic_handle_irq+0x24/0x40
[   61.554500]  [000000000042be0c] handler_irq+0xac/0x100
====================

The problem is that pbm->pci_bus->self is NULL.

This code is trying to go through the standard PCI config space
interfaces to read the PCI controller's PCI_STATUS register.

This doesn't work, because we more often than not do not enumerate
the PCI controller as a bonafide PCI device during the OF device
node scan.  Therefore bus->self remains NULL.

Existing common code for PSYCHO and PSYCHO-like PCI controllers
handles this properly, by doing the config space access directly.

Do the same here, pbm->pci_ops->{read,write}().

Reported-by: Meelis Roos <mroos@linux.ee>
Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
[ Upstream commit ab5c780 ]

Otherwise rcu_irq_{enter,exit}() do not happen and we get dumps like:

====================
[  188.275021] ===============================
[  188.309351] [ INFO: suspicious RCU usage. ]
[  188.343737] 3.18.0-rc3-00068-g20f3963-dirty #54 Not tainted
[  188.394786] -------------------------------
[  188.429170] include/linux/rcupdate.h:883 rcu_read_lock() used
illegally while idle!
[  188.505235]
other info that might help us debug this:

[  188.554230]
RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 0
[  188.637587] RCU used illegally from extended quiescent state!
[  188.690684] 3 locks held by swapper/7/0:
[  188.721932]  #0:  (&x->wait#11){......}, at: [<0000000000495de8>] complete+0x8/0x60
[  188.797994]  #1:  (&p->pi_lock){-.-.-.}, at: [<000000000048510c>] try_to_wake_up+0xc/0x400
[  188.881343]  #2:  (rcu_read_lock){......}, at: [<000000000048a910>] select_task_rq_fair+0x90/0xb40
[  188.973043]stack backtrace:
[  188.993879] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 3.18.0-rc3-00068-g20f3963-dirty #54
[  189.076187] Call Trace:
[  189.089719]  [0000000000499360] lockdep_rcu_suspicious+0xe0/0x100
[  189.147035]  [000000000048a99c] select_task_rq_fair+0x11c/0xb40
[  189.202253]  [00000000004852d8] try_to_wake_up+0x1d8/0x400
[  189.252258]  [000000000048554c] default_wake_function+0xc/0x20
[  189.306435]  [0000000000495554] __wake_up_common+0x34/0x80
[  189.356448]  [00000000004955b4] __wake_up_locked+0x14/0x40
[  189.406456]  [0000000000495e08] complete+0x28/0x60
[  189.448142]  [0000000000636e28] blk_end_sync_rq+0x8/0x20
[  189.496057]  [0000000000639898] __blk_mq_end_request+0x18/0x60
[  189.550249]  [00000000006ee014] scsi_end_request+0x94/0x180
[  189.601286]  [00000000006ee334] scsi_io_completion+0x1d4/0x600
[  189.655463]  [00000000006e51c4] scsi_finish_command+0xc4/0xe0
[  189.708598]  [00000000006ed958] scsi_softirq_done+0x118/0x140
[  189.761735]  [00000000006398ec] __blk_mq_complete_request_remote+0xc/0x20
[  189.827383]  [00000000004c75d0] generic_smp_call_function_single_interrupt+0x150/0x1c0
[  189.906581]  [000000000043e514] smp_call_function_single_client+0x14/0x40
====================

Based almost entirely upon a patch by Paul E. McKenney.

Reported-by: Meelis Roos <mroos@linux.ee>
Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
commit aaef317 upstream.

Large (greater than 32k, the value of PAGE_ALLOC_COSTLY_ORDER) auth
tickets will have their buffers vmalloc'ed, which leads to the
following crash in crypto:

[   28.685082] BUG: unable to handle kernel paging request at ffffeb04000032c0
[   28.686032] IP: [<ffffffff81392b42>] scatterwalk_pagedone+0x22/0x80
[   28.686032] PGD 0
[   28.688088] Oops: 0000 [#1] PREEMPT SMP
[   28.688088] Modules linked in:
[   28.688088] CPU: 0 PID: 878 Comm: kworker/0:2 Not tainted 3.17.0-vm+ #305
[   28.688088] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[   28.688088] Workqueue: ceph-msgr con_work
[   28.688088] task: ffff88011a7f9030 ti: ffff8800d903c000 task.ti: ffff8800d903c000
[   28.688088] RIP: 0010:[<ffffffff81392b42>]  [<ffffffff81392b42>] scatterwalk_pagedone+0x22/0x80
[   28.688088] RSP: 0018:ffff8800d903f688  EFLAGS: 00010286
[   28.688088] RAX: ffffeb04000032c0 RBX: ffff8800d903f718 RCX: ffffeb04000032c0
[   28.688088] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8800d903f750
[   28.688088] RBP: ffff8800d903f688 R08: 00000000000007de R09: ffff8800d903f880
[   28.688088] R10: 18df467c72d6257b R11: 0000000000000000 R12: 0000000000000010
[   28.688088] R13: ffff8800d903f750 R14: ffff8800d903f8a0 R15: 0000000000000000
[   28.688088] FS:  00007f50a41c7700(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
[   28.688088] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   28.688088] CR2: ffffeb04000032c0 CR3: 00000000da3f3000 CR4: 00000000000006b0
[   28.688088] Stack:
[   28.688088]  ffff8800d903f698 ffffffff81392ca8 ffff8800d903f6e8 ffffffff81395d32
[   28.688088]  ffff8800dac96000 ffff880000000000 ffff8800d903f980 ffff880119b7e020
[   28.688088]  ffff880119b7e010 0000000000000000 0000000000000010 0000000000000010
[   28.688088] Call Trace:
[   28.688088]  [<ffffffff81392ca8>] scatterwalk_done+0x38/0x40
[   28.688088]  [<ffffffff81392ca8>] scatterwalk_done+0x38/0x40
[   28.688088]  [<ffffffff81395d32>] blkcipher_walk_done+0x182/0x220
[   28.688088]  [<ffffffff813990bf>] crypto_cbc_encrypt+0x15f/0x180
[   28.688088]  [<ffffffff81399780>] ? crypto_aes_set_key+0x30/0x30
[   28.688088]  [<ffffffff8156c40c>] ceph_aes_encrypt2+0x29c/0x2e0
[   28.688088]  [<ffffffff8156d2a3>] ceph_encrypt2+0x93/0xb0
[   28.688088]  [<ffffffff8156d7da>] ceph_x_encrypt+0x4a/0x60
[   28.688088]  [<ffffffff8155b39d>] ? ceph_buffer_new+0x5d/0xf0
[   28.688088]  [<ffffffff8156e837>] ceph_x_build_authorizer.isra.6+0x297/0x360
[   28.688088]  [<ffffffff8112089b>] ? kmem_cache_alloc_trace+0x11b/0x1c0
[   28.688088]  [<ffffffff8156b496>] ? ceph_auth_create_authorizer+0x36/0x80
[   28.688088]  [<ffffffff8156ed83>] ceph_x_create_authorizer+0x63/0xd0
[   28.688088]  [<ffffffff8156b4b4>] ceph_auth_create_authorizer+0x54/0x80
[   28.688088]  [<ffffffff8155f7c0>] get_authorizer+0x80/0xd0
[   28.688088]  [<ffffffff81555a8b>] prepare_write_connect+0x18b/0x2b0
[   28.688088]  [<ffffffff81559289>] try_read+0x1e59/0x1f10

This is because we set up crypto scatterlists as if all buffers were
kmalloc'ed.  Fix it.

Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
commit 5596b0b upstream.

[    1.904000] BUG: scheduling while atomic: swapper/1/0x00000002
[    1.908000] Modules linked in:
[    1.916000] CPU: 0 PID: 1 Comm: swapper Not tainted 3.12.0-rc2-lemote-los.git-5318619-dirty #1
[    1.920000] Stack : 0000000031aac000 ffffffff810d0000 0000000000000052 ffffffff802730a4
          0000000000000000 0000000000000001 ffffffff810cdf90 ffffffff810d0000
          ffffffff8068b968 ffffffff806f5537 ffffffff810cdf90 980000009f0782e8
          0000000000000001 ffffffff80720000 ffffffff806b0000 980000009f078000
          980000009f290000 ffffffff805f312c 980000009f05b5d8 ffffffff80233518
          980000009f05b5e8 ffffffff80274b7c 980000009f078000 ffffffff8068b968
          0000000000000000 0000000000000000 0000000000000000 0000000000000000
          0000000000000000 980000009f05b520 0000000000000000 ffffffff805f2f6c
          0000000000000000 ffffffff80700000 ffffffff80700000 ffffffff806fc758
          ffffffff80700000 ffffffff8020be98 ffffffff806fceb0 ffffffff805f2f6c
          ...
[    2.028000] Call Trace:
[    2.032000] [<ffffffff8020be98>] show_stack+0x80/0x98
[    2.036000] [<ffffffff805f2f6c>] __schedule_bug+0x44/0x6c
[    2.040000] [<ffffffff805fac58>] __schedule+0x518/0x5b0
[    2.044000] [<ffffffff805f8a58>] schedule_timeout+0x128/0x1f0
[    2.048000] [<ffffffff80240314>] msleep+0x3c/0x60
[    2.052000] [<ffffffff80495400>] do_probe+0x238/0x3a8
[    2.056000] [<ffffffff804958b0>] ide_probe_port+0x340/0x7e8
[    2.060000] [<ffffffff80496028>] ide_host_register+0x2d0/0x7a8
[    2.064000] [<ffffffff8049c65c>] ide_pci_init_two+0x4e4/0x790
[    2.068000] [<ffffffff8049f9b8>] amd74xx_probe+0x148/0x2c8
[    2.072000] [<ffffffff803f571c>] pci_device_probe+0xc4/0x130
[    2.076000] [<ffffffff80478f60>] driver_probe_device+0x98/0x270
[    2.080000] [<ffffffff80479298>] __driver_attach+0xe0/0xe8
[    2.084000] [<ffffffff80476ab0>] bus_for_each_dev+0x78/0xe0
[    2.088000] [<ffffffff80478468>] bus_add_driver+0x230/0x310
[    2.092000] [<ffffffff80479b44>] driver_register+0x84/0x158
[    2.096000] [<ffffffff80200504>] do_one_initcall+0x104/0x160

Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: linux-mips@linux-mips.org
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Patchwork: https://patchwork.linux-mips.org/patch/5941/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Alexandre Oliva <lxoliva@fsfla.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
commit 01a4cc4 upstream.

In some cases, the fcoe_rx_list may contains multiple instances
of the same skb (the so called "shared skbs").

the bnx2fc_l2_rcv thread is a loop that extracts a skb from the list,
modifies (and destroys) its content and then proceed to the next one.
The problem is that if the skb is shared, the remaining instances will
be corrupted.

The solution is to use skb_share_check() before adding the skb to the
fcoe_rx_list.

[ 6286.808725] ------------[ cut here ]------------
[ 6286.808729] WARNING: at include/scsi/fc_frame.h:173 bnx2fc_l2_rcv_thread+0x425/0x450 [bnx2fc]()
[ 6286.808748] Modules linked in: bnx2x(-) mdio dm_service_time bnx2fc cnic uio fcoe libfcoe 8021q garp stp mrp libfc llc scsi_transport_fc scsi_tgt sg iTCO_wdt iTCO_vendor_support coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel e1000e ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper ptp cryptd hpilo serio_raw hpwdt lpc_ich pps_core ipmi_si pcspkr mfd_core ipmi_msghandler shpchp pcc_cpufreq mperf nfsd auth_rpcgss nfs_acl lockd sunrpc dm_multipath xfs libcrc32c ata_generic pata_acpi sd_mod crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit ata_piix drm_kms_helper ttm drm libata i2c_core hpsa dm_mirror dm_region_hash dm_log dm_mod [last unloaded: mdio]
[ 6286.808750] CPU: 3 PID: 1304 Comm: bnx2fc_l2_threa Not tainted 3.10.0-121.el7.x86_64 #1
[ 6286.808750] Hardware name: HP ProLiant DL120 G7, BIOS J01 07/01/2013
[ 6286.808752]  0000000000000000 000000000b36e715 ffff8800deba1e00 ffffffff815ec0ba
[ 6286.808753]  ffff8800deba1e38 ffffffff8105dee1 ffffffffa05618c0 ffff8801e4c81888
[ 6286.808754]  ffffe8ffff663868 ffff8801f402b180 ffff8801f56bc000 ffff8800deba1e48
[ 6286.808754] Call Trace:
[ 6286.808759]  [<ffffffff815ec0ba>] dump_stack+0x19/0x1b
[ 6286.808762]  [<ffffffff8105dee1>] warn_slowpath_common+0x61/0x80
[ 6286.808763]  [<ffffffff8105e00a>] warn_slowpath_null+0x1a/0x20
[ 6286.808765]  [<ffffffffa054f415>] bnx2fc_l2_rcv_thread+0x425/0x450 [bnx2fc]
[ 6286.808767]  [<ffffffffa054eff0>] ? bnx2fc_disable+0x90/0x90 [bnx2fc]
[ 6286.808769]  [<ffffffff81085aef>] kthread+0xcf/0xe0
[ 6286.808770]  [<ffffffff81085a20>] ? kthread_create_on_node+0x140/0x140
[ 6286.808772]  [<ffffffff815fc76c>] ret_from_fork+0x7c/0xb0
[ 6286.808773]  [<ffffffff81085a20>] ? kthread_create_on_node+0x140/0x140
[ 6286.808774] ---[ end trace c6cdb939184ccb4e ]---

Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Acked-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
…h 3.18.0-rc6

commit f5475cc upstream.

I was unable too boot 3.18.0-rc6 because of the following kernel
panic in drm_calc_vbltimestamp_from_scanoutpos():

    [drm] Initialized drm 1.1.0 20060810
    [drm] radeon kernel modesetting enabled.
    [drm] initializing kernel modesetting (RV100 0x1002:0x515E 0x15D9:0x8080).
    [drm] register mmio base: 0xC8400000
    [drm] register mmio size: 65536
    radeon 0000:0b:01.0: VRAM: 128M 0x00000000D0000000 - 0x00000000D7FFFFFF (16M used)
    radeon 0000:0b:01.0: GTT: 512M 0x00000000B0000000 - 0x00000000CFFFFFFF
    [drm] Detected VRAM RAM=128M, BAR=128M
    [drm] RAM width 16bits DDR
    [TTM] Zone  kernel: Available graphics memory: 3829346 kiB
    [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
    [TTM] Initializing pool allocator
    [TTM] Initializing DMA pool allocator
    [drm] radeon: 16M of VRAM memory ready
    [drm] radeon: 512M of GTT memory ready.
    [drm] GART: num cpu pages 131072, num gpu pages 131072
    [drm] PCI GART of 512M enabled (table at 0x0000000037880000).
    radeon 0000:0b:01.0: WB disabled
    radeon 0000:0b:01.0: fence driver on ring 0 use gpu addr 0x00000000b0000000 and cpu addr 0xffff8800bbbfa000
    [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
    [drm] Driver supports precise vblank timestamp query.
    [drm] radeon: irq initialized.
    [drm] Loading R100 Microcode
    radeon 0000:0b:01.0: Direct firmware load for radeon/R100_cp.bin failed with error -2
    radeon_cp: Failed to load firmware "radeon/R100_cp.bin"
    [drm:r100_cp_init] *ERROR* Failed to load firmware!
    radeon 0000:0b:01.0: failed initializing CP (-2).
    radeon 0000:0b:01.0: Disabling GPU acceleration
    [drm] radeon: cp finalized
    BUG: unable to handle kernel NULL pointer dereference at 000000000000025c
    IP: [<ffffffff8150423b>] drm_calc_vbltimestamp_from_scanoutpos+0x4b/0x320
    PGD 0
    Oops: 0000 [#1] SMP
    Modules linked in:
    CPU: 1 PID: 1 Comm: swapper/0 Not tainted 3.18.0-rc6-4-default #2649
    Hardware name: Supermicro X7DB8/X7DB8, BIOS 6.00 07/26/2006
    task: ffff880234da2010 ti: ffff880234da4000 task.ti: ffff880234da4000
    RIP: 0010:[<ffffffff8150423b>]  [<ffffffff8150423b>] drm_calc_vbltimestamp_from_scanoutpos+0x4b/0x320
    RSP: 0000:ffff880234da7918  EFLAGS: 00010086
    RAX: ffffffff81557890 RBX: 0000000000000000 RCX: ffff880234da7a48
    RDX: ffff880234da79f4 RSI: 0000000000000000 RDI: ffff880232e15000
    RBP: ffff880234da79b8 R08: 0000000000000000 R09: 0000000000000000
    R10: 000000000000000a R11: 0000000000000001 R12: ffff880232dda1c0
    R13: ffff880232e1518c R14: 0000000000000292 R15: ffff880232e15000
    FS:  0000000000000000(0000) GS:ffff88023fc40000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 000000000000025c CR3: 0000000002014000 CR4: 00000000000007e0
    Stack:
     ffff880234da79d8 0000000000000286 ffff880232dcbc00 0000000000002480
     ffff880234da7958 0000000000000296 ffff880234da7998 ffffffff8151b51d
     ffff880234da7a48 0000000032dcbeb0 ffff880232dcbc00 ffff880232dcbc58
    Call Trace:
     [<ffffffff8151b51d>] ? drm_vma_offset_remove+0x1d/0x110
     [<ffffffff8152dc98>] radeon_get_vblank_timestamp_kms+0x38/0x60
     [<ffffffff8152076a>] ? ttm_bo_release_list+0xba/0x180
     [<ffffffff81503751>] drm_get_last_vbltimestamp+0x41/0x70
     [<ffffffff81503933>] vblank_disable_and_save+0x73/0x1d0
     [<ffffffff81106b2f>] ? try_to_del_timer_sync+0x4f/0x70
     [<ffffffff81505245>] drm_vblank_cleanup+0x65/0xa0
     [<ffffffff815604fa>] radeon_irq_kms_fini+0x1a/0x70
     [<ffffffff8156c07e>] r100_init+0x26e/0x410
     [<ffffffff8152ae3e>] radeon_device_init+0x7ae/0xb50
     [<ffffffff8152d57f>] radeon_driver_load_kms+0x8f/0x210
     [<ffffffff81506965>] drm_dev_register+0xb5/0x110
     [<ffffffff8150998f>] drm_get_pci_dev+0x8f/0x200
     [<ffffffff815291cd>] radeon_pci_probe+0xad/0xe0
     [<ffffffff8141a365>] local_pci_probe+0x45/0xa0
     [<ffffffff8141b741>] pci_device_probe+0xd1/0x130
     [<ffffffff81633dad>] driver_probe_device+0x12d/0x3e0
     [<ffffffff8163413b>] __driver_attach+0x9b/0xa0
     [<ffffffff816340a0>] ? __device_attach+0x40/0x40
     [<ffffffff81631cd3>] bus_for_each_dev+0x63/0xa0
     [<ffffffff8163378e>] driver_attach+0x1e/0x20
     [<ffffffff81633390>] bus_add_driver+0x180/0x240
     [<ffffffff81634914>] driver_register+0x64/0xf0
     [<ffffffff81419cac>] __pci_register_driver+0x4c/0x50
     [<ffffffff81509bf5>] drm_pci_init+0xf5/0x120
     [<ffffffff821dc871>] ? ttm_init+0x6a/0x6a
     [<ffffffff821dc908>] radeon_init+0x97/0xb5
     [<ffffffff810002fc>] do_one_initcall+0xbc/0x1f0
     [<ffffffff810e3278>] ? __wake_up+0x48/0x60
     [<ffffffff8218e256>] kernel_init_freeable+0x18a/0x215
     [<ffffffff8218d983>] ? initcall_blacklist+0xc0/0xc0
     [<ffffffff818a78f0>] ? rest_init+0x80/0x80
     [<ffffffff818a78fe>] kernel_init+0xe/0xf0
     [<ffffffff818c0c3c>] ret_from_fork+0x7c/0xb0
     [<ffffffff818a78f0>] ? rest_init+0x80/0x80
    Code: 45 ac 0f 88 a8 01 00 00 3b b7 d0 01 00 00 49 89 ff 0f 83 99 01 00 00 48 8b 47 20 48 8b 80 88 00 00 00 48 85 c0 0f 84 cd 01 00 00 <41> 8b b1 5c 02 00 00 41 8b 89 58 02 00 00 89 75 98 41 8b b1 60
    RIP  [<ffffffff8150423b>] drm_calc_vbltimestamp_from_scanoutpos+0x4b/0x320
     RSP <ffff880234da7918>
    CR2: 000000000000025c
    ---[ end trace ad2c0aadf48e2032 ]---
    Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009

It has helped me to add a NULL pointer check that was suggested at
http://lists.freedesktop.org/archives/dri-devel/2014-October/070663.html

I am not familiar with the code. But the change looks sane
and we need something fast at this stage of 3.18 development.

Suggested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Petr Mladek <pmladek@suse.cz>
Tested-by: Petr Mladek <pmladek@suse.cz>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
commit 66139a48cee1530c91f37c145384b4ee7043f0b7 upstream.

In snd_usbmidi_error_timer(), the driver tries to resubmit MIDI input
URBs to reactivate the MIDI stream, but this causes the error when
some of URBs are still pending like:

 WARNING: CPU: 0 PID: 0 at ../drivers/usb/core/urb.c:339 usb_submit_urb+0x5f/0x70()
 URB ef705c40 submitted while active
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.6-2-desktop #1
 Hardware name: FOXCONN TPS01/TPS01, BIOS 080015  03/23/2010
  c0984bfa f4009ed4 c078deaf f4009ee4 c024c884 c09a135c f4009f00 00000000
  c0984bfa 00000153 c061ac4f c061ac4f 00000009 00000001 ef705c40 e854d1c0
  f4009eec c024c8d3 00000009 f4009ee4 c09a135c f4009f00 f4009f04 c061ac4f
 Call Trace:
  [<c0205df6>] try_stack_unwind+0x156/0x170
  [<c020482a>] dump_trace+0x5a/0x1b0
  [<c0205e56>] show_trace_log_lvl+0x46/0x50
  [<c02049d1>] show_stack_log_lvl+0x51/0xe0
  [<c0205eb7>] show_stack+0x27/0x50
  [<c078deaf>] dump_stack+0x45/0x65
  [<c024c884>] warn_slowpath_common+0x84/0xa0
  [<c024c8d3>] warn_slowpath_fmt+0x33/0x40
  [<c061ac4f>] usb_submit_urb+0x5f/0x70
  [<f7974104>] snd_usbmidi_submit_urb+0x14/0x60 [snd_usbmidi_lib]
  [<f797483a>] snd_usbmidi_error_timer+0x6a/0xa0 [snd_usbmidi_lib]
  [<c02570c0>] call_timer_fn+0x30/0x130
  [<c0257442>] run_timer_softirq+0x1c2/0x260
  [<c0251493>] __do_softirq+0xc3/0x270
  [<c0204732>] do_softirq_own_stack+0x22/0x30
  [<c025186d>] irq_exit+0x8d/0xa0
  [<c0795228>] smp_apic_timer_interrupt+0x38/0x50
  [<c0794a3c>] apic_timer_interrupt+0x34/0x3c
  [<c0673d9e>] cpuidle_enter_state+0x3e/0xd0
  [<c028bb8d>] cpu_idle_loop+0x29d/0x3e0
  [<c028bd23>] cpu_startup_entry+0x53/0x60
  [<c0bfac1e>] start_kernel+0x415/0x41a

For avoiding these errors, check the pending URBs and skip
resubmitting such ones.

Reported-and-tested-by: Stefan Seyfried <stefan.seyfried@googlemail.com>
Acked-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
Under certain loads, this soft lockup has been observed:

   BUG: soft lockup - CPU#2 stuck for 22s! [ip6tables:1016]
   Modules linked in: ip6t_rpfilter ip6t_REJECT cfg80211 rfkill xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw vfat fat efivarfs xfs libcrc32c

   CPU: 2 PID: 1016 Comm: ip6tables Not tainted 3.13.0-0.rc7.30.sa2.aarch64 #1
   task: fffffe03e81d1400 ti: fffffe03f01f8000 task.ti: fffffe03f01f8000
   PC is at __cpu_flush_kern_tlb_range+0xc/0x40
   LR is at __purge_vmap_area_lazy+0x28c/0x3ac
   pc : [<fffffe000009c5cc>] lr : [<fffffe0000182710>] pstate: 80000145
   sp : fffffe03f01fbb70
   x29: fffffe03f01fbb70 x28: fffffe03f01f8000
   x27: fffffe0000b19000 x26: 00000000000000d0
   x25: 000000000000001c x24: fffffe03f01fbc50
   x23: fffffe03f01fbc58 x22: fffffe03f01fbc10
   x21: fffffe0000b2a3f8 x20: 0000000000000802
   x19: fffffe0000b2a3c8 x18: 000003fffdf52710
   x17: 000003ff9d8bb910 x16: fffffe000050fbfc
   x15: 0000000000005735 x14: 000003ff9d7e1a5c
   x13: 0000000000000000 x12: 000003ff9d7e1a5c
   x11: 0000000000000007 x10: fffffe0000c09af0
   x9 : fffffe0000ad1000 x8 : 000000000000005c
   x7 : fffffe03e8624000 x6 : 0000000000000000
   x5 : 0000000000000000 x4 : 0000000000000000
   x3 : fffffe0000c09cc8 x2 : 0000000000000000
   x1 : 000fffffdfffca80 x0 : 000fffffcd742150

The __cpu_flush_kern_tlb_range() function looks like:

  ENTRY(__cpu_flush_kern_tlb_range)
	dsb	sy
	lsr	x0, x0, #12
	lsr	x1, x1, #12
  1:	tlbi	vaae1is, x0
	add	x0, x0, #1
	cmp	x0, x1
	b.lo	1b
	dsb	sy
	isb
	ret
  ENDPROC(__cpu_flush_kern_tlb_range)

The above soft lockup shows the PC at tlbi insn with:

  x0 = 0x000fffffcd742150
  x1 = 0x000fffffdfffca80

So __cpu_flush_kern_tlb_range has 0x128ba930 tlbi flushes left
after it has already been looping for 23 seconds!.

Looking up one frame at __purge_vmap_area_lazy(), there is:

	...
	list_for_each_entry_rcu(va, &vmap_area_list, list) {
		if (va->flags & VM_LAZY_FREE) {
			if (va->va_start < *start)
				*start = va->va_start;
			if (va->va_end > *end)
				*end = va->va_end;
			nr += (va->va_end - va->va_start) >> PAGE_SHIFT;
			list_add_tail(&va->purge_list, &valist);
			va->flags |= VM_LAZY_FREEING;
			va->flags &= ~VM_LAZY_FREE;
		}
	}
	...
	if (nr || force_flush)
		flush_tlb_kernel_range(*start, *end);

So if two areas are being freed, the range passed to
flush_tlb_kernel_range() may be as large as the vmalloc
space. For arm64, this is ~240GB for 4k pagesize and ~2TB
for 64kpage size.

This patch works around this problem by adding a loop limit.
If the range is larger than the limit, use flush_tlb_all()
rather than flushing based on individual pages. The limit
chosen is arbitrary as the TLB size is implementation
specific and not accessible in an architected way. The aim
of the arbitrary limit is to avoid soft lockup.

Signed-off-by: Mark Salter <msalter@redhat.com>
[catalin.marinas@arm.com: commit log update]
[catalin.marinas@arm.com: marginal optimisation]
[catalin.marinas@arm.com: changed to MAX_TLB_RANGE and added comment]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

(cherry picked from commit 05ac653)
Signed-off-by: Mark Brown <broonie@kernel.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
[ Upstream commit 17e96834fd35997ca7cdfbf15413bcd5a36ad448 ]

Hardware always provides compliment of IP pseudo checksum. Stack expects
whole packet checksum without pseudo checksum if CHECKSUM_COMPLETE is set.

This causes checksum error in nf & ovs.

kernel: qg-19546f09-f2: hw csum failure
kernel: CPU: 9 PID: 0 Comm: swapper/9 Tainted: GF          O--------------   3.10.0-123.8.1.el7.x86_64 #1
kernel: Hardware name: Cisco Systems Inc UCSB-B200-M3/UCSB-B200-M3, BIOS B200M3.2.2.3.0.080820141339 08/08/2014
kernel: ffff881218f40000 df68243feb35e3a8 ffff881237a43ab8 ffffffff815e237b
kernel: ffff881237a43ad0 ffffffff814cd4ca ffff8829ec71eb00 ffff881237a43af0
kernel: ffffffff814c6232 0000000000000286 ffff8829ec71eb00 ffff881237a43b00
kernel: Call Trace:
kernel: <IRQ>  [<ffffffff815e237b>] dump_stack+0x19/0x1b
kernel: [<ffffffff814cd4ca>] netdev_rx_csum_fault+0x3a/0x40
kernel: [<ffffffff814c6232>] __skb_checksum_complete_head+0x62/0x70
kernel: [<ffffffff814c6251>] __skb_checksum_complete+0x11/0x20
kernel: [<ffffffff8155a20c>] nf_ip_checksum+0xcc/0x100
kernel: [<ffffffffa049edc7>] icmp_error+0x1f7/0x35c [nf_conntrack_ipv4]
kernel: [<ffffffff814cf419>] ? netif_rx+0xb9/0x1d0
kernel: [<ffffffffa040eb7b>] ? internal_dev_recv+0xdb/0x130 [openvswitch]
kernel: [<ffffffffa04c8330>] nf_conntrack_in+0xf0/0xa80 [nf_conntrack]
kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40
kernel: [<ffffffffa049e302>] ipv4_conntrack_in+0x22/0x30 [nf_conntrack_ipv4]
kernel: [<ffffffff815005ca>] nf_iterate+0xaa/0xc0
kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40
kernel: [<ffffffff81500664>] nf_hook_slow+0x84/0x140
kernel: [<ffffffff81509380>] ? inet_del_offload+0x40/0x40
kernel: [<ffffffff81509dd4>] ip_rcv+0x344/0x380

Hardware verifies IP & tcp/udp header checksum but does not provide payload
checksum, use CHECKSUM_UNNECESSARY. Set it only if its valid IP tcp/udp packet.

Cc: Jiri Benc <jbenc@redhat.com>
Cc: Stefan Assmann <sassmann@redhat.com>
Reported-by: Sunil Choudhary <schoudha@redhat.com>
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Reviewed-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
commit 2279948735609d0d17d7384e776b674619f792ef upstream.

This patches fixes an ancient bug in the dvb_usb_af9005 driver, which
has been reported at least in the following threads:
https://lkml.org/lkml/2009/2/4/350
https://lkml.org/lkml/2014/9/18/558

If the driver is compiled in without any IR support (neither
DVB_USB_AF9005_REMOTE nor custom symbols), the symbol_request calls in
af9005_usb_module_init() return pointers != NULL although the IR
symbols are not available.

This leads to the following oops:
...
[    8.529751] usbcore: registered new interface driver dvb_usb_af9005
[    8.531584] BUG: unable to handle kernel paging request at 02e00000
[    8.533385] IP: [<7d9d67c6>] af9005_usb_module_init+0x6b/0x9d
[    8.535613] *pde = 00000000
[    8.536416] Oops: 0000 [#1] PREEMPT PREEMPT DEBUG_PAGEALLOCDEBUG_PAGEALLOC
[    8.537863] CPU: 0 PID: 1 Comm: swapper Not tainted 3.15.0-rc6-00151-ga5c075c #1
[    8.539827] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[    8.541519] task: 89c9a670 ti: 89c9c000 task.ti: 89c9c000
[    8.541519] EIP: 0060:[<7d9d67c6>] EFLAGS: 00010206 CPU: 0
[    8.541519] EIP is at af9005_usb_module_init+0x6b/0x9d
[    8.541519] EAX: 02e00000 EBX: 00000000 ECX: 00000006 EDX: 00000000
[    8.541519] ESI: 00000000 EDI: 7da33ec8 EBP: 89c9df30 ESP: 89c9df2c
[    8.541519]  DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
[    8.541519] CR0: 8005003b CR2: 02e00000 CR3: 05a54000 CR4: 00000690
[    8.541519] Stack:
[    8.541519]  7d9d675b 89c9df90 7d992a49 7d7d5914 89c9df4c 7be3a800 7d08c58c 8a4c3968
[    8.541519]  89c9df80 7be3a966 00000192 00000006 00000006 7d7d3ff4 8a4c397a 00000200
[    8.541519]  7d6b1280 8a4c3979 00000006 000009a6 7da32db8 b13eec81 00000006 000009a6
[    8.541519] Call Trace:
[    8.541519]  [<7d9d675b>] ? ttusb2_driver_init+0x16/0x16
[    8.541519]  [<7d992a49>] do_one_initcall+0x77/0x106
[    8.541519]  [<7be3a800>] ? parameqn+0x2/0x35
[    8.541519]  [<7be3a966>] ? parse_args+0x113/0x25c
[    8.541519]  [<7d992bc2>] kernel_init_freeable+0xea/0x167
[    8.541519]  [<7cf01070>] kernel_init+0x8/0xb8
[    8.541519]  [<7cf27ec0>] ret_from_kernel_thread+0x20/0x30
[    8.541519]  [<7cf01068>] ? rest_init+0x10c/0x10c
[    8.541519] Code: 08 c2 c7 05 44 ed f9 7d 00 00 e0 02 c7 05 40 ed f9 7d 00 00 e0 02 c7 05 3c ed f9 7d 00 00 e0 02 75 1f b8 00 00 e0 02 85 c0 74 16 <a1> 00 00 e0 02 c7 05 54 84 8e 7d 00 00 e0 02 a3 58 84 8e 7d eb
[    8.541519] EIP: [<7d9d67c6>] af9005_usb_module_init+0x6b/0x9d SS:ESP 0068:89c9df2c
[    8.541519] CR2: 0000000002e00000
[    8.541519] ---[ end trace 768b6faf51370fc7 ]---

The prefered fix would be to convert the whole IR code to use the kernel IR
infrastructure (which wasn't available at the time this driver had been created).

Until anyone who still has this old hardware steps up an does the conversion,
fix it by not calling the symbol_request calls if the driver is compiled in
without the default IR symbols (CONFIG_DVB_USB_AF9005_REMOTE).
Due to the IR related pointers beeing NULL by default, IR support will then be disabled.

The downside of this solution is, that it will no longer be possible to
compile custom IR symbols (not using CONFIG_DVB_USB_AF9005_REMOTE) in.

Please note that this patch has NOT been tested with all possible cases.
I don't have the hardware and could only verify that it fixes the reported
bug.

Reported-by: Fengguag Wu <fengguang.wu@intel.com>
Signed-off-by: Frank Schäfer <fschaefer.oss@googlemail.com>
Acked-by: Luca Olivetti <luca@ventoso.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
commit 2228d80dd05a4fc5a410fde847677b8fb3eb23d7 upstream.

We've got a bug report at disconnecting a Webcam, where the kernel
spews warnings like below:
  WARNING: CPU: 0 PID: 8385 at ../fs/sysfs/group.c:219 sysfs_remove_group+0x87/0x90()
  sysfs group c0b2350c not found for kobject 'event3'
  CPU: 0 PID: 8385 Comm: queue2:src Not tainted 3.16.2-1.gdcee397-default #1
  Hardware name: ASUSTeK Computer INC. A7N8X-E/A7N8X-E, BIOS ASUS A7N8X-E Deluxe ACPI BIOS Rev 1013  11/12/2004
    c08d0705 ddc75cbc c0718c5b ddc75ccc c024b654 c08c6d44 ddc75ce8 000020c1
    c08d0705 000000db c03d1ec7 c03d1ec7 00000009 00000000 c0b2350c d62c9064
    ddc75cd4 c024b6a3 00000009 ddc75ccc c08c6d44 ddc75ce8 ddc75cfc c03d1ec7
  Call Trace:
    [<c0205ba6>] try_stack_unwind+0x156/0x170
    [<c02046f3>] dump_trace+0x53/0x180
    [<c0205c06>] show_trace_log_lvl+0x46/0x50
    [<c0204871>] show_stack_log_lvl+0x51/0xe0
    [<c0205c67>] show_stack+0x27/0x50
    [<c0718c5b>] dump_stack+0x3e/0x4e
    [<c024b654>] warn_slowpath_common+0x84/0xa0
    [<c024b6a3>] warn_slowpath_fmt+0x33/0x40
    [<c03d1ec7>] sysfs_remove_group+0x87/0x90
    [<c05a2c54>] device_del+0x34/0x180
    [<c05e3989>] evdev_disconnect+0x19/0x50
    [<c05e06fa>] __input_unregister_device+0x9a/0x140
    [<c05e0845>] input_unregister_device+0x45/0x80
    [<f854b1d6>] uvc_delete+0x26/0x110 [uvcvideo]
    [<f84d66f8>] v4l2_device_release+0x98/0xc0 [videodev]
    [<c05a25bb>] device_release+0x2b/0x90
    [<c04ad8bf>] kobject_cleanup+0x6f/0x1a0
    [<f84d5453>] v4l2_release+0x43/0x70 [videodev]
    [<c0372f31>] __fput+0xb1/0x1b0
    [<c02650c1>] task_work_run+0x91/0xb0
    [<c024d845>] do_exit+0x265/0x910
    [<c024df64>] do_group_exit+0x34/0xa0
    [<c025a76f>] get_signal_to_deliver+0x17f/0x590
    [<c0201b6a>] do_signal+0x3a/0x960
    [<c02024f7>] do_notify_resume+0x67/0x90
    [<c071ebb5>] work_notifysig+0x30/0x3b
    [<b7739e60>] 0xb7739e5f
   ---[ end trace b1e56095a485b631 ]---

The cause is that uvc_status_cleanup() is called after usb_put_*() in
uvc_delete().  usb_put_*() removes the sysfs parent and eventually
removes the children recursively, so the later device_del() can't find
its sysfs.  The fix is simply rearrange the call orders in
uvc_delete() so that the child is removed before the parent.

Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=897736
Reported-and-tested-by: Martin Pluskal <mpluskal@suse.com>

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
commit db93facfb0ef542aa5d8079e47580b3e669a4d82 upstream.

This patch is to fix two deadlock cases.
Deadlock 1:
CPU #1
 pinctrl_register-> pinctrl_get ->
 create_pinctrl
 (Holding lock pinctrl_maps_mutex)
 -> get_pinctrl_dev_from_devname
 (Trying to acquire lock pinctrldev_list_mutex)
CPU #0
 pinctrl_unregister
 (Holding lock pinctrldev_list_mutex)
 -> pinctrl_put ->> pinctrl_free ->
 pinctrl_dt_free_maps -> pinctrl_unregister_map
 (Trying to acquire lock pinctrl_maps_mutex)

Simply to say
CPU#1 is holding lock A and trying to acquire lock B,
CPU#0 is holding lock B and trying to acquire lock A.

Deadlock 2:
CPU #3
 pinctrl_register-> pinctrl_get ->
 create_pinctrl
 (Holding lock pinctrl_maps_mutex)
 -> get_pinctrl_dev_from_devname
 (Trying to acquire lock pinctrldev_list_mutex)
CPU #2
 pinctrl_unregister
 (Holding lock pctldev->mutex)
 -> pinctrl_put ->> pinctrl_free ->
 pinctrl_dt_free_maps -> pinctrl_unregister_map
 (Trying to acquire lock pinctrl_maps_mutex)
CPU #0
 tegra_gpio_request
 (Holding lock pinctrldev_list_mutex)
 -> pinctrl_get_device_gpio_range
 (Trying to acquire lock pctldev->mutex)

Simply to say
CPU#3 is holding lock A and trying to acquire lock D,
CPU#2 is holding lock B and trying to acquire lock A,
CPU#0 is holding lock D and trying to acquire lock B.

Signed-off-by: Jim Lin <jilin@nvidia.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
commit ce7514526742c0898b837d4395f515b79dfb5a12 upstream.

It is possible for ata_sff_flush_pio_task() to set ap->hsm_task_state to
HSM_ST_IDLE in between the time __ata_sff_port_intr() checks for HSM_ST_IDLE
and before it calls ata_sff_hsm_move() causing ata_sff_hsm_move() to BUG().

This problem is hard to reproduce making this patch hard to verify, but this
fix will prevent the race.

I have not been able to reproduce the problem, but here is a crash dump from
a 2.6.32 kernel.

On examining the ata port's state, its hsm_task_state field has a value of HSM_ST_IDLE:

crash> struct ata_port.hsm_task_state ffff881c1121c000
  hsm_task_state = 0

Normally, this should not be possible as ata_sff_hsm_move() was called from ata_sff_host_intr(),
which checks hsm_task_state and won't call ata_sff_hsm_move() if it has a HSM_ST_IDLE value.

PID: 11053  TASK: ffff8816e846cae0  CPU: 0   COMMAND: "sshd"
 #0 [ffff88008ba03960] machine_kexec at ffffffff81038f3b
 #1 [ffff88008ba039c0] crash_kexec at ffffffff810c5d92
 #2 [ffff88008ba03a90] oops_end at ffffffff8152b510
 #3 [ffff88008ba03ac0] die at ffffffff81010e0b
 #4 [ffff88008ba03af0] do_trap at ffffffff8152ad74
 #5 [ffff88008ba03b50] do_invalid_op at ffffffff8100cf95
 #6 [ffff88008ba03bf0] invalid_op at ffffffff8100bf9b
    [exception RIP: ata_sff_hsm_move+317]
    RIP: ffffffff813a77ad  RSP: ffff88008ba03ca0  RFLAGS: 00010097
    RAX: 0000000000000000  RBX: ffff881c1121dc60  RCX: 0000000000000000
    RDX: ffff881c1121dd10  RSI: ffff881c1121dc60  RDI: ffff881c1121c000
    RBP: ffff88008ba03d00   R8: 0000000000000000   R9: 000000000000002e
    R10: 000000000001003f  R11: 000000000000009b  R12: ffff881c1121c000
    R13: 0000000000000000  R14: 0000000000000050  R15: ffff881c1121dd78
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffff88008ba03d08] ata_sff_host_intr at ffffffff813a7fbd
 #8 [ffff88008ba03d38] ata_sff_interrupt at ffffffff813a821e
 #9 [ffff88008ba03d78] handle_IRQ_event at ffffffff810e6ec0
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
commit c957e8f084e0d21febcd6b8a0ea9631eccc92f36 upstream.

Once the current message is finished, the driver notifies SPI core about
this by calling spi_finalize_current_message(). This function queues next
message to be transferred. If there are more messages in the queue, it is
possible that the driver is asked to transfer the next message at this
point.

When spi_finalize_current_message() returns the driver clears the
drv_data->cur_chip pointer to NULL. The problem is that if the driver
already started the next message clearing drv_data->cur_chip will cause
NULL pointer dereference which crashes the kernel like:

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
 IP: [<ffffffffa0022bc8>] cs_deassert+0x18/0x70 [spi_pxa2xx_platform]
 PGD 78bb8067 PUD 37712067 PMD 0
 Oops: 0000 [#1] SMP
 Modules linked in:
 CPU: 1 PID: 11 Comm: ksoftirqd/1 Tainted: G           O   3.18.0-rc4-mjo #5
 Hardware name: Intel Corp. VALLEYVIEW B3 PLATFORM/NOTEBOOK, BIOS MNW2CRB1.X64.0071.R30.1408131301 08/13/2014
 task: ffff880077f9f290 ti: ffff88007a820000 task.ti: ffff88007a820000
 RIP: 0010:[<ffffffffa0022bc8>]  [<ffffffffa0022bc8>] cs_deassert+0x18/0x70 [spi_pxa2xx_platform]
 RSP: 0018:ffff88007a823d08  EFLAGS: 00010202
 RAX: 0000000000000008 RBX: ffff8800379a4430 RCX: 0000000000000026
 RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff8800379a4430
 RBP: ffff88007a823d18 R08: 00000000ffffffff R09: 000000007a9bc65a
 R10: 000000000000028f R11: 0000000000000005 R12: ffff880070123e98
 R13: ffff880070123de8 R14: 0000000000000100 R15: ffffc90004888000
 FS:  0000000000000000(0000) GS:ffff880079a80000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000000048 CR3: 000000007029b000 CR4: 00000000001007e0
 Stack:
  ffff88007a823d58 ffff8800379a4430 ffff88007a823d48 ffffffffa0022c89
  0000000000000000 ffff8800379a4430 0000000000000000 0000000000000006
  ffff88007a823da8 ffffffffa0023be0 ffff88007a823dd8 ffffffff81076204
 Call Trace:
  [<ffffffffa0022c89>] giveback+0x69/0xa0 [spi_pxa2xx_platform]
  [<ffffffffa0023be0>] pump_transfers+0x710/0x740 [spi_pxa2xx_platform]
  [<ffffffff81076204>] ? pick_next_task_fair+0x744/0x830
  [<ffffffff81049679>] tasklet_action+0xa9/0xe0
  [<ffffffff81049a0e>] __do_softirq+0xee/0x280
  [<ffffffff81049bc0>] run_ksoftirqd+0x20/0x40
  [<ffffffff810646df>] smpboot_thread_fn+0xff/0x1b0
  [<ffffffff810645e0>] ? SyS_setgroups+0x150/0x150
  [<ffffffff81060f9d>] kthread+0xcd/0xf0
  [<ffffffff81060ed0>] ? kthread_create_on_node+0x180/0x180
  [<ffffffff8187a82c>] ret_from_fork+0x7c/0xb0

Fix this by clearing drv_data->cur_chip before we call spi_finalize_current_message().

Reported-by: Martin Oldfield <m@mjoldfield.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
commit a41537e upstream.

O_DIRECT flags can be toggeled via fcntl(F_SETFL). But this value checked
twice inside ext4_file_write_iter() and __generic_file_write() which
result in BUG_ON inside ext4_direct_IO.

Let's initialize iocb->private unconditionally.

TESTCASE: xfstest:generic/036  https://patchwork.ozlabs.org/patch/402445/

#TYPICAL STACK TRACE:
kernel BUG at fs/ext4/inode.c:2960!
invalid opcode: 0000 [#1] SMP
Modules linked in: brd iTCO_wdt lpc_ich mfd_core igb ptp dm_mirror dm_region_hash dm_log dm_mod
CPU: 6 PID: 5505 Comm: aio-dio-fcntl-r Not tainted 3.17.0-rc2-00176-gff5c017 #161
Hardware name: Intel Corporation W2600CR/W2600CR, BIOS SE5C600.86B.99.99.x028.061320111235 06/13/2011
task: ffff88080e95a7c0 ti: ffff88080f908000 task.ti: ffff88080f908000
RIP: 0010:[<ffffffff811fabf2>]  [<ffffffff811fabf2>] ext4_direct_IO+0x162/0x3d0
RSP: 0018:ffff88080f90bb58  EFLAGS: 00010246
RAX: 0000000000000400 RBX: ffff88080fdb2a28 RCX: 00000000a802c818
RDX: 0000040000080000 RSI: ffff88080d8aeb80 RDI: 0000000000000001
RBP: ffff88080f90bbc8 R08: 0000000000000000 R09: 0000000000001581
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88080d8aeb80
R13: ffff88080f90bbf8 R14: ffff88080fdb28c8 R15: ffff88080fdb2a28
FS:  00007f23b2055700(0000) GS:ffff880818400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f23b2045000 CR3: 000000080cedf000 CR4: 00000000000407e0
Stack:
 ffff88080f90bb98 0000000000000000 7ffffffffffffffe ffff88080fdb2c30
 0000000000000200 0000000000000200 0000000000000001 0000000000000200
 ffff88080f90bbc8 ffff88080fdb2c30 ffff88080f90be08 0000000000000200
Call Trace:
 [<ffffffff8112ca9d>] generic_file_direct_write+0xed/0x180
 [<ffffffff8112f2b2>] __generic_file_write_iter+0x222/0x370
 [<ffffffff811f495b>] ext4_file_write_iter+0x34b/0x400
 [<ffffffff811bd709>] ? aio_run_iocb+0x239/0x410
 [<ffffffff811bd709>] ? aio_run_iocb+0x239/0x410
 [<ffffffff810990e5>] ? local_clock+0x25/0x30
 [<ffffffff810abd94>] ? __lock_acquire+0x274/0x700
 [<ffffffff811f4610>] ? ext4_unwritten_wait+0xb0/0xb0
 [<ffffffff811bd756>] aio_run_iocb+0x286/0x410
 [<ffffffff810990e5>] ? local_clock+0x25/0x30
 [<ffffffff810ac359>] ? lock_release_holdtime+0x29/0x190
 [<ffffffff811bc05b>] ? lookup_ioctx+0x4b/0xf0
 [<ffffffff811bde3b>] do_io_submit+0x55b/0x740
 [<ffffffff811bdcaa>] ? do_io_submit+0x3ca/0x740
 [<ffffffff811be030>] SyS_io_submit+0x10/0x20
 [<ffffffff815ce192>] system_call_fastpath+0x16/0x1b
Code: 01 48 8b 80 f0 01 00 00 48 8b 18 49 8b 45 10 0f 85 f1 01 00 00 48 03 45 c8 48 3b 43 48 0f 8f e3 01 00 00 49 83 7c
24 18 00 75 04 <0f> 0b eb fe f0 ff 83 ec 01 00 00 49 8b 44 24 18 8b 00 85 c0 89
RIP  [<ffffffff811fabf2>] ext4_direct_IO+0x162/0x3d0
 RSP <ffff88080f90bb58>

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
[hujianyang: Backported to 3.10
 - Move initialization of iocb->private to ext4_file_write() as we don't
   have ext4_file_write_iter(), which is introduced by commit 9b88416.
 - Adjust context to make 'overwrite' changes apply to ext4_file_dio_write()
   as ext4_file_dio_write() is not move into ext4_file_write()]
Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
[ Upstream commit 600ddd6825543962fb807884169e57b580dba208 ]

When hitting an INIT collision case during the 4WHS with AUTH enabled, as
already described in detail in commit 1be9a95 ("net: sctp: inherit
auth_capable on INIT collisions"), it can happen that we occasionally
still remotely trigger the following panic on server side which seems to
have been uncovered after the fix from commit 1be9a95 ...

[  533.876389] BUG: unable to handle kernel paging request at 00000000ffffffff
[  533.913657] IP: [<ffffffff811ac385>] __kmalloc+0x95/0x230
[  533.940559] PGD 5030f2067 PUD 0
[  533.957104] Oops: 0000 [#1] SMP
[  533.974283] Modules linked in: sctp mlx4_en [...]
[  534.939704] Call Trace:
[  534.951833]  [<ffffffff81294e30>] ? crypto_init_shash_ops+0x60/0xf0
[  534.984213]  [<ffffffff81294e30>] crypto_init_shash_ops+0x60/0xf0
[  535.015025]  [<ffffffff8128c8ed>] __crypto_alloc_tfm+0x6d/0x170
[  535.045661]  [<ffffffff8128d12c>] crypto_alloc_base+0x4c/0xb0
[  535.074593]  [<ffffffff8160bd42>] ? _raw_spin_lock_bh+0x12/0x50
[  535.105239]  [<ffffffffa0418c11>] sctp_inet_listen+0x161/0x1e0 [sctp]
[  535.138606]  [<ffffffff814e43bd>] SyS_listen+0x9d/0xb0
[  535.166848]  [<ffffffff816149a9>] system_call_fastpath+0x16/0x1b

... or depending on the the application, for example this one:

[ 1370.026490] BUG: unable to handle kernel paging request at 00000000ffffffff
[ 1370.026506] IP: [<ffffffff811ab455>] kmem_cache_alloc+0x75/0x1d0
[ 1370.054568] PGD 633c94067 PUD 0
[ 1370.070446] Oops: 0000 [#1] SMP
[ 1370.085010] Modules linked in: sctp kvm_amd kvm [...]
[ 1370.963431] Call Trace:
[ 1370.974632]  [<ffffffff8120f7cf>] ? SyS_epoll_ctl+0x53f/0x960
[ 1371.000863]  [<ffffffff8120f7cf>] SyS_epoll_ctl+0x53f/0x960
[ 1371.027154]  [<ffffffff812100d3>] ? anon_inode_getfile+0xd3/0x170
[ 1371.054679]  [<ffffffff811e3d67>] ? __alloc_fd+0xa7/0x130
[ 1371.080183]  [<ffffffff816149a9>] system_call_fastpath+0x16/0x1b

With slab debugging enabled, we can see that the poison has been overwritten:

[  669.826368] BUG kmalloc-128 (Tainted: G        W     ): Poison overwritten
[  669.826385] INFO: 0xffff880228b32e50-0xffff880228b32e50. First byte 0x6a instead of 0x6b
[  669.826414] INFO: Allocated in sctp_auth_create_key+0x23/0x50 [sctp] age=3 cpu=0 pid=18494
[  669.826424]  __slab_alloc+0x4bf/0x566
[  669.826433]  __kmalloc+0x280/0x310
[  669.826453]  sctp_auth_create_key+0x23/0x50 [sctp]
[  669.826471]  sctp_auth_asoc_create_secret+0xcb/0x1e0 [sctp]
[  669.826488]  sctp_auth_asoc_init_active_key+0x68/0xa0 [sctp]
[  669.826505]  sctp_do_sm+0x29d/0x17c0 [sctp] [...]
[  669.826629] INFO: Freed in kzfree+0x31/0x40 age=1 cpu=0 pid=18494
[  669.826635]  __slab_free+0x39/0x2a8
[  669.826643]  kfree+0x1d6/0x230
[  669.826650]  kzfree+0x31/0x40
[  669.826666]  sctp_auth_key_put+0x19/0x20 [sctp]
[  669.826681]  sctp_assoc_update+0x1ee/0x2d0 [sctp]
[  669.826695]  sctp_do_sm+0x674/0x17c0 [sctp]

Since this only triggers in some collision-cases with AUTH, the problem at
heart is that sctp_auth_key_put() on asoc->asoc_shared_key is called twice
when having refcnt 1, once directly in sctp_assoc_update() and yet again
from within sctp_auth_asoc_init_active_key() via sctp_assoc_update() on
the already kzfree'd memory, which is also consistent with the observation
of the poison decrease from 0x6b to 0x6a (note: the overwrite is detected
at a later point in time when poison is checked on new allocation).

Reference counting of auth keys revisited:

Shared keys for AUTH chunks are being stored in endpoints and associations
in endpoint_shared_keys list. On endpoint creation, a null key is being
added; on association creation, all endpoint shared keys are being cached
and thus cloned over to the association. struct sctp_shared_key only holds
a pointer to the actual key bytes, that is, struct sctp_auth_bytes which
keeps track of users internally through refcounting. Naturally, on assoc
or enpoint destruction, sctp_shared_key are being destroyed directly and
the reference on sctp_auth_bytes dropped.

User space can add keys to either list via setsockopt(2) through struct
sctp_authkey and by passing that to sctp_auth_set_key() which replaces or
adds a new auth key. There, sctp_auth_create_key() creates a new sctp_auth_bytes
with refcount 1 and in case of replacement drops the reference on the old
sctp_auth_bytes. A key can be set active from user space through setsockopt()
on the id via sctp_auth_set_active_key(), which iterates through either
endpoint_shared_keys and in case of an assoc, invokes (one of various places)
sctp_auth_asoc_init_active_key().

sctp_auth_asoc_init_active_key() computes the actual secret from local's
and peer's random, hmac and shared key parameters and returns a new key
directly as sctp_auth_bytes, that is asoc->asoc_shared_key, plus drops
the reference if there was a previous one. The secret, which where we
eventually double drop the ref comes from sctp_auth_asoc_set_secret() with
intitial refcount of 1, which also stays unchanged eventually in
sctp_assoc_update(). This key is later being used for crypto layer to
set the key for the hash in crypto_hash_setkey() from sctp_auth_calculate_hmac().

To close the loop: asoc->asoc_shared_key is freshly allocated secret
material and independant of the sctp_shared_key management keeping track
of only shared keys in endpoints and assocs. Hence, also commit 4184b2a
("net: sctp: fix memory leak in auth key management") is independant of
this bug here since it concerns a different layer (though same structures
being used eventually). asoc->asoc_shared_key is reference dropped correctly
on assoc destruction in sctp_association_free() and when active keys are
being replaced in sctp_auth_asoc_init_active_key(), it always has a refcount
of 1. Hence, it's freed prematurely in sctp_assoc_update(). Simple fix is
to remove that sctp_auth_key_put() from there which fixes these panics.

Fixes: 730fc3d ("[SCTP]: Implete SCTP-AUTH parameter processing")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
When founding a component that has orphan connections, we should
validate if it match the newly added device. If it does not match,
only then should the @still_orphan flag should be set.

The tested result as follows.
pre:
/sys/bus/coresight/devices # echo 1 > e3c42000.etb/enable_sink
/sys/bus/coresight/devices # echo 1 > e3c7c000.ptm/enable_source
[   15.527692] Unable to handle kernel NULL pointer dereference at virtual address 00000124
[   15.555142] pgd = c2294000
[   15.564226] [00000124] *pgd=3d393831, *pte=00000000, *ppte=00000000
[   15.585391] Internal error: Oops: 817 [#1] PREEMPT SMP ARM
[   15.603807] CPU: 0 PID: 144 Comm: sh Not tainted 3.17.0-rc1-12634-g1222fe0-dirty #3
[   15.629490] task: ed3803c0 ti: c213a000 task.ti: c213a000
[   15.647627] PC is at coresight_build_paths+0x1c/0x314
[   15.664579] LR is at coresight_build_paths+0x6c/0x314
[   15.681526] pc : [<c02da20c>]    lr : [<c02da25c>]    psr: 20000013
[   15.681526] sp : c213be88  ip : c02da800  fp : 00000000
[   15.720023] r10: 00000002  r9 : ed13250c  r8 : 00000001
[   15.737549] r7 : c213bee8  r6 : ffffffea  r5 : 00000000  r4 : 00000124
[   15.759446] r3 : ed216f24  r2 : 00000001  r1 : c213bee8  r0 : 00000000
[   15.781346] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user

post:
/sys/bus/coresight/devices # echo 1 > e3c42000.etb/enable_sink
/sys/bus/coresight/devices # echo 1 > e3c7c000.ptm/enable_source
[   59.934255] coresight-etb10 e3c42000.etb: ETB enabled
[   59.951317] coresight-replicator replicator0: REPLICATOR enabled
[   59.971581] coresight-funnel e3c41000.funnel: FUNNEL inport 0 enabled
[   59.993334] coresight-etm3x e3c7c000.ptm: ETM tracing enabled

Signed-off-by: Kaixu Xia <xiakaixu@huawei.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 22394bc58543639e5135f19eee2b03d14e4a9b66)
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
AndreiLux pushed a commit that referenced this issue Apr 11, 2015
commit 045c47ca306acf30c740c285a77a4b4bda6be7c5 upstream.

When reading blkio.throttle.io_serviced in a recently created blkio
cgroup, it's possible to race against the creation of a throttle policy,
which delays the allocation of stats_cpu.

Like other functions in the throttle code, just checking for a NULL
stats_cpu prevents the following oops caused by that race.

[ 1117.285199] Unable to handle kernel paging request for data at address 0x7fb4d0020
[ 1117.285252] Faulting instruction address: 0xc0000000003efa2c
[ 1137.733921] Oops: Kernel access of bad area, sig: 11 [#1]
[ 1137.733945] SMP NR_CPUS=2048 NUMA PowerNV
[ 1137.734025] Modules linked in: bridge stp llc kvm_hv kvm binfmt_misc autofs4
[ 1137.734102] CPU: 3 PID: 5302 Comm: blkcgroup Not tainted 3.19.0 #5
[ 1137.734132] task: c000000f1d188b00 ti: c000000f1d210000 task.ti: c000000f1d210000
[ 1137.734167] NIP: c0000000003efa2c LR: c0000000003ef9f0 CTR: c0000000003ef980
[ 1137.734202] REGS: c000000f1d213500 TRAP: 0300   Not tainted  (3.19.0)
[ 1137.734230] MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI>  CR: 42008884  XER: 20000000
[ 1137.734325] CFAR: 0000000000008458 DAR: 00000007fb4d0020 DSISR: 40000000 SOFTE: 0
GPR00: c0000000003ed3a0 c000000f1d213780 c000000000c59538 0000000000000000
GPR04: 0000000000000800 0000000000000000 0000000000000000 0000000000000000
GPR08: ffffffffffffffff 00000007fb4d0020 00000007fb4d0000 c000000000780808
GPR12: 0000000022000888 c00000000fdc0d80 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 000001003e120200 c000000f1d5b0cc0 0000000000000200 0000000000000000
GPR24: 0000000000000001 c000000000c269e0 0000000000000020 c000000f1d5b0c80
GPR28: c000000000ca3a08 c000000000ca3dec c000000f1c667e00 c000000f1d213850
[ 1137.734886] NIP [c0000000003efa2c] .tg_prfill_cpu_rwstat+0xac/0x180
[ 1137.734915] LR [c0000000003ef9f0] .tg_prfill_cpu_rwstat+0x70/0x180
[ 1137.734943] Call Trace:
[ 1137.734952] [c000000f1d213780] [d000000005560520] 0xd000000005560520 (unreliable)
[ 1137.734996] [c000000f1d2138a0] [c0000000003ed3a0] .blkcg_print_blkgs+0xe0/0x1a0
[ 1137.735039] [c000000f1d213960] [c0000000003efb50] .tg_print_cpu_rwstat+0x50/0x70
[ 1137.735082] [c000000f1d2139e0] [c000000000104b48] .cgroup_seqfile_show+0x58/0x150
[ 1137.735125] [c000000f1d213a70] [c0000000002749dc] .kernfs_seq_show+0x3c/0x50
[ 1137.735161] [c000000f1d213ae0] [c000000000218630] .seq_read+0xe0/0x510
[ 1137.735197] [c000000f1d213bd0] [c000000000275b04] .kernfs_fop_read+0x164/0x200
[ 1137.735240] [c000000f1d213c80] [c0000000001eb8e0] .__vfs_read+0x30/0x80
[ 1137.735276] [c000000f1d213cf0] [c0000000001eb9c4] .vfs_read+0x94/0x1b0
[ 1137.735312] [c000000f1d213d90] [c0000000001ebb38] .SyS_read+0x58/0x100
[ 1137.735349] [c000000f1d213e30] [c000000000009218] syscall_exit+0x0/0x98
[ 1137.735383] Instruction dump:
[ 1137.735405] 7c6307b4 7f891800 409d00b8 60000000 60420000 3d420004 392a63b0 786a1f24
[ 1137.735471] 7d49502a e93e01c8 7d495214 7d2ad214 <7cead02a> e9090008 e9490010 e9290018

And here is one code that allows to easily reproduce this, although this
has first been found by running docker.

void run(pid_t pid)
{
	int n;
	int status;
	int fd;
	char *buffer;
	buffer = memalign(BUFFER_ALIGN, BUFFER_SIZE);
	n = snprintf(buffer, BUFFER_SIZE, "%d\n", pid);
	fd = open(CGPATH "/test/tasks", O_WRONLY);
	write(fd, buffer, n);
	close(fd);
	if (fork() > 0) {
		fd = open("/dev/sda", O_RDONLY | O_DIRECT);
		read(fd, buffer, 512);
		close(fd);
		wait(&status);
	} else {
		fd = open(CGPATH "/test/blkio.throttle.io_serviced", O_RDONLY);
		n = read(fd, buffer, BUFFER_SIZE);
		close(fd);
	}
	free(buffer);
	exit(0);
}

void test(void)
{
	int status;
	mkdir(CGPATH "/test", 0666);
	if (fork() > 0)
		wait(&status);
	else
		run(getpid());
	rmdir(CGPATH "/test");
}

int main(int argc, char **argv)
{
	int i;
	for (i = 0; i < NR_TESTS; i++)
		test();
	return 0;
}

Reported-by: Ricardo Marin Matinata <rmm@br.ibm.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 16, 2015
[ Upstream commit b1cb59cf2efe7971d3d72a7b963d09a512d994c9 ]

sysctl has sysctl.net.core.rmem_*/wmem_* parameters which can be
set to incorrect values. Given that 'struct sk_buff' allocates from
rcvbuf, incorrectly set buffer length could result to memory
allocation failures. For example, set them as follows:

    # sysctl net.core.rmem_default=64
      net.core.wmem_default = 64
    # sysctl net.core.wmem_default=64
      net.core.wmem_default = 64
    # ping localhost -s 1024 -i 0 > /dev/null

This could result to the following failure:

skbuff: skb_over_panic: text:ffffffff81628db4 len:-32 put:-32
head:ffff88003a1cc200 data:ffff88003a1cc200 tail:0xffffffe0 end:0xc0 dev:<NULL>
kernel BUG at net/core/skbuff.c:102!
invalid opcode: 0000 [#1] SMP
...
task: ffff88003b7f5550 ti: ffff88003ae88000 task.ti: ffff88003ae88000
RIP: 0010:[<ffffffff8155fbd1>]  [<ffffffff8155fbd1>] skb_put+0xa1/0xb0
RSP: 0018:ffff88003ae8bc68  EFLAGS: 00010296
RAX: 000000000000008d RBX: 00000000ffffffe0 RCX: 0000000000000000
RDX: ffff88003fdcf598 RSI: ffff88003fdcd9c8 RDI: ffff88003fdcd9c8
RBP: ffff88003ae8bc88 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 00000000000002b2 R12: 0000000000000000
R13: 0000000000000000 R14: ffff88003d3f7300 R15: ffff88000012a900
FS:  00007fa0e2b4a840(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000d0f7e0 CR3: 000000003b8fb000 CR4: 00000000000006f0
Stack:
 ffff88003a1cc200 00000000ffffffe0 00000000000000c0 ffffffff818cab1d
 ffff88003ae8bd68 ffffffff81628db4 ffff88003ae8bd48 ffff88003b7f5550
 ffff880031a09408 ffff88003b7f5550 ffff88000012aa48 ffff88000012ab00
Call Trace:
 [<ffffffff81628db4>] unix_stream_sendmsg+0x2c4/0x470
 [<ffffffff81556f56>] sock_write_iter+0x146/0x160
 [<ffffffff811d9612>] new_sync_write+0x92/0xd0
 [<ffffffff811d9cd6>] vfs_write+0xd6/0x180
 [<ffffffff811da499>] SyS_write+0x59/0xd0
 [<ffffffff81651532>] system_call_fastpath+0x12/0x17
Code: 00 00 48 89 44 24 10 8b 87 c8 00 00 00 48 89 44 24 08 48 8b 87 d8 00
      00 00 48 c7 c7 30 db 91 81 48 89 04 24 31 c0 e8 4f a8 0e 00 <0f> 0b
      eb fe 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83
RIP  [<ffffffff8155fbd1>] skb_put+0xa1/0xb0
RSP <ffff88003ae8bc68>
Kernel panic - not syncing: Fatal exception

Moreover, the possible minimum is 1, so we can get another kernel panic:
...
BUG: unable to handle kernel paging request at ffff88013caee5c0
IP: [<ffffffff815604cf>] __alloc_skb+0x12f/0x1f0
...

Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 16, 2015
commit a28b2a47edcd0cb7c051b445f71a426000394606 upstream.

Passing zeroed drm_radeon_cs struct to DRM_IOCTL_RADEON_CS produces the
following oops.

Fix by always calling INIT_LIST_HEAD() to avoid the crash in list_sort().

----------------------------------

 #include <stdint.h>
 #include <fcntl.h>
 #include <unistd.h>
 #include <sys/ioctl.h>
 #include <drm/radeon_drm.h>

 static const struct drm_radeon_cs cs;

 int main(int argc, char **argv)
 {
         return ioctl(open(argv[1], O_RDWR), DRM_IOCTL_RADEON_CS, &cs);
 }

----------------------------------

[ttrantal@test2 ~]$ ./main /dev/dri/card0
[   46.904650] BUG: unable to handle kernel NULL pointer dereference at           (null)
[   46.905022] IP: [<ffffffff814d6df2>] list_sort+0x42/0x240
[   46.905022] PGD 68f29067 PUD 688b5067 PMD 0
[   46.905022] Oops: 0002 [#1] SMP
[   46.905022] CPU: 0 PID: 2413 Comm: main Not tainted 4.0.0-rc1+ #58
[   46.905022] Hardware name: Hewlett-Packard HP Compaq dc5750 Small Form Factor/0A64h, BIOS 786E3 v02.10 01/25/2007
[   46.905022] task: ffff880058e2bcc0 ti: ffff880058e64000 task.ti: ffff880058e64000
[   46.905022] RIP: 0010:[<ffffffff814d6df2>]  [<ffffffff814d6df2>] list_sort+0x42/0x240
[   46.905022] RSP: 0018:ffff880058e67998  EFLAGS: 00010246
[   46.905022] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[   46.905022] RDX: ffffffff81644410 RSI: ffff880058e67b40 RDI: ffff880058e67a58
[   46.905022] RBP: ffff880058e67a88 R08: 0000000000000000 R09: 0000000000000000
[   46.905022] R10: ffff880058e2bcc0 R11: ffffffff828e6ca0 R12: ffffffff81644410
[   46.905022] R13: ffff8800694b8018 R14: 0000000000000000 R15: ffff880058e679b0
[   46.905022] FS:  00007fdc65a65700(0000) GS:ffff88006d600000(0000) knlGS:0000000000000000
[   46.905022] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   46.905022] CR2: 0000000000000000 CR3: 0000000058dd9000 CR4: 00000000000006f0
[   46.905022] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   46.905022] DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
[   46.905022] Stack:
[   46.905022]  ffff880058e67b40 ffff880058e2bcc0 ffff880058e67a78 0000000000000000
[   46.905022]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   46.905022]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   46.905022] Call Trace:
[   46.905022]  [<ffffffff81644a65>] radeon_cs_parser_fini+0x195/0x220
[   46.905022]  [<ffffffff81645069>] radeon_cs_ioctl+0xa9/0x960
[   46.905022]  [<ffffffff815e1f7c>] drm_ioctl+0x19c/0x640
[   46.905022]  [<ffffffff810f8fdd>] ? trace_hardirqs_on_caller+0xfd/0x1c0
[   46.905022]  [<ffffffff810f90ad>] ? trace_hardirqs_on+0xd/0x10
[   46.905022]  [<ffffffff8160c066>] radeon_drm_ioctl+0x46/0x80
[   46.905022]  [<ffffffff81211868>] do_vfs_ioctl+0x318/0x570
[   46.905022]  [<ffffffff81462ef6>] ? selinux_file_ioctl+0x56/0x110
[   46.905022]  [<ffffffff81211b41>] SyS_ioctl+0x81/0xa0
[   46.905022]  [<ffffffff81dc6312>] system_call_fastpath+0x12/0x17
[   46.905022] Code: 48 89 b5 10 ff ff ff 0f 84 03 01 00 00 4c 8d bd 28 ff ff
ff 31 c0 48 89 fb b9 15 00 00 00 49 89 d4 4c 89 ff f3 48 ab 48 8b 46 08 <48> c7
00 00 00 00 00 48 8b 0e 48 85 c9 0f 84 7d 00 00 00 c7 85
[   46.905022] RIP  [<ffffffff814d6df2>] list_sort+0x42/0x240
[   46.905022]  RSP <ffff880058e67998>
[   46.905022] CR2: 0000000000000000
[   47.149253] ---[ end trace 09576b4e8b2c20b8 ]---

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Tommi Rantala <tt.rantala@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndreiLux pushed a commit that referenced this issue Apr 16, 2015
commit 6302ce4d80aa82b3fdb5c5cd68e7268037091b47 upstream.

This crash was reported:

[  366.947370] sd 3:0:1:0: [sdb] Spinning up disk....
[  368.804046] BUG: unable to handle kernel NULL pointer dereference at           (null)
[  368.804072] IP: [<ffffffff81358457>] __mutex_lock_common.isra.7+0x9c/0x15b
[  368.804098] PGD 0
[  368.804114] Oops: 0002 [#1] SMP
[  368.804143] CPU 1
[  368.804151] Modules linked in: sg netconsole s3g(PO) uinput joydev hid_multitouch usbhid hid snd_hda_codec_via cpufreq_userspace cpufreq_powersave cpufreq_stats uhci_hcd cpufreq_conservative snd_hda_intel snd_hda_codec snd_hwdep snd_pcm sdhci_pci snd_page_alloc sdhci snd_timer snd psmouse evdev serio_raw pcspkr soundcore xhci_hcd shpchp s3g_drm(O) mvsas mmc_core ahci libahci drm i2c_core acpi_cpufreq mperf video processor button thermal_sys dm_dmirror exfat_fs exfat_core dm_zcache dm_mod padlock_aes aes_generic padlock_sha iscsi_target_mod target_core_mod configfs sswipe libsas libata scsi_transport_sas picdev via_cputemp hwmon_vid fuse parport_pc ppdev lp parport autofs4 ext4 crc16 mbcache jbd2 sd_mod crc_t10dif usb_storage scsi_mod ehci_hcd usbcore usb_common
[  368.804749]
[  368.804764] Pid: 392, comm: kworker/u:3 Tainted: P        W  O 3.4.87-logicube-ng.22 #1 To be filled by O.E.M. To be filled by O.E.M./EPIA-M920
[  368.804802] RIP: 0010:[<ffffffff81358457>]  [<ffffffff81358457>] __mutex_lock_common.isra.7+0x9c/0x15b
[  368.804827] RSP: 0018:ffff880117001cc0  EFLAGS: 00010246
[  368.804842] RAX: 0000000000000000 RBX: ffff8801185030d0 RCX: ffff88008edcb420
[  368.804857] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff8801185030d4
[  368.804873] RBP: ffff8801181531c0 R08: 0000000000000020 R09: 00000000fffffffe
[  368.804885] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801185030d4
[  368.804899] R13: 0000000000000002 R14: ffff880117001fd8 R15: ffff8801185030d8
[  368.804916] FS:  0000000000000000(0000) GS:ffff88011fc80000(0000) knlGS:0000000000000000
[  368.804931] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  368.804946] CR2: 0000000000000000 CR3: 000000000160b000 CR4: 00000000000006e0
[  368.804962] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  368.804978] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  368.804995] Process kworker/u:3 (pid: 392, threadinfo ffff880117000000, task ffff8801181531c0)
[  368.805009] Stack:
[  368.805017]  ffff8801185030d8 0000000000000000 ffffffff8161ddf0 ffffffff81056f7c
[  368.805062]  000000000000b503 ffff8801185030d0 ffff880118503000 0000000000000000
[  368.805100]  ffff8801185030d0 ffff8801188b8000 ffff88008edcb420 ffffffff813583ac
[  368.805135] Call Trace:
[  368.805153]  [<ffffffff81056f7c>] ? up+0xb/0x33
[  368.805168]  [<ffffffff813583ac>] ? mutex_lock+0x16/0x25
[  368.805194]  [<ffffffffa018c414>] ? smp_execute_task+0x4e/0x222 [libsas]
[  368.805217]  [<ffffffffa018ce1c>] ? sas_find_bcast_dev+0x3c/0x15d [libsas]
[  368.805240]  [<ffffffffa018ce4f>] ? sas_find_bcast_dev+0x6f/0x15d [libsas]
[  368.805264]  [<ffffffffa018e989>] ? sas_ex_revalidate_domain+0x37/0x2ec [libsas]
[  368.805280]  [<ffffffff81355a2a>] ? printk+0x43/0x48
[  368.805296]  [<ffffffff81359a65>] ? _raw_spin_unlock_irqrestore+0xc/0xd
[  368.805318]  [<ffffffffa018b767>] ? sas_revalidate_domain+0x85/0xb6 [libsas]
[  368.805336]  [<ffffffff8104e5d9>] ? process_one_work+0x151/0x27c
[  368.805351]  [<ffffffff8104f6cd>] ? worker_thread+0xbb/0x152
[  368.805366]  [<ffffffff8104f612>] ? manage_workers.isra.29+0x163/0x163
[  368.805382]  [<ffffffff81052c4e>] ? kthread+0x79/0x81
[  368.805399]  [<ffffffff8135fea4>] ? kernel_thread_helper+0x4/0x10
[  368.805416]  [<ffffffff81052bd5>] ? kthread_flush_work_fn+0x9/0x9
[  368.805431]  [<ffffffff8135fea0>] ? gs_change+0x13/0x13
[  368.805442] Code: 83 7d 30 63 7e 04 f3 90 eb ab 4c 8d 63 04 4c 8d 7b 08 4c 89 e7 e8 fa 15 00 00 48 8b 43 10 4c 89 3c 24 48 89 63 10 48 89 44 24 08 <48> 89 20 83 c8 ff 48 89 6c 24 10 87 03 ff c8 74 35 4d 89 ee 41
[  368.805851] RIP  [<ffffffff81358457>] __mutex_lock_common.isra.7+0x9c/0x15b
[  368.805877]  RSP <ffff880117001cc0>
[  368.805886] CR2: 0000000000000000
[  368.805899] ---[ end trace b720682065d8f4cc ]---

It's directly caused by 89d3cf6 [SCSI] libsas: add mutex for SMP task
execution, but shows a deeper cause: expander functions expect to be able to
cast to and treat domain devices as expanders.  The correct fix is to only do
expander discover when we know we've got an expander device to avoid wrongly
casting a non-expander device.

Reported-by: Praveen Murali <pmurali@logicube.com>
Tested-by: Praveen Murali <pmurali@logicube.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants