[pull] master from torvalds:master #31

pull · 2018-12-01T20:25:08Z

See Commits and Changes for more details.

This bug was introduced in the interaction for two commits on either branch of the merge commit 562df5c ("Merge branch 'pci/host-designware' into next"). Commit 4d107d3 ("PCI: imx6: Move link up check into imx6_pcie_wait_for_link()"), changed imx6_pcie_wait_for_link() to poll the link status register directly, checking for link up and not training, and made imx6_pcie_link_up() only check the link up bit (once, not a polling loop). While commit 886bc5c ("PCI: designware: Add generic dw_pcie_wait_for_link()"), replaced the loop in imx6_pcie_wait_for_link() with a call to a new dwc core function, which polled imx6_pcie_link_up(), which still checked both link up and not training in a loop. When these two commits were merged, the version of imx6_pcie_wait_for_link() from 886bc5c was kept, which eliminated the link training check placed there by 4d107d3. However, the version of imx6_pcie_link_up() from 4d107d3 was kept, which eliminated the link training check that had been there and was moved to imx6_pcie_wait_for_link(). The result was the link training check got lost for the imx6 driver. Eliminate imx6_pcie_link_up() so that the default handler, dw_pcie_link_up(), is used instead. The default handler has the correct code, which checks for link up and also that it still is not training, fixing the regression. Fixes: 562df5c ("Merge branch 'pci/host-designware' into next") Signed-off-by: Trent Piepho <tpiepho@impinj.com> [lorenzo.pieralisi@arm.com: rewrote the commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Joao Pinto <Joao.Pinto@synopsys.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Richard Zhu <hongxing.zhu@nxp.com>

…ssor The order of parameters is not correct when invoking the outbound window disable routine. Fix it. Fixes: 4a2745d ("PCI: layerscape: Disable outbound windows configured by bootloader") Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> [lorenzo.pieralisi@arm.com: commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: stable@vger.kernel.org

Free the kobject name that was allocated for the controller device on failure rather than its parent. Fixes: d22524a ("nvme: switch controller refcounting to use struct device") Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>

Fix an error caused by 3-bit right rotation on offset address calculation of MSI-X table in dw_pcie_ep_raise_msix_irq(). The initial testing code was setting by default the offset address of MSI-X table to zero, so that even with a 3-bit right rotation the computed result would still be zero and valid, therefore this bug went unnoticed. Fixes: beb4641 ("PCI: dwc: Add MSI-X callbacks handler") Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com> [lorenzo.pieralisi@arm.com: updated commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: stable@vger.kernel.org

…equest() __nvme_fc_init_request() invokes memset() on the nvme_fcp_op_w_sgl structure, which NULLed-out the nvme_req(req)->ctrl field previously set by nvme_fc_init_request(). This apparently was not referenced until commit faf4a44fff ("nvme: support traffic based keep-alive") which now results in a crash in nvme_complete_rq(): [ 8386.897130] RIP: 0010:panic+0x220/0x26c [ 8386.901406] Code: 83 3d 6f ee 72 01 00 74 05 e8 e8 54 02 00 48 c7 c6 40 fd 5b b4 48 c7 c7 d8 8d c6 b3 31e [ 8386.922359] RSP: 0018:ffff99650019fc40 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 [ 8386.930804] RAX: 0000000000000046 RBX: 0000000000000000 RCX: 0000000000000006 [ 8386.938764] RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff8e325f8168b0 [ 8386.946725] RBP: ffff99650019fcb0 R08: 0000000000000000 R09: 00000000000004f8 [ 8386.954687] R10: 0000000000000000 R11: ffff99650019f9b8 R12: ffffffffb3c55f3c [ 8386.962648] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000001 [ 8386.970613] oops_end+0xd1/0xe0 [ 8386.974116] no_context+0x1b2/0x3c0 [ 8386.978006] do_page_fault+0x32/0x140 [ 8386.982090] page_fault+0x1e/0x30 [ 8386.985786] RIP: 0010:nvme_complete_rq+0x65/0x1d0 [nvme_core] [ 8386.992195] Code: 41 bc 03 00 00 00 74 16 0f 86 c3 00 00 00 66 3d 83 00 41 bc 06 00 00 00 0f 85 e7 00 000 [ 8387.013147] RSP: 0018:ffff99650019fe18 EFLAGS: 00010246 [ 8387.018973] RAX: 0000000000000000 RBX: ffff8e322ae51280 RCX: 0000000000000001 [ 8387.026935] RDX: 0000000000000400 RSI: 0000000000000001 RDI: ffff8e322ae51280 [ 8387.034897] RBP: ffff8e322ae51280 R08: 0000000000000000 R09: ffffffffb2f0b890 [ 8387.042859] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000 [ 8387.050821] R13: 0000000000000100 R14: 0000000000000004 R15: ffff8e2b0446d990 [ 8387.058782] ? swiotlb_unmap_page+0x40/0x40 [ 8387.063448] nvme_fc_complete_rq+0x2d/0x70 [nvme_fc] [ 8387.068986] blk_done_softirq+0xa1/0xd0 [ 8387.073264] __do_softirq+0xd6/0x2a9 [ 8387.077251] run_ksoftirqd+0x26/0x40 [ 8387.081238] smpboot_thread_fn+0x10e/0x160 [ 8387.085807] kthread+0xf8/0x130 [ 8387.089309] ? sort_range+0x20/0x20 [ 8387.093198] ? kthread_stop+0x110/0x110 [ 8387.097475] ret_from_fork+0x35/0x40 [ 8387.101462] ---[ end trace 7106b0adf5e422f8 ]--- Fixes: faf4a44fff ("nvme: support traffic based keep-alive") Signed-off-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de>

When a PCIe NVMe device is not present, nvme_dev_remove_admin() calls blk_cleanup_queue() on the admin queue, which frees the hctx for that queue. Moments later, on the same path nvme_kill_queues() calls blk_mq_unquiesce_queue() on admin queue and tries to access hctx of it, which leads to following OOPS: Oops: 0000 [#1] SMP PTI RIP: 0010:sbitmap_any_bit_set+0xb/0x40 Call Trace: blk_mq_run_hw_queue+0xd5/0x150 blk_mq_run_hw_queues+0x3a/0x50 nvme_kill_queues+0x26/0x50 nvme_remove_namespaces+0xb2/0xc0 nvme_remove+0x60/0x140 pci_device_remove+0x3b/0xb0 Fixes: cb4bfda ("nvme-pci: fix hot removal during error handling") Signed-off-by: Igor Konopko <igor.j.konopko@intel.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>

commit e259221 ("fs: simplify the generic_write_sync prototype") reworked callers of generic_write_sync(), and ended up dropping the error return for the directio path. Prior to that commit, in dio_complete(), an error would be bubbled up the stack, but after that commit, errors passed on to dio_complete were eaten up. This was reported on the list earlier, and a fix was proposed in https://lore.kernel.org/lkml/20160921141539.GA17898@infradead.org/, but never followed up with. We recently hit this bug in our testing where fencing io errors, which were previously erroring out with EIO, were being returned as success operations after this commit. The fix proposed on the list earlier was a little short -- it would have still called generic_write_sync() in case `ret` already contained an error. This fix ensures generic_write_sync() is only called when there's no pending error in the write. Additionally, transferred is replaced with ret to bring this code in line with other callers. Fixes: e259221 ("fs: simplify the generic_write_sync prototype") Reported-by: Ravi Nankani <rnankani@amazon.com> Signed-off-by: Maximilian Heyne <mheyne@amazon.de> Reviewed-by: Christoph Hellwig <hch@lst.de> CC: Torsten Mehlan <tomeh@amazon.de> CC: Uwe Dannowski <uwed@amazon.de> CC: Amit Shah <aams@amazon.de> CC: David Woodhouse <dwmw@amazon.co.uk> CC: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>

…abled Without CONFIG_NVME_MULTIPATH enabled a multi-port subsystem might show up as invididual devices and cause problems, warn about it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>

nvme_stop_ctrl can be called also for reset flow and there is no need to flush the scan_work as namespaces are not being removed. This can cause deadlock in rdma, fc and loop drivers since nvme_stop_ctrl barriers before controller teardown (and specifically I/O cancellation of the scan_work itself) takes place, but the scan_work will be blocked anyways so there is no need to flush it. Instead, move scan_work flush to nvme_remove_namespaces() where it really needs to flush. Reported-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed by: James Smart <jsmart2021@gmail.com> Tested-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>

Some error paths in configuration of admin queue free data buffer associated with async request SQE without resetting the data buffer pointer to NULL, This buffer is also freed up again if the controller is shutdown or reset. Signed-off-by: Prabhath Sajeepa <psajeepa@purestorage.com> Reviewed-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Christoph Hellwig <hch@lst.de>

There are actually two kinds of discard merge: - one is the normal discard merge, just like normal read/write request, and call it single-range discard - another is the multi-range discard, queue_max_discard_segments(rq->q) > 1 For the former case, queue_max_discard_segments(rq->q) is 1, and we should handle this kind of discard merge like the normal read/write request. This patch fixes the following kernel panic issue[1], which is caused by not removing the single-range discard request from elevator queue. Guangwu has one raid discard test case, in which this issue is a bit easier to trigger, and I verified that this patch can fix the kernel panic issue in Guangwu's test case. [1] kernel panic log from Jens's report BUG: unable to handle kernel NULL pointer dereference at 0000000000000148 PGD 0 P4D 0. Oops: 0000 [#1] SMP PTI CPU: 37 PID: 763 Comm: kworker/37:1H Not tainted \ 4.20.0-rc3-00649-ge64d9a554a91-dirty #14 Hardware name: Wiwynn \ Leopard-Orv2/Leopard-DDR BW, BIOS LBM08 03/03/2017 Workqueue: kblockd \ blk_mq_run_work_fn RIP: \ 0010:blk_mq_get_driver_tag+0x81/0x120 Code: 24 \ 10 48 89 7c 24 20 74 21 83 fa ff 0f 95 c0 48 8b 4c 24 28 65 48 33 0c 25 28 00 00 00 \ 0f 85 96 00 00 00 48 83 c4 30 5b 5d c3 <48> 8b 87 48 01 00 00 8b 40 04 39 43 20 72 37 \ f6 87 b0 00 00 00 02 RSP: 0018:ffffc90004aabd30 EFLAGS: 00010246 \ RAX: 0000000000000003 RBX: ffff888465ea1300 RCX: ffffc90004aabde8 RDX: 00000000ffffffff RSI: ffffc90004aabde8 RDI: 0000000000000000 RBP: 0000000000000000 R08: ffff888465ea1348 R09: 0000000000000000 R10: 0000000000001000 R11: 00000000ffffffff R12: ffff888465ea1300 R13: 0000000000000000 R14: ffff888465ea1348 R15: ffff888465d10000 FS: 0000000000000000(0000) GS:ffff88846f9c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000148 CR3: 000000000220a003 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: blk_mq_dispatch_rq_list+0xec/0x480 ? elv_rb_del+0x11/0x30 blk_mq_do_dispatch_sched+0x6e/0xf0 blk_mq_sched_dispatch_requests+0xfa/0x170 __blk_mq_run_hw_queue+0x5f/0xe0 process_one_work+0x154/0x350 worker_thread+0x46/0x3c0 kthread+0xf5/0x130 ? process_one_work+0x350/0x350 ? kthread_destroy_worker+0x50/0x50 ret_from_fork+0x1f/0x30 Modules linked in: sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel \ kvm switchtec irqbypass iTCO_wdt iTCO_vendor_support efivars cdc_ether usbnet mii \ cdc_acm i2c_i801 lpc_ich mfd_core ipmi_si ipmi_devintf ipmi_msghandler acpi_cpufreq \ button sch_fq_codel nfsd nfs_acl lockd grace auth_rpcgss oid_registry sunrpc nvme \ nvme_core fuse sg loop efivarfs autofs4 CR2: 0000000000000148 \ ---[ end trace 340a1fb996df1b9b ]--- RIP: 0010:blk_mq_get_driver_tag+0x81/0x120 Code: 24 10 48 89 7c 24 20 74 21 83 fa ff 0f 95 c0 48 8b 4c 24 28 65 48 33 0c 25 28 \ 00 00 00 0f 85 96 00 00 00 48 83 c4 30 5b 5d c3 <48> 8b 87 48 01 00 00 8b 40 04 39 43 \ 20 72 37 f6 87 b0 00 00 00 02 Fixes: 445251d ("blk-mq: fix discard merge with scheduler attached") Reported-by: Jens Axboe <axboe@kernel.dk> Cc: Guangwu Zhang <guazhang@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jianchao Wang <jianchao.w.wang@oracle.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Pull NVMe fixes from Christoph: "Various fixlets all over." * 'nvme-4.20' of git://git.infradead.org/nvme: nvme-rdma: fix double freeing of async event data nvme: flush namespace scanning work just before removing namespaces nvme: warn when finding multi-port subsystems without multipathing enabled nvme-pci: fix surprise removal nvme-fc: initialize nvme_req(rq)->ctrl after calling __nvme_fc_init_request() nvme: Free ctrl device name on init failure

The macros PCI_EXP_LNKCAP_SLS_*GB are values, not bit masks. We must mask the register and compare it against them. This fixes errors like this: amdgpu: [powerplay] failed to send message 261 ret is 0 when a PCIe-v3 card is plugged into a PCIe-v1 slot, because the slot is being incorrectly reported as PCIe-v3 capable. 6cf57be, which appeared in v4.17, added pcie_get_speed_cap() with the incorrect test of PCI_EXP_LNKCAP_SLS as a bitmask. 5d9a633, which appeared in v4.19, changed amdgpu to use pcie_get_speed_cap(), so the amdgpu bug reports below are regressions in v4.19. Fixes: 6cf57be ("PCI: Add pcie_get_speed_cap() to find max supported link speed") Fixes: 5d9a633 ("drm/amdgpu: use pcie functions for link width and speed") Link: https://bugs.freedesktop.org/show_bug.cgi?id=108704 Link: https://bugs.freedesktop.org/show_bug.cgi?id=108778 Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> [bhelgaas: update comment, remove use of PCI_EXP_LNKCAP_SLS_8_0GB and PCI_EXP_LNKCAP_SLS_16_0GB since those should be covered by PCI_EXP_LNKCAP2, remove test of PCI_EXP_LNKCAP for zero, since that register is required] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # v4.17+

…linus - Fix DesignWare endpoint MSI-X address calculation bug (Gustavo Pimentel) - Fix Layerscape outbound window disable usage (Hou Zhiqiang) - Fix imx6 link up detection (Trent Piepho) * lorenzo/pci/controller-fixes: PCI: dwc: Fix MSI-X EP framework address calculation bug PCI: layerscape: Fix wrong invocation of outbound window disable accessor PCI: imx6: Fix link training status detection in link up check

…ernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - Fix a link speed checking interface that broke PCIe gen3 cards in gen1 slots (Mikulas Patocka) - Fix an imx6 link training error (Trent Piepho) - Fix a layerscape outbound window accessor calling error (Hou Zhiqiang) - Fix a DesignWare endpoint MSI-X address calculation error (Gustavo Pimentel) * tag 'pci-v4.20-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: Fix incorrect value returned from pcie_get_speed_cap() PCI: dwc: Fix MSI-X EP framework address calculation bug PCI: layerscape: Fix wrong invocation of outbound window disable accessor PCI: imx6: Fix link training status detection in link up check

Pull block layer fixes from Jens Axboe: - Single range elevator discard merge fix, that caused crashes (Ming) - Fix for a regression in O_DIRECT, where we could potentially lose the error value (Maximilian Heyne) - NVMe pull request from Christoph, with little fixes all over the map for NVMe. * tag 'for-linus-20181201' of git://git.kernel.dk/linux-block: block: fix single range discard merge nvme-rdma: fix double freeing of async event data nvme: flush namespace scanning work just before removing namespaces nvme: warn when finding multi-port subsystems without multipathing enabled fs: fix lost error code in dio_complete nvme-pci: fix surprise removal nvme-fc: initialize nvme_req(rq)->ctrl after calling __nvme_fc_init_request() nvme: Free ctrl device name on init failure

When using direct commands (DCMDs) on an RK3399, we get spurious CQE completion interrupts for the DCMD transaction slot (#31): [ 931.196520] ------------[ cut here ]------------ [ 931.201702] mmc1: cqhci: spurious TCN for tag 31 [ 931.206906] WARNING: CPU: 0 PID: 1433 at /usr/src/kernel/drivers/mmc/host/cqhci.c:725 cqhci_irq+0x2e4/0x490 [ 931.206909] Modules linked in: [ 931.206918] CPU: 0 PID: 1433 Comm: irq/29-mmc1 Not tainted 4.19.8-rt6-funkadelic #1 [ 931.206920] Hardware name: Theobroma Systems RK3399-Q7 SoM (DT) [ 931.206924] pstate: 40000005 (nZcv daif -PAN -UAO) [ 931.206927] pc : cqhci_irq+0x2e4/0x490 [ 931.206931] lr : cqhci_irq+0x2e4/0x490 [ 931.206933] sp : ffff00000e54bc80 [ 931.206934] x29: ffff00000e54bc80 x28: 0000000000000000 [ 931.206939] x27: 0000000000000001 x26: ffff000008f217e8 [ 931.206944] x25: ffff8000f02ef030 x24: ffff0000091417b0 [ 931.206948] x23: ffff0000090aa000 x22: ffff8000f008b000 [ 931.206953] x21: 0000000000000002 x20: 000000000000001f [ 931.206957] x19: ffff8000f02ef018 x18: ffffffffffffffff [ 931.206961] x17: 0000000000000000 x16: 0000000000000000 [ 931.206966] x15: ffff0000090aa6c8 x14: 0720072007200720 [ 931.206970] x13: 0720072007200720 x12: 0720072007200720 [ 931.206975] x11: 0720072007200720 x10: 0720072007200720 [ 931.206980] x9 : 0720072007200720 x8 : 0720072007200720 [ 931.206984] x7 : 0720073107330720 x6 : 00000000000005a0 [ 931.206988] x5 : ffff00000860d4b0 x4 : 0000000000000000 [ 931.206993] x3 : 0000000000000001 x2 : 0000000000000001 [ 931.206997] x1 : 1bde3a91b0d4d900 x0 : 0000000000000000 [ 931.207001] Call trace: [ 931.207005] cqhci_irq+0x2e4/0x490 [ 931.207009] sdhci_arasan_cqhci_irq+0x5c/0x90 [ 931.207013] sdhci_irq+0x98/0x930 [ 931.207019] irq_forced_thread_fn+0x2c/0xa0 [ 931.207023] irq_thread+0x114/0x1c0 [ 931.207027] kthread+0x128/0x130 [ 931.207032] ret_from_fork+0x10/0x20 [ 931.207035] ---[ end trace 0000000000000002 ]--- The driver shows this message only for the first spurious interrupt by using WARN_ONCE(). Changing this to WARN() shows, that this is happening quite frequently (up to once a second). Since the eMMC 5.1 specification, where CQE and CQHCI are specified, does not mention that spurious TCN interrupts for DCMDs can be simply ignored, we must assume that using this feature is not working reliably. The current implementation uses DCMD for REQ_OP_FLUSH only, and I could not see any performance/power impact when disabling this optional feature for RK3399. Therefore this patch disables DCMDs for RK3399. Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com> Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com> Fixes: 84362d7 ("mmc: sdhci-of-arasan: Add CQHCI support for arasan,sdhci-5.1") Cc: stable@vger.kernel.org [the corresponding code changes are queued for 5.2 so doing that as well] Signed-off-by: Heiko Stuebner <heiko@sntech.de>

Setting invalid value to /sys/devices/system/cpu/cpuX/hotplug/fail can control `struct cpuhp_step *sp` address, results in the following global-out-of-bounds read. Reproducer: # echo -2 > /sys/devices/system/cpu/cpu0/hotplug/fail KASAN report: BUG: KASAN: global-out-of-bounds in write_cpuhp_fail+0x2cd/0x2e0 Read of size 8 at addr ffffffff89734438 by task bash/1941 CPU: 0 PID: 1941 Comm: bash Not tainted 5.2.0-rc6+ #31 Call Trace: write_cpuhp_fail+0x2cd/0x2e0 dev_attr_store+0x58/0x80 sysfs_kf_write+0x13d/0x1a0 kernfs_fop_write+0x2bc/0x460 vfs_write+0x1e1/0x560 ksys_write+0x126/0x250 do_syscall_64+0xc1/0x390 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f05e4f4c970 The buggy address belongs to the variable: cpu_hotplug_lock+0x98/0xa0 Memory state around the buggy address: ffffffff89734300: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 00 ffffffff89734380: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 00 >ffffffff89734400: 00 00 00 00 fa fa fa fa 00 00 00 00 fa fa fa fa ^ ffffffff89734480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffffffff89734500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Add a sanity check for the value written from user space. Fixes: 1db4948 ("smp/hotplug: Hotplug state fail injection") Signed-off-by: Eiichi Tsukata <devel@etsukata.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: peterz@infradead.org Link: https://lkml.kernel.org/r/20190627024732.31672-1-devel@etsukata.com

Original version of g920_get_config() contained two kind of actions: 1. Device specific communication to query/set some parameters which requires active communication channel with the device, or, put in other way, for the call to be sandwiched between hid_device_io_start() and hid_device_io_stop(). 2. Input subsystem specific FF controller initialization which, in order to access a valid 'struct hid_input' via 'hid->inputs.next', requires claimed hidinput which means be executed after the call to hid_hw_start() with connect_mask containing HID_CONNECT_HIDINPUT. Location of g920_get_config() can only fulfill requirements for #1 and not #2, which might result in following backtrace: [ 88.312258] logitech-hidpp-device 0003:046D:C262.0005: HID++ 4.2 device connected. [ 88.320298] BUG: kernel NULL pointer dereference, address: 0000000000000018 [ 88.320304] #PF: supervisor read access in kernel mode [ 88.320307] #PF: error_code(0x0000) - not-present page [ 88.320309] PGD 0 P4D 0 [ 88.320315] Oops: 0000 [#1] SMP PTI [ 88.320320] CPU: 1 PID: 3080 Comm: systemd-udevd Not tainted 5.4.0-rc1+ #31 [ 88.320322] Hardware name: Apple Inc. MacBookPro11,1/Mac-189A3D4F975D5FFC, BIOS 149.0.0.0.0 09/17/2018 [ 88.320334] RIP: 0010:hidpp_probe+0x61f/0x948 [hid_logitech_hidpp] [ 88.320338] Code: 81 00 00 48 89 ef e8 f0 d6 ff ff 41 89 c6 85 c0 75 b5 0f b6 44 24 28 48 8b 5d 00 88 44 24 1e 89 44 24 0c 48 8b 83 18 1c 00 00 <48> 8b 48 18 48 8b 83 10 19 00 00 48 8b 40 40 48 89 0c 24 0f b7 80 [ 88.320341] RSP: 0018:ffffb0a6824aba68 EFLAGS: 00010246 [ 88.320345] RAX: 0000000000000000 RBX: ffff93a50756e000 RCX: 0000000000010408 [ 88.320347] RDX: 0000000000000000 RSI: ffff93a51f0ad0a0 RDI: 000000000002d0a0 [ 88.320350] RBP: ffff93a50416da28 R08: ffff93a50416da70 R09: ffff93a50416da70 [ 88.320352] R10: 000000148ae9e60c R11: 00000000000f1525 R12: ffff93a50756e000 [ 88.320354] R13: ffff93a50756f8d0 R14: 0000000000000000 R15: ffff93a50756fc38 [ 88.320358] FS: 00007f8d8c1e0940(0000) GS:ffff93a51f080000(0000) knlGS:0000000000000000 [ 88.320361] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 88.320363] CR2: 0000000000000018 CR3: 00000003996d8003 CR4: 00000000001606e0 [ 88.320366] Call Trace: [ 88.320377] ? _cond_resched+0x15/0x30 [ 88.320387] ? create_pinctrl+0x2f/0x3c0 [ 88.320393] ? kernfs_link_sibling+0x94/0xe0 [ 88.320398] ? _cond_resched+0x15/0x30 [ 88.320402] ? kernfs_activate+0x5f/0x80 [ 88.320406] ? kernfs_add_one+0xe2/0x130 [ 88.320411] hid_device_probe+0x106/0x170 [ 88.320419] really_probe+0x147/0x3c0 [ 88.320424] driver_probe_device+0xb6/0x100 [ 88.320428] device_driver_attach+0x53/0x60 [ 88.320433] __driver_attach+0x8a/0x150 [ 88.320437] ? device_driver_attach+0x60/0x60 [ 88.320440] bus_for_each_dev+0x78/0xc0 [ 88.320445] bus_add_driver+0x14d/0x1f0 [ 88.320450] driver_register+0x6c/0xc0 [ 88.320453] ? 0xffffffffc0d67000 [ 88.320457] __hid_register_driver+0x4c/0x80 [ 88.320464] do_one_initcall+0x46/0x1f4 [ 88.320469] ? _cond_resched+0x15/0x30 [ 88.320474] ? kmem_cache_alloc_trace+0x162/0x220 [ 88.320481] ? do_init_module+0x23/0x230 [ 88.320486] do_init_module+0x5c/0x230 [ 88.320491] load_module+0x26e1/0x2990 [ 88.320502] ? ima_post_read_file+0xf0/0x100 [ 88.320508] ? __do_sys_finit_module+0xaa/0x110 [ 88.320512] __do_sys_finit_module+0xaa/0x110 [ 88.320520] do_syscall_64+0x5b/0x180 [ 88.320525] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 88.320528] RIP: 0033:0x7f8d8d1f01fd [ 88.320532] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 5b 8c 0c 00 f7 d8 64 89 01 48 [ 88.320535] RSP: 002b:00007ffefa3bb068 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 88.320539] RAX: ffffffffffffffda RBX: 000055922040cb40 RCX: 00007f8d8d1f01fd [ 88.320541] RDX: 0000000000000000 RSI: 00007f8d8ce4984d RDI: 0000000000000006 [ 88.320543] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000007 [ 88.320545] R10: 0000000000000006 R11: 0000000000000246 R12: 00007f8d8ce4984d [ 88.320547] R13: 0000000000000000 R14: 000055922040efc0 R15: 000055922040cb40 [ 88.320551] Modules linked in: hid_logitech_hidpp(+) fuse rfcomm ccm xt_CHECKSUM xt_MASQUERADE bridge stp llc nf_nat_tftp nf_conntrack_tftp nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat tun iptable_mangle iptable_raw iptable_security nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables cmac bnep sunrpc dm_crypt nls_utf8 hfsplus intel_rapl_msr intel_rapl_common ath9k_htc ath9k_common x86_pkg_temp_thermal intel_powerclamp b43 ath9k_hw coretemp snd_hda_codec_hdmi cordic kvm_intel snd_hda_codec_cirrus mac80211 snd_hda_codec_generic ledtrig_audio kvm snd_hda_intel snd_intel_nhlt irqbypass snd_hda_codec btusb btrtl snd_hda_core ath btbcm ssb snd_hwdep btintel snd_seq crct10dif_pclmul iTCO_wdt snd_seq_device crc32_pclmul bluetooth mmc_core iTCO_vendor_support joydev cfg80211 [ 88.320602] applesmc ghash_clmulni_intel ecdh_generic snd_pcm input_polldev intel_cstate ecc intel_uncore thunderbolt snd_timer i2c_i801 libarc4 rfkill intel_rapl_perf lpc_ich mei_me pcspkr bcm5974 snd bcma mei soundcore acpi_als sbs kfifo_buf sbshc industrialio apple_bl i915 i2c_algo_bit drm_kms_helper drm uas crc32c_intel usb_storage video hid_apple [ 88.320630] CR2: 0000000000000018 [ 88.320633] ---[ end trace 933491c8a4fadeb7 ]--- [ 88.320642] RIP: 0010:hidpp_probe+0x61f/0x948 [hid_logitech_hidpp] [ 88.320645] Code: 81 00 00 48 89 ef e8 f0 d6 ff ff 41 89 c6 85 c0 75 b5 0f b6 44 24 28 48 8b 5d 00 88 44 24 1e 89 44 24 0c 48 8b 83 18 1c 00 00 <48> 8b 48 18 48 8b 83 10 19 00 00 48 8b 40 40 48 89 0c 24 0f b7 80 [ 88.320647] RSP: 0018:ffffb0a6824aba68 EFLAGS: 00010246 [ 88.320650] RAX: 0000000000000000 RBX: ffff93a50756e000 RCX: 0000000000010408 [ 88.320652] RDX: 0000000000000000 RSI: ffff93a51f0ad0a0 RDI: 000000000002d0a0 [ 88.320655] RBP: ffff93a50416da28 R08: ffff93a50416da70 R09: ffff93a50416da70 [ 88.320657] R10: 000000148ae9e60c R11: 00000000000f1525 R12: ffff93a50756e000 [ 88.320659] R13: ffff93a50756f8d0 R14: 0000000000000000 R15: ffff93a50756fc38 [ 88.320662] FS: 00007f8d8c1e0940(0000) GS:ffff93a51f080000(0000) knlGS:0000000000000000 [ 88.320664] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 88.320667] CR2: 0000000000000018 CR3: 00000003996d8003 CR4: 00000000001606e0 To solve this issue: 1. Split g920_get_config() such that all of the device specific communication remains a part of the function and input subsystem initialization bits go to hidpp_ff_init() 2. Move call to hidpp_ff_init() from being a part of g920_get_config() to be the last step of .probe(), right after a call to hid_hw_start() with connect_mask containing HID_CONNECT_HIDINPUT. Fixes: 91cf9a9 ("HID: logitech-hidpp: make .probe usbhid capable") Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com> Tested-by: Sam Bazley <sambazley@fastmail.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com> Cc: Henrik Rydberg <rydberg@bitmath.org> Cc: Pierre-Loup A. Griffais <pgriffais@valvesoftware.com> Cc: Austin Palmer <austinp@valvesoftware.com> Cc: linux-input@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org # 5.2+ Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>

Due to some hardware issues, queue 31 isn't usable on devices that have 32 queues (7000, 8000, 9000 families), which is correctly reflected in the configuration and TX queue initialization. However, the firmware API and queue allocation code assumes that there are 32 queues, and if something actually attempts to use #31 this leads to a NULL-pointer dereference since it's not allocated. Fix this by limiting to 31 in the IWL_MVM_DQA_MAX_DATA_QUEUE, and also add some code to catch this earlier in the future, if the configuration changes perhaps. Cc: stable@vger.kernel.org # v4.9+ Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/iwlwifi.20200417100405.98a79be2db6a.I3a4af6b03b87a6bc18db9b1ff9a812f397bee1fc@changeid

The following deadlock was captured. The first process is holding 'kernfs_mutex' and hung by io. The io was staging in 'r1conf.pending_bio_list' of raid1 device, this pending bio list would be flushed by second process 'md127_raid1', but it was hung by 'kernfs_mutex'. Using sysfs_notify_dirent_safe() to replace sysfs_notify() can fix it. There were other sysfs_notify() invoked from io path, removed all of them. PID: 40430 TASK: ffff8ee9c8c65c40 CPU: 29 COMMAND: "probe_file" #0 [ffffb87c4df37260] __schedule at ffffffff9a8678ec #1 [ffffb87c4df372f8] schedule at ffffffff9a867f06 #2 [ffffb87c4df37310] io_schedule at ffffffff9a0c73e6 #3 [ffffb87c4df37328] __dta___xfs_iunpin_wait_3443 at ffffffffc03a4057 [xfs] #4 [ffffb87c4df373a0] xfs_iunpin_wait at ffffffffc03a6c79 [xfs] #5 [ffffb87c4df373b0] __dta_xfs_reclaim_inode_3357 at ffffffffc039a46c [xfs] #6 [ffffb87c4df37400] xfs_reclaim_inodes_ag at ffffffffc039a8b6 [xfs] #7 [ffffb87c4df37590] xfs_reclaim_inodes_nr at ffffffffc039bb33 [xfs] #8 [ffffb87c4df375b0] xfs_fs_free_cached_objects at ffffffffc03af0e9 [xfs] #9 [ffffb87c4df375c0] super_cache_scan at ffffffff9a287ec7 #10 [ffffb87c4df37618] shrink_slab at ffffffff9a1efd93 #11 [ffffb87c4df37700] shrink_node at ffffffff9a1f5968 #12 [ffffb87c4df37788] do_try_to_free_pages at ffffffff9a1f5ea2 #13 [ffffb87c4df377f0] try_to_free_mem_cgroup_pages at ffffffff9a1f6445 #14 [ffffb87c4df37880] try_charge at ffffffff9a26cc5f #15 [ffffb87c4df37920] memcg_kmem_charge_memcg at ffffffff9a270f6a #16 [ffffb87c4df37958] new_slab at ffffffff9a251430 #17 [ffffb87c4df379c0] ___slab_alloc at ffffffff9a251c85 #18 [ffffb87c4df37a80] __slab_alloc at ffffffff9a25635d #19 [ffffb87c4df37ac0] kmem_cache_alloc at ffffffff9a251f89 #20 [ffffb87c4df37b00] alloc_inode at ffffffff9a2a2b10 #21 [ffffb87c4df37b20] iget_locked at ffffffff9a2a4854 #22 [ffffb87c4df37b60] kernfs_get_inode at ffffffff9a311377 #23 [ffffb87c4df37b80] kernfs_iop_lookup at ffffffff9a311e2b #24 [ffffb87c4df37ba8] lookup_slow at ffffffff9a290118 #25 [ffffb87c4df37c10] walk_component at ffffffff9a291e83 #26 [ffffb87c4df37c78] path_lookupat at ffffffff9a293619 #27 [ffffb87c4df37cd8] filename_lookup at ffffffff9a2953af #28 [ffffb87c4df37de8] user_path_at_empty at ffffffff9a295566 #29 [ffffb87c4df37e10] vfs_statx at ffffffff9a289787 #30 [ffffb87c4df37e70] SYSC_newlstat at ffffffff9a289d5d #31 [ffffb87c4df37f18] sys_newlstat at ffffffff9a28a60e #32 [ffffb87c4df37f28] do_syscall_64 at ffffffff9a003949 #33 [ffffb87c4df37f50] entry_SYSCALL_64_after_hwframe at ffffffff9aa001ad RIP: 00007f617a5f2905 RSP: 00007f607334f838 RFLAGS: 00000246 RAX: ffffffffffffffda RBX: 00007f6064044b20 RCX: 00007f617a5f2905 RDX: 00007f6064044b20 RSI: 00007f6064044b20 RDI: 00007f6064005890 RBP: 00007f6064044aa0 R8: 0000000000000030 R9: 000000000000011c R10: 0000000000000013 R11: 0000000000000246 R12: 00007f606417e6d0 R13: 00007f6064044aa0 R14: 00007f6064044b10 R15: 00000000ffffffff ORIG_RAX: 0000000000000006 CS: 0033 SS: 002b PID: 927 TASK: ffff8f15ac5dbd80 CPU: 42 COMMAND: "md127_raid1" #0 [ffffb87c4df07b28] __schedule at ffffffff9a8678ec #1 [ffffb87c4df07bc0] schedule at ffffffff9a867f06 #2 [ffffb87c4df07bd8] schedule_preempt_disabled at ffffffff9a86825e #3 [ffffb87c4df07be8] __mutex_lock at ffffffff9a869bcc #4 [ffffb87c4df07ca0] __mutex_lock_slowpath at ffffffff9a86a013 #5 [ffffb87c4df07cb0] mutex_lock at ffffffff9a86a04f #6 [ffffb87c4df07cc8] kernfs_find_and_get_ns at ffffffff9a311d83 #7 [ffffb87c4df07cf0] sysfs_notify at ffffffff9a314b3a #8 [ffffb87c4df07d18] md_update_sb at ffffffff9a688696 #9 [ffffb87c4df07d98] md_update_sb at ffffffff9a6886d5 #10 [ffffb87c4df07da8] md_check_recovery at ffffffff9a68ad9c #11 [ffffb87c4df07dd0] raid1d at ffffffffc01f0375 [raid1] #12 [ffffb87c4df07ea0] md_thread at ffffffff9a680348 #13 [ffffb87c4df07f08] kthread at ffffffff9a0b8005 #14 [ffffb87c4df07f50] ret_from_fork at ffffffff9aa00344 Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Song Liu <songliubraving@fb.com>

…_kfree The mhu_db_channel info is allocated per channel using devm_kzalloc from mhu_db_mbox_xlate which gets called from mbox_request_channel. However we are releasing the allocated mhu_db_channel info using plain kfree from mhu_db_shutdown which is called from mbox_free_channel. This leads to random crashes when the channel is freed like below one: Unable to handle kernel paging request at virtual address 0080000400000008 [0080000400000008] address between user and kernel address ranges Internal error: Oops: 96000044 [#1] PREEMPT SMP Modules linked in: scmi_module(-) CPU: 1 PID: 2212 Comm: rmmod Not tainted 5.10.0-rc5 #31 Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Nov 19 2020 pstate: 20000085 (nzCv daIf -PAN -UAO -TCO BTYPE=--) pc : release_nodes+0x74/0x230 lr : devres_release_all+0x40/0x68 Call trace: release_nodes+0x74/0x230 devres_release_all+0x40/0x68 device_release_driver_internal+0x12c/0x1f8 driver_detach+0x58/0xe8 bus_remove_driver+0x64/0xe0 driver_unregister+0x38/0x68 platform_driver_unregister+0x1c/0x28 scmi_driver_exit+0x38/0x44 [scmi_module] __arm64_sys_delete_module+0x188/0x260 el0_svc_common.constprop.0+0x80/0x1a8 do_el0_svc+0x2c/0x98 el0_sync_handler+0x160/0x168 el0_sync+0x174/0x180 Code: 1400000d eb07009f 54000460 f9400486 (f90004a6) ---[ end trace c55ffd306c140233 ]--- Fix it by replacing kfree with devm_kfree as required. Fixes: 7002ca2 ("mailbox: arm_mhu: Add ARM MHU doorbell driver") Reported-by: Cristian Marussi <cristian.marussi@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Tested-by: Cristian Marussi <cristian.marussi@arm.com> Reviewed-by: Cristian Marussi <cristian.marussi@arm.com> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>

…ext bug With lockdep enabled, we will get following warning: ar9331_switch ethernet.1:10 lan0 (uninitialized): PHY [!ahb!ethernet@1a000000!mdio!switch@10:00] driver [Qualcomm Atheros AR9331 built-in PHY] (irq=13) BUG: sleeping function called from invalid context at kernel/locking/mutex.c:935 in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 18, name: kworker/0:1 INFO: lockdep is turned off. irq event stamp: 602 hardirqs last enabled at (601): [<8073fde0>] _raw_spin_unlock_irq+0x3c/0x80 hardirqs last disabled at (602): [<8073a4f4>] __schedule+0x184/0x800 softirqs last enabled at (0): [<80080f60>] copy_process+0x578/0x14c8 softirqs last disabled at (0): [<00000000>] 0x0 CPU: 0 PID: 18 Comm: kworker/0:1 Not tainted 5.10.0-rc3-ar9331-00734-g7d644991df0c #31 Workqueue: events deferred_probe_work_func Stack : 80980000 80980000 8089ef70 80890000 804b5414 80980000 00000002 80b53728 00000000 800d1268 804b5414 ffffffde 00000017 800afe08 81943860 0f5bfc32 00000000 00000000 8089ef70 819436c0 ffffffea 00000000 00000000 00000000 8194390c 808e353c 0000000f 66657272 80980000 00000000 00000000 80890000 804b5414 80980000 00000002 80b53728 00000000 00000000 00000000 80d40000 ... Call Trace: [<80069ce0>] show_stack+0x9c/0x140 [<800afe08>] ___might_sleep+0x220/0x244 [<8073bfb0>] __mutex_lock+0x70/0x374 [<8073c2e0>] mutex_lock_nested+0x2c/0x38 [<804b5414>] regmap_update_bits_base+0x38/0x8c [<804ee584>] regmap_update_bits+0x1c/0x28 [<804ee714>] ar9331_sw_unmask_irq+0x34/0x60 [<800d91f0>] unmask_irq+0x48/0x70 [<800d93d4>] irq_startup+0x114/0x11c [<800d65b4>] __setup_irq+0x4f4/0x6d0 [<800d68a0>] request_threaded_irq+0x110/0x190 [<804e3ef0>] phy_request_interrupt+0x4c/0xe4 [<804df508>] phylink_bringup_phy+0x2c0/0x37c [<804df7bc>] phylink_of_phy_connect+0x118/0x130 [<806c1a64>] dsa_slave_create+0x3d0/0x578 [<806bc4ec>] dsa_register_switch+0x934/0xa20 [<804eef98>] ar9331_sw_probe+0x34c/0x364 [<804eb48c>] mdio_probe+0x44/0x70 [<8049e3b4>] really_probe+0x30c/0x4f4 [<8049ea10>] driver_probe_device+0x264/0x26c [<8049bc10>] bus_for_each_drv+0xb4/0xd8 [<8049e684>] __device_attach+0xe8/0x18c [<8049ce58>] bus_probe_device+0x48/0xc4 [<8049db70>] deferred_probe_work_func+0xdc/0xf8 [<8009ff64>] process_one_work+0x2e4/0x4a0 [<800a0770>] worker_thread+0x2a8/0x354 [<800a774c>] kthread+0x16c/0x174 [<8006306c>] ret_from_kernel_thread+0x14/0x1c ar9331_switch ethernet.1:10 lan1 (uninitialized): PHY [!ahb!ethernet@1a000000!mdio!switch@10:02] driver [Qualcomm Atheros AR9331 built-in PHY] (irq=13) DSA: tree 0 setup To fix it, it is better to move access to MDIO register to the .irq_bus_sync_unlock call back. Fixes: ec6698c ("net: dsa: add support for Atheros AR9331 built-in switch") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20201211110317.17061-1-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Trent Piepho and others added 16 commits November 20, 2018 11:09

pull bot added the ⤵️ pull label Dec 1, 2018

pull bot merged commit 8805841 into lokeshbv:master Dec 1, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[pull] master from torvalds:master #31

[pull] master from torvalds:master #31

pull bot commented Dec 1, 2018 •

edited

Loading

[pull] master from torvalds:master #31

[pull] master from torvalds:master #31

Conversation

pull bot commented Dec 1, 2018 • edited Loading

pull bot commented Dec 1, 2018 •

edited

Loading