-
Notifications
You must be signed in to change notification settings - Fork 132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ftgmac is still not detecting link if cable plugged in after BMC is booted #61
Comments
Did we issue below commands after step (8) ? By the way, who from IRC channel I can get more info?
|
|
No, when the cable is plugged, the network interface won't be up automatically. When MAC has NCSI enabled, we don't have a PHY attached. That means we don't have a thread monitoring the link status and wake up the MAC when link is ready. With NCSI, we have to issue "ifconifg ethx down && ifconfig eth0 up" after the cable is plugged. The problem here is the above command doesn't help after cable is plugged. |
"The problem here is the above command doesn't help after cable is plugged." NC-SI specifies an async event notification for things like link state change. Please see section 6.7 in: http://www.dmtf.org/sites/default/files/standards/documents/DSP0222_1.0.1.pdf. If that doesn't work, it is not unusual to poll for link status from what I understand. |
Yeah, you're correct that the AEN packet will be sent upon link-down event. The NCSI stack will try to choose another available channel if existing. Otherwise, the NCSI interface is down and I guess it's the case. When NCSI stack receives AEN on link-up event, it tries to choose an available channel for communication if we don't have one yet. I'll try to reproduce this issue on local BMC tomorrow or a bit later... |
What is the status of this? |
I don't have a chance to reproduce it locally. I will try it later of this week or early of next week. |
Customer reported same issue today. Any progress on recreating? |
Not yet, I'll continue on this next Monday when sitting around the BMC :-) |
I'm using "dev-4.3" branch as the base source code to debug the issue. Also the hardware platform is Palmetto. The experiment I did is as below and it should match with the experiment that James did.
One thing I observed from James saw is: Reconfiguring the network interface after plugging the network cable will bring the network back. I also added some code to print the AEN packet received on plugging or unplugging the network cable. However, I never see a AEN packet received. |
So what next? |
@gwshan it sounds like you are expecting to receive an AEN packet but aren't? Does the device driver rely on getting this packet? If so we should try and debug why you aren't getting them. |
Yes, I expect to receive AEN indicating link change on the active channel and I don't receive it when unplugging the ethernet cable. I added some printk() in the NCSI stack and print the received AEN packet. I don't see it. |
Do you get an AEN when the link drops? |
@gwshan Did you try wireshark to check the packet? |
@kenliu when the NCSI interface is down, no frames will be received on the In current design, the NCSI and other (like ARP/IP) packets are transmitted Another thing I'm not satisfied with current NCSI stack is: all available 2016-04-27 12:29 GMT+10:00 KenLiu notifications@github.com:
|
@nkskjames If we can get some info from Broadcom about the absence of that AEN message, that'd be helpful for us. |
Hi @gwshan and @jk-ozlabs |
Hi Ken, Thanks for checking with broadcom on this. It sounds the case. I However, the statement you had isn't always true according to the 2016-05-06 22:37 GMT+10:00 KenLiu notifications@github.com:
|
Thanks again for Ken's feedback from Broadcom. I added some code in the 2016-05-07 0:12 GMT+10:00 Shan Gavin shan.gavin@gmail.com:
|
A pachset sent to maillist for review. The issue is fixed by it. I had below testing with the patchset on palm4-bmc. Everything looks good:
|
I think this is resolved. Please re-open if it happens again. |
Race condition between registering an I2C device driver and deregistering an I2C adapter device which is assumed to manage that I2C device may lead to a NULL pointer dereference due to the uninitialized list head of driver clients. The root cause of the issue is that the I2C bus may know about the registered device driver and thus it is matched by bus_for_each_drv(), but the list of clients is not initialized and commonly it is NULL, because I2C device drivers define struct i2c_driver as static and clients field is expected to be initialized by I2C core: i2c_register_driver() i2c_del_adapter() driver_register() ... bus_add_driver() ... ... bus_for_each_drv(..., __process_removed_adapter) ... i2c_do_del_adapter() ... list_for_each_entry_safe(..., &driver->clients, ...) INIT_LIST_HEAD(&driver->clients); To solve the problem it is sufficient to do clients list head initialization before calling driver_register(). The problem was found while using an I2C device driver with a sluggish registration routine on a bus provided by a physically detachable I2C master controller, but practically the oops may be reproduced under the race between arbitraty I2C device driver registration and managing I2C bus device removal e.g. by unbinding the latter over sysfs: % echo 21a4000.i2c > /sys/bus/platform/drivers/imx-i2c/unbind Unable to handle kernel NULL pointer dereference at virtual address 00000000 Internal error: Oops: 17 [openbmc#1] SMP ARM CPU: 2 PID: 533 Comm: sh Not tainted 4.9.0-rc3+ openbmc#61 Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) task: e5ada400 task.stack: e4936000 PC is at i2c_do_del_adapter+0x20/0xcc LR is at __process_removed_adapter+0x14/0x1c Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none Control: 10c5387d Table: 35bd004a DAC: 00000051 Process sh (pid: 533, stack limit = 0xe4936210) Stack: (0xe4937d28 to 0xe4938000) Backtrace: [<c0667be0>] (i2c_do_del_adapter) from [<c0667cc0>] (__process_removed_adapter+0x14/0x1c) [<c0667cac>] (__process_removed_adapter) from [<c0516998>] (bus_for_each_drv+0x6c/0xa0) [<c051692c>] (bus_for_each_drv) from [<c06685ec>] (i2c_del_adapter+0xbc/0x284) [<c0668530>] (i2c_del_adapter) from [<bf0110ec>] (i2c_imx_remove+0x44/0x164 [i2c_imx]) [<bf0110a8>] (i2c_imx_remove [i2c_imx]) from [<c051a838>] (platform_drv_remove+0x2c/0x44) [<c051a80c>] (platform_drv_remove) from [<c05183d8>] (__device_release_driver+0x90/0x12c) [<c0518348>] (__device_release_driver) from [<c051849c>] (device_release_driver+0x28/0x34) [<c0518474>] (device_release_driver) from [<c0517150>] (unbind_store+0x80/0x104) [<c05170d0>] (unbind_store) from [<c0516520>] (drv_attr_store+0x28/0x34) [<c05164f8>] (drv_attr_store) from [<c0298acc>] (sysfs_kf_write+0x50/0x54) [<c0298a7c>] (sysfs_kf_write) from [<c029801c>] (kernfs_fop_write+0x100/0x214) [<c0297f1c>] (kernfs_fop_write) from [<c0220130>] (__vfs_write+0x34/0x120) [<c02200fc>] (__vfs_write) from [<c0221088>] (vfs_write+0xa8/0x170) [<c0220fe0>] (vfs_write) from [<c0221e74>] (SyS_write+0x4c/0xa8) [<c0221e28>] (SyS_write) from [<c0108a20>] (ret_fast_syscall+0x0/0x1c) Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Cc: stable@kernel.org
commit 147b36d upstream. Race condition between registering an I2C device driver and deregistering an I2C adapter device which is assumed to manage that I2C device may lead to a NULL pointer dereference due to the uninitialized list head of driver clients. The root cause of the issue is that the I2C bus may know about the registered device driver and thus it is matched by bus_for_each_drv(), but the list of clients is not initialized and commonly it is NULL, because I2C device drivers define struct i2c_driver as static and clients field is expected to be initialized by I2C core: i2c_register_driver() i2c_del_adapter() driver_register() ... bus_add_driver() ... ... bus_for_each_drv(..., __process_removed_adapter) ... i2c_do_del_adapter() ... list_for_each_entry_safe(..., &driver->clients, ...) INIT_LIST_HEAD(&driver->clients); To solve the problem it is sufficient to do clients list head initialization before calling driver_register(). The problem was found while using an I2C device driver with a sluggish registration routine on a bus provided by a physically detachable I2C master controller, but practically the oops may be reproduced under the race between arbitraty I2C device driver registration and managing I2C bus device removal e.g. by unbinding the latter over sysfs: % echo 21a4000.i2c > /sys/bus/platform/drivers/imx-i2c/unbind Unable to handle kernel NULL pointer dereference at virtual address 00000000 Internal error: Oops: 17 [#1] SMP ARM CPU: 2 PID: 533 Comm: sh Not tainted 4.9.0-rc3+ #61 Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) task: e5ada400 task.stack: e4936000 PC is at i2c_do_del_adapter+0x20/0xcc LR is at __process_removed_adapter+0x14/0x1c Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none Control: 10c5387d Table: 35bd004a DAC: 00000051 Process sh (pid: 533, stack limit = 0xe4936210) Stack: (0xe4937d28 to 0xe4938000) Backtrace: [<c0667be0>] (i2c_do_del_adapter) from [<c0667cc0>] (__process_removed_adapter+0x14/0x1c) [<c0667cac>] (__process_removed_adapter) from [<c0516998>] (bus_for_each_drv+0x6c/0xa0) [<c051692c>] (bus_for_each_drv) from [<c06685ec>] (i2c_del_adapter+0xbc/0x284) [<c0668530>] (i2c_del_adapter) from [<bf0110ec>] (i2c_imx_remove+0x44/0x164 [i2c_imx]) [<bf0110a8>] (i2c_imx_remove [i2c_imx]) from [<c051a838>] (platform_drv_remove+0x2c/0x44) [<c051a80c>] (platform_drv_remove) from [<c05183d8>] (__device_release_driver+0x90/0x12c) [<c0518348>] (__device_release_driver) from [<c051849c>] (device_release_driver+0x28/0x34) [<c0518474>] (device_release_driver) from [<c0517150>] (unbind_store+0x80/0x104) [<c05170d0>] (unbind_store) from [<c0516520>] (drv_attr_store+0x28/0x34) [<c05164f8>] (drv_attr_store) from [<c0298acc>] (sysfs_kf_write+0x50/0x54) [<c0298a7c>] (sysfs_kf_write) from [<c029801c>] (kernfs_fop_write+0x100/0x214) [<c0297f1c>] (kernfs_fop_write) from [<c0220130>] (__vfs_write+0x34/0x120) [<c02200fc>] (__vfs_write) from [<c0221088>] (vfs_write+0xa8/0x170) [<c0220fe0>] (vfs_write) from [<c0221e74>] (SyS_write+0x4c/0xa8) [<c0221e28>] (SyS_write) from [<c0108a20>] (ret_fast_syscall+0x0/0x1c) Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When the controller supports the Read LE Resolv List size feature, the maximum list size are read and now stored. Before patch: < HCI Command: LE Read White List... (0x08|0x000f) plen 0 openbmc#55 [hci0] 17.979791 > HCI Event: Command Complete (0x0e) plen 5 openbmc#56 [hci0] 17.980629 LE Read White List Size (0x08|0x000f) ncmd 1 Status: Success (0x00) Size: 25 < HCI Command: LE Clear White List (0x08|0x0010) plen 0 openbmc#57 [hci0] 17.980786 > HCI Event: Command Complete (0x0e) plen 4 openbmc#58 [hci0] 17.981627 LE Clear White List (0x08|0x0010) ncmd 1 Status: Success (0x00) < HCI Command: LE Read Maximum Dat.. (0x08|0x002f) plen 0 openbmc#59 [hci0] 17.981786 > HCI Event: Command Complete (0x0e) plen 12 openbmc#60 [hci0] 17.982636 LE Read Maximum Data Length (0x08|0x002f) ncmd 1 Status: Success (0x00) Max TX octets: 251 Max TX time: 17040 Max RX octets: 251 Max RX time: 17040 After patch: < HCI Command: LE Read White List... (0x08|0x000f) plen 0 openbmc#55 [hci0] 13.338168 > HCI Event: Command Complete (0x0e) plen 5 openbmc#56 [hci0] 13.338842 LE Read White List Size (0x08|0x000f) ncmd 1 Status: Success (0x00) Size: 25 < HCI Command: LE Clear White List (0x08|0x0010) plen 0 openbmc#57 [hci0] 13.339029 > HCI Event: Command Complete (0x0e) plen 4 openbmc#58 [hci0] 13.339939 LE Clear White List (0x08|0x0010) ncmd 1 Status: Success (0x00) < HCI Command: LE Read Resolving L.. (0x08|0x002a) plen 0 openbmc#59 [hci0] 13.340152 > HCI Event: Command Complete (0x0e) plen 5 openbmc#60 [hci0] 13.340952 LE Read Resolving List Size (0x08|0x002a) ncmd 1 Status: Success (0x00) Size: 25 < HCI Command: LE Read Maximum Dat.. (0x08|0x002f) plen 0 openbmc#61 [hci0] 13.341180 > HCI Event: Command Complete (0x0e) plen 12 openbmc#62 [hci0] 13.341898 LE Read Maximum Data Length (0x08|0x002f) ncmd 1 Status: Success (0x00) Max TX octets: 251 Max TX time: 17040 Max RX octets: 251 Max RX time: 17040 Signed-off-by: Ankit Navik <ankit.p.navik@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Check for Resolv list supported by controller. So check the supported commmand first before issuing this command i.e.,HCI_OP_LE_CLEAR_RESOLV_LIST Before patch: < HCI Command: LE Read White List... (0x08|0x000f) plen 0 openbmc#55 [hci0] 13.338168 > HCI Event: Command Complete (0x0e) plen 5 openbmc#56 [hci0] 13.338842 LE Read White List Size (0x08|0x000f) ncmd 1 Status: Success (0x00) Size: 25 < HCI Command: LE Clear White List (0x08|0x0010) plen 0 openbmc#57 [hci0] 13.339029 > HCI Event: Command Complete (0x0e) plen 4 openbmc#58 [hci0] 13.339939 LE Clear White List (0x08|0x0010) ncmd 1 Status: Success (0x00) < HCI Command: LE Read Resolving L.. (0x08|0x002a) plen 0 openbmc#59 [hci0] 13.340152 > HCI Event: Command Complete (0x0e) plen 5 openbmc#60 [hci0] 13.340952 LE Read Resolving List Size (0x08|0x002a) ncmd 1 Status: Success (0x00) Size: 25 < HCI Command: LE Read Maximum Dat.. (0x08|0x002f) plen 0 openbmc#61 [hci0] 13.341180 > HCI Event: Command Complete (0x0e) plen 12 openbmc#62 [hci0] 13.341898 LE Read Maximum Data Length (0x08|0x002f) ncmd 1 Status: Success (0x00) Max TX octets: 251 Max TX time: 17040 Max RX octets: 251 Max RX time: 17040 After patch: < HCI Command: LE Read White List... (0x08|0x000f) plen 0 openbmc#55 [hci0] 28.919131 > HCI Event: Command Complete (0x0e) plen 5 openbmc#56 [hci0] 28.920016 LE Read White List Size (0x08|0x000f) ncmd 1 Status: Success (0x00) Size: 25 < HCI Command: LE Clear White List (0x08|0x0010) plen 0 openbmc#57 [hci0] 28.920164 > HCI Event: Command Complete (0x0e) plen 4 openbmc#58 [hci0] 28.920873 LE Clear White List (0x08|0x0010) ncmd 1 Status: Success (0x00) < HCI Command: LE Read Resolving L.. (0x08|0x002a) plen 0 openbmc#59 [hci0] 28.921109 > HCI Event: Command Complete (0x0e) plen 5 openbmc#60 [hci0] 28.922016 LE Read Resolving List Size (0x08|0x002a) ncmd 1 Status: Success (0x00) Size: 25 < HCI Command: LE Clear Resolving... (0x08|0x0029) plen 0 openbmc#61 [hci0] 28.922166 > HCI Event: Command Complete (0x0e) plen 4 openbmc#62 [hci0] 28.922872 LE Clear Resolving List (0x08|0x0029) ncmd 1 Status: Success (0x00) < HCI Command: LE Read Maximum Dat.. (0x08|0x002f) plen 0 openbmc#63 [hci0] 28.923117 > HCI Event: Command Complete (0x0e) plen 12 openbmc#64 [hci0] 28.924030 LE Read Maximum Data Length (0x08|0x002f) ncmd 1 Status: Success (0x00) Max TX octets: 251 Max TX time: 17040 Max RX octets: 251 Max RX time: 17040 Signed-off-by: Ankit Navik <ankit.p.navik@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
When calling kmalloc with GFP_KERNEL in case CONFIG_SLOB is unset, kmem_cache_alloc_trace is called. In case CONFIG_TRACING is set, kmem_cache_alloc_trace will ball slab_alloc, which will call slab_pre_alloc_hook which might_sleep_if. The context in which it is called in this case, the intel_sst_interrupt_mrfld, calling a sleeping kmalloc generates a BUG(): Fixes: 972b0d4 ("ASoC: Intel: remove GFP_ATOMIC, use GFP_KERNEL") [ 20.250671] BUG: sleeping function called from invalid context at mm/slab.h:422 [ 20.250683] in_atomic(): 1, irqs_disabled(): 1, pid: 1791, name: Chrome_IOThread [ 20.250690] CPU: 0 PID: 1791 Comm: Chrome_IOThread Tainted: G W 4.19.43 openbmc#61 [ 20.250693] Hardware name: GOOGLE Kefka, BIOS Google_Kefka.7287.337.0 03/02/2017 [ 20.250697] Call Trace: [ 20.250704] <IRQ> [ 20.250716] dump_stack+0x7e/0xc3 [ 20.250725] ___might_sleep+0x12a/0x140 [ 20.250731] kmem_cache_alloc_trace+0x53/0x1c5 [ 20.250736] ? update_cfs_rq_load_avg+0x17e/0x1aa [ 20.250740] ? cpu_load_update+0x6c/0xc2 [ 20.250746] sst_create_ipc_msg+0x2d/0x88 [ 20.250752] intel_sst_interrupt_mrfld+0x12a/0x22c [ 20.250758] __handle_irq_event_percpu+0x133/0x228 [ 20.250764] handle_irq_event_percpu+0x35/0x7a [ 20.250768] handle_irq_event+0x36/0x55 [ 20.250773] handle_fasteoi_irq+0xab/0x16c [ 20.250779] handle_irq+0xd9/0x11e [ 20.250785] do_IRQ+0x54/0xe0 [ 20.250791] common_interrupt+0xf/0xf [ 20.250795] </IRQ> [ 20.250800] RIP: 0010:__lru_cache_add+0x4e/0xad [ 20.250806] Code: 00 01 48 c7 c7 b8 df 01 00 65 48 03 3c 25 28 f1 00 00 48 8b 48 08 48 89 ca 48 ff ca f6 c1 01 48 0f 44 d0 f0 ff 42 34 0f b6 0f <89> ca fe c2 88 17 48 89 44 cf 08 80 fa 0f 74 0e 48 8b 08 66 85 c9 [ 20.250809] RSP: 0000:ffffa568810bfd98 EFLAGS: 00000202 ORIG_RAX: ffffffffffffffd6 [ 20.250814] RAX: ffffd3b904eb1940 RBX: ffffd3b904eb1940 RCX: 0000000000000004 [ 20.250817] RDX: ffffd3b904eb1940 RSI: ffffa10ee5c47450 RDI: ffffa10efba1dfb8 [ 20.250821] RBP: ffffa568810bfda8 R08: ffffa10ef9c741c1 R09: dead000000000100 [ 20.250824] R10: 0000000000000000 R11: 0000000000000000 R12: ffffa10ee8d52a40 [ 20.250827] R13: ffffa10ee8d52000 R14: ffffa10ee5c47450 R15: 800000013ac65067 [ 20.250835] lru_cache_add_active_or_unevictable+0x4e/0xb8 [ 20.250841] handle_mm_fault+0xd98/0x10c4 [ 20.250848] __do_page_fault+0x235/0x42d [ 20.250853] ? page_fault+0x8/0x30 [ 20.250858] do_page_fault+0x3d/0x17a [ 20.250862] ? page_fault+0x8/0x30 [ 20.250866] page_fault+0x1e/0x30 [ 20.250872] RIP: 0033:0x7962fdea9304 [ 20.250875] Code: 0f 11 4c 17 f0 c3 48 3b 15 f1 26 31 00 0f 83 e2 00 00 00 48 39 f7 72 0f 74 12 4c 8d 0c 16 4c 39 cf 0f 82 63 01 00 00 48 89 d1 <f3> a4 c3 80 fa 08 73 12 80 fa 04 73 1e 80 fa 01 77 26 72 05 0f b6 [ 20.250879] RSP: 002b:00007962f4db5468 EFLAGS: 00010206 [ 20.250883] RAX: 00003c8cc9d47008 RBX: 0000000000000000 RCX: 0000000000001b48 [ 20.250886] RDX: 0000000000002b40 RSI: 00003c8cc9551000 RDI: 00003c8cc9d48000 [ 20.250890] RBP: 00007962f4db5820 R08: 0000000000000000 R09: 00003c8cc9552b48 [ 20.250893] R10: 0000562dd1064d30 R11: 00003c8cc825b908 R12: 00003c8cc966d3c0 [ 20.250896] R13: 00003c8cc9e280c0 R14: 0000000000000000 R15: 0000000000000000 Signed-off-by: Alex Levin <levinale@chromium.org> Signed-off-by: Mark Brown <broonie@kernel.org>
…y section commit 8068df3 upstream. When we remove an early section, we don't free the usage map, as the usage maps of other sections are placed into the same page. Once the section is removed, it is no longer an early section (especially, the memmap is freed). When we re-add that section, the usage map is reused, however, it is no longer an early section. When removing that section again, we try to kfree() a usage map that was allocated during early boot - bad. Let's check against PageReserved() to see if we are dealing with an usage map that was allocated during boot. We could also check against !(PageSlab(usage_page) || PageCompound(usage_page)), but PageReserved() is cleaner. Can be triggered using memtrace under ppc64/powernv: $ mount -t debugfs none /sys/kernel/debug/ $ echo 0x20000000 > /sys/kernel/debug/powerpc/memtrace/enable $ echo 0x20000000 > /sys/kernel/debug/powerpc/memtrace/enable ------------[ cut here ]------------ kernel BUG at mm/slub.c:3969! Oops: Exception in kernel mode, sig: 5 [#1] LE PAGE_SIZE=3D64K MMU=3DHash SMP NR_CPUS=3D2048 NUMA PowerNV Modules linked in: CPU: 0 PID: 154 Comm: sh Not tainted 5.5.0-rc2-next-20191216-00005-g0be1dba7b7c0 #61 NIP kfree+0x338/0x3b0 LR section_deactivate+0x138/0x200 Call Trace: section_deactivate+0x138/0x200 __remove_pages+0x114/0x150 arch_remove_memory+0x3c/0x160 try_remove_memory+0x114/0x1a0 __remove_memory+0x20/0x40 memtrace_enable_set+0x254/0x850 simple_attr_write+0x138/0x160 full_proxy_write+0x8c/0x110 __vfs_write+0x38/0x70 vfs_write+0x11c/0x2a0 ksys_write+0x84/0x140 system_call+0x5c/0x68 ---[ end trace 4b053cbd84e0db62 ]--- The first invocation will offline+remove memory blocks. The second invocation will first add+online them again, in order to offline+remove them again (usually we are lucky and the exact same memory blocks will get "reallocated"). Tested on powernv with boot memory: The usage map will not get freed. Tested on x86-64 with DIMMs: The usage map will get freed. Using Dynamic Memory under a Power DLAPR can trigger it easily. Triggering removal (I assume after previously removed+re-added) of memory from the HMC GUI can crash the kernel with the same call trace and is fixed by this patch. Link: http://lkml.kernel.org/r/20191217104637.5509-1-david@redhat.com Fixes: 326e1b8 ("mm/sparsemem: introduce a SECTION_IS_EARLY flag") Signed-off-by: David Hildenbrand <david@redhat.com> Tested-by: Pingfan Liu <piliu@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7733306 upstream. The "inline" keyword is a hint for the compiler to inline a function. The functions system_uses_irq_prio_masking() and gic_write_pmr() are used by the code running at EL2 on a non-VHE system, so mark them as __always_inline to make sure they'll always be part of the .hyp.text section. This fixes the following splat when trying to run a VM: [ 47.625273] Kernel panic - not syncing: HYP panic: [ 47.625273] PS:a00003c9 PC:0000ca0b42049fc4 ESR:86000006 [ 47.625273] FAR:0000ca0b42049fc4 HPFAR:0000000010001000 PAR:0000000000000000 [ 47.625273] VCPU:0000000000000000 [ 47.647261] CPU: 1 PID: 217 Comm: kvm-vcpu-0 Not tainted 5.8.0-rc1-ARCH+ #61 [ 47.654508] Hardware name: Globalscale Marvell ESPRESSOBin Board (DT) [ 47.661139] Call trace: [ 47.663659] dump_backtrace+0x0/0x1cc [ 47.667413] show_stack+0x18/0x24 [ 47.670822] dump_stack+0xb8/0x108 [ 47.674312] panic+0x124/0x2f4 [ 47.677446] panic+0x0/0x2f4 [ 47.680407] SMP: stopping secondary CPUs [ 47.684439] Kernel Offset: disabled [ 47.688018] CPU features: 0x240402,20002008 [ 47.692318] Memory Limit: none [ 47.695465] ---[ end Kernel panic - not syncing: HYP panic: [ 47.695465] PS:a00003c9 PC:0000ca0b42049fc4 ESR:86000006 [ 47.695465] FAR:0000ca0b42049fc4 HPFAR:0000000010001000 PAR:0000000000000000 [ 47.695465] VCPU:0000000000000000 ]--- The instruction abort was caused by the code running at EL2 trying to fetch an instruction which wasn't mapped in the EL2 translation tables. Using objdump showed the two functions as separate symbols in the .text section. Fixes: 85738e0 ("arm64: kvm: Unmask PMR before entering guest") Cc: stable@vger.kernel.org Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20200618171254.1596055-1-alexandru.elisei@arm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7733306 upstream. The "inline" keyword is a hint for the compiler to inline a function. The functions system_uses_irq_prio_masking() and gic_write_pmr() are used by the code running at EL2 on a non-VHE system, so mark them as __always_inline to make sure they'll always be part of the .hyp.text section. This fixes the following splat when trying to run a VM: [ 47.625273] Kernel panic - not syncing: HYP panic: [ 47.625273] PS:a00003c9 PC:0000ca0b42049fc4 ESR:86000006 [ 47.625273] FAR:0000ca0b42049fc4 HPFAR:0000000010001000 PAR:0000000000000000 [ 47.625273] VCPU:0000000000000000 [ 47.647261] CPU: 1 PID: 217 Comm: kvm-vcpu-0 Not tainted 5.8.0-rc1-ARCH+ #61 [ 47.654508] Hardware name: Globalscale Marvell ESPRESSOBin Board (DT) [ 47.661139] Call trace: [ 47.663659] dump_backtrace+0x0/0x1cc [ 47.667413] show_stack+0x18/0x24 [ 47.670822] dump_stack+0xb8/0x108 [ 47.674312] panic+0x124/0x2f4 [ 47.677446] panic+0x0/0x2f4 [ 47.680407] SMP: stopping secondary CPUs [ 47.684439] Kernel Offset: disabled [ 47.688018] CPU features: 0x240402,20002008 [ 47.692318] Memory Limit: none [ 47.695465] ---[ end Kernel panic - not syncing: HYP panic: [ 47.695465] PS:a00003c9 PC:0000ca0b42049fc4 ESR:86000006 [ 47.695465] FAR:0000ca0b42049fc4 HPFAR:0000000010001000 PAR:0000000000000000 [ 47.695465] VCPU:0000000000000000 ]--- The instruction abort was caused by the code running at EL2 trying to fetch an instruction which wasn't mapped in the EL2 translation tables. Using objdump showed the two functions as separate symbols in the .text section. Fixes: 85738e0 ("arm64: kvm: Unmask PMR before entering guest") Cc: stable@vger.kernel.org Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20200618171254.1596055-1-alexandru.elisei@arm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 704adfb upstream. The histogram logic was allowing events with char * pointers to be used as normal strings. But it was easy to crash the kernel with: # echo 'hist:keys=filename' > events/syscalls/sys_enter_openat/trigger And open some files, and boom! BUG: unable to handle page fault for address: 00007f2ced0c3280 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 1173fa067 P4D 1173fa067 PUD 1171b6067 PMD 1171dd067 PTE 0 Oops: 0000 [#1] PREEMPT SMP CPU: 6 PID: 1810 Comm: cat Not tainted 5.13.0-rc5-test+ #61 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016 RIP: 0010:strlen+0x0/0x20 Code: f6 82 80 2a 0b a9 20 74 11 0f b6 50 01 48 83 c0 01 f6 82 80 2a 0b a9 20 75 ef c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 <80> 3f 00 74 10 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 RSP: 0018:ffffbdbf81567b50 EFLAGS: 00010246 RAX: 0000000000000003 RBX: ffff93815cdb3800 RCX: ffff9382401a22d0 RDX: 0000000000000100 RSI: 0000000000000000 RDI: 00007f2ced0c3280 RBP: 0000000000000100 R08: ffff9382409ff074 R09: ffffbdbf81567c98 R10: ffff9382409ff074 R11: 0000000000000000 R12: ffff9382409ff074 R13: 0000000000000001 R14: ffff93815a744f00 R15: 00007f2ced0c3280 FS: 00007f2ced0f8580(0000) GS:ffff93825a800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f2ced0c3280 CR3: 0000000107069005 CR4: 00000000001706e0 Call Trace: event_hist_trigger+0x463/0x5f0 ? find_held_lock+0x32/0x90 ? sched_clock_cpu+0xe/0xd0 ? lock_release+0x155/0x440 ? kernel_init_free_pages+0x6d/0x90 ? preempt_count_sub+0x9b/0xd0 ? kernel_init_free_pages+0x6d/0x90 ? get_page_from_freelist+0x12c4/0x1680 ? __rb_reserve_next+0xe5/0x460 ? ring_buffer_lock_reserve+0x12a/0x3f0 event_triggers_call+0x52/0xe0 ftrace_syscall_enter+0x264/0x2c0 syscall_trace_enter.constprop.0+0x1ee/0x210 do_syscall_64+0x1c/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae Where it triggered a fault on strlen(key) where key was the filename. The reason is that filename is a char * to user space, and the histogram code just blindly dereferenced it, with obvious bad results. I originally tried to use strncpy_from_user/kernel_nofault() but found that there's other places that its dereferenced and not worth the effort. Just do not allow "char *" to act like strings. Link: https://lkml.kernel.org/r/20210715000206.025df9d2@rorschach.local.home Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Tzvetomir Stoyanov <tz.stoyanov@gmail.com> Cc: stable@vger.kernel.org Acked-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Tom Zanussi <zanussi@kernel.org> Fixes: 79e577c ("tracing: Support string type key properly") Fixes: 5967bd5 ("tracing: Let filter_assign_type() detect FILTER_PTR_STRING") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Here are steps to fail:
[network works fine]
The text was updated successfully, but these errors were encountered: