[Intel-SIG] [Meteor Lake] Sync 6.6 EDAC driver patches to Deepin 6.6 #79
Merged: matrix-wsk merged 9 commits into deepin-community:linux-6.6.y from shiqingd:sandbox/shiqingd/deepin/6.6/edac on Feb 23, 2024
Errors of some I/O devices can be signaled by MCE and logged in the IOMCA bank. Add the MCACOD code for generic I/O errors and related macros for MCi_MISC to support IOMCA logging. See Intel Software Developer's Manual, version 071, volume 3B, section "IOMCA".

Intel-SIG: x86/mce: Add MCACOD code for generic I/O error.
intel/linux-intel-lts@1afcf245df31

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
[ Qingdong Shi: amend commit log ]
Signed-off-by: Qingdong Shi <qingdong.shi@intel.com>
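
As a rough illustration of the kind of check this patch enables, the sketch below tests whether a machine-check status value carries the generic I/O error code. The MCA error code sits in the low 16 bits of MCi_STATUS. The constant name MCACOD_IOERR, its value 0x0e0b, and the helper name are assumptions made for illustration and are not taken from the patch; verify them against the SDM "IOMCA" section.

```c
#include <stdbool.h>
#include <stdint.h>

/* MCA error code field: low 16 bits of MCi_STATUS. */
#define MCI_STATUS_MCACOD_MASK 0xffffULL

/* ASSUMPTION: value of the "generic I/O error" code; check the SDM. */
#define MCACOD_IOERR 0x0e0bULL

/* Hypothetical helper: true if the status word reports a generic I/O error. */
bool mce_is_generic_io_error(uint64_t status)
{
	return (status & MCI_STATUS_MCACOD_MASK) == MCACOD_IOERR;
}
```

A caller would typically apply such a test to the status of the bank that logged the event before decoding the bank's MCi_MISC contents.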
Integrated Error Handlers (IEHs) are PCIe devices which aggregate and report error events of different severities (correctable, non-fatal uncorrectable, and fatal uncorrectable) from various I/O devices, e.g., PCIe devices, legacy PCI devices. Each error severity is notified by one of {SMI, NMI, MCE} which is configured by BIOS/platform firmware. The first IEH-supported platform is Intel Tiger Lake-U CPU. The driver reads/prints the error severity and error source (bus/device/function) logged in the IEH(s) and restarts the system on fatal I/O device error.

Intel-SIG: EDAC/ieh: Add I/O device EDAC driver for Intel CPUs with IEH.
intel/linux-intel-lts@9f721ed88580

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
[ Qingdong Shi: amend commit log ]
Signed-off-by: Qingdong Shi <qingdong.shi@intel.com>
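
The policy described above (report severity and source, restart only on fatal errors) can be sketched in plain C. All type and function names here are hypothetical and do not come from the ieh_edac driver. The bus/device/function split assumes the error source is logged as a standard 16-bit PCI requester ID, which is an assumption about the IEH log format.

```c
#include <stdbool.h>
#include <stdint.h>

/* The three severities named in the commit message. */
enum io_err_severity {
	IO_ERR_CORRECTABLE,
	IO_ERR_NONFATAL,
	IO_ERR_FATAL,
};

struct pci_bdf {
	uint8_t bus;
	uint8_t dev;
	uint8_t fn;
};

/* Split a 16-bit PCI requester ID: bus[15:8], device[7:3], function[2:0]. */
struct pci_bdf bdf_from_source_id(uint16_t id)
{
	struct pci_bdf b = {
		.bus = (uint8_t)(id >> 8),
		.dev = (uint8_t)((id >> 3) & 0x1f),
		.fn  = (uint8_t)(id & 0x7),
	};
	return b;
}

/* Only a fatal uncorrectable error warrants a system restart. */
bool io_err_needs_restart(enum io_err_severity sev)
{
	return sev == IO_ERR_FATAL;
}
```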
Tiger Lake-H SoC shares the same Integrated Error Handler (IEH) architecture with Tiger Lake-U, so it can use the same ieh_edac driver. Add the Tiger Lake-H IEH device ID for I/O device EDAC support.

Intel-SIG: EDAC/ieh: Add I/O device EDAC support for Intel Tiger Lake-H SoC.
intel/linux-intel-lts@39b1cbd8dc70

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
[ Qingdong Shi: amend commit log ]
Signed-off-by: Qingdong Shi <qingdong.shi@intel.com>
The igen6_edac driver is the root that captures In-Band ECC error events. Some external modules want to be notified about In-Band ECC errors so they can perform their own error handling. Add registration APIs that let those external modules subscribe to In-Band ECC errors.

Intel-SIG: EDAC/igen6: Add registration APIs for In-Band ECC error notification.
intel/linux-intel-lts@da1536bdff36

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
[ Qingdong Shi: amend commit log ]
Signed-off-by: Qingdong Shi <qingdong.shi@intel.com>
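
The register/notify pattern this commit describes can be sketched in userspace C as a simple callback list. This is not the API added by the patch: every name below (ecc_err_info, ecc_notifier, ecc_register_notifier, and so on) is hypothetical, and the sketch omits the locking that a kernel notifier chain would need.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical payload handed to subscribers. */
struct ecc_err_info {
	uint64_t addr;     /* address of the failing access */
	int correctable;   /* non-zero for a corrected error */
};

/* One subscriber: callback, private context, list link. */
struct ecc_notifier {
	void (*notify)(const struct ecc_err_info *info, void *ctx);
	void *ctx;
	struct ecc_notifier *next;
};

static struct ecc_notifier *ecc_notifier_head;

/* Add a subscriber at the head of the list. */
void ecc_register_notifier(struct ecc_notifier *n)
{
	n->next = ecc_notifier_head;
	ecc_notifier_head = n;
}

/* Remove a subscriber; returns 0 on success, -1 if it was not registered. */
int ecc_unregister_notifier(struct ecc_notifier *n)
{
	struct ecc_notifier **pp;

	for (pp = &ecc_notifier_head; *pp; pp = &(*pp)->next) {
		if (*pp == n) {
			*pp = n->next;
			n->next = NULL;
			return 0;
		}
	}
	return -1;
}

/* Called by the error source; returns how many subscribers were notified. */
int ecc_report_error(const struct ecc_err_info *info)
{
	struct ecc_notifier *n;
	int called = 0;

	for (n = ecc_notifier_head; n; n = n->next) {
		n->notify(info, n->ctx);
		called++;
	}
	return called;
}

/* Example subscriber: counts errors in the unsigned int that ctx points to. */
void ecc_count_errors(const struct ecc_err_info *info, void *ctx)
{
	(void)info;
	(*(unsigned int *)ctx)++;
}
```

An external module would fill in an ecc_notifier with its own callback, register it at load time, and unregister it before unloading.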
Mainline: commit c4a5398 ("EDAC/igen6: Add Intel Alder Lake-N SoCs support")
from: v6.8-rc1

Add several Intel Alder Lake-N SoC compute die IDs for EDAC support. Alder Lake-N (one memory controller) is a cut-down derivative of Alder Lake-P (two memory controllers).

Intel-SIG: Upstream commit c4a5398 ("EDAC/igen6: Add Intel Alder Lake-N SoCs support").

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
[ Qingdong Shi: amend commit log ]
Signed-off-by: Qingdong Shi <qingdong.shi@intel.com>
Mainline: commit d23627a ("EDAC/igen6: Add Intel Raptor Lake-P SoCs support")
from: v6.8-rc1

Add several Intel Raptor Lake-P SoC compute die IDs for EDAC support. These Raptor Lake-P SoCs use a similar memory controller and IBECC as the Alder Lake-P SoC, but extend the most significant bit of the error address logged in IBECC from bit 38 to bit 45.

Intel-SIG: Upstream commit d23627a ("EDAC/igen6: Add Intel Raptor Lake-P SoCs support").
upstream link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d23627a7688f&dt=2

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
[ Qingdong Shi: amend commit log ]
Signed-off-by: Qingdong Shi <qingdong.shi@intel.com>
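
To make the effect of moving the most significant address bit from 38 to 45 concrete, here is a small sketch that extracts an address field from a 64-bit error log value with a configurable bit range. The function names are hypothetical, and the lowest address bit used in the usage note (bit 5) is an assumption for illustration; the commit message only states the upper bound.

```c
#include <stdint.h>

/* Mask with bits lo..hi set, inclusive (requires 0 <= lo <= hi <= 63). */
uint64_t bit_range_mask(unsigned int lo, unsigned int hi)
{
	return (~0ULL >> (63 - hi)) & (~0ULL << lo);
}

/* Keep only the address bits lsb..msb of a raw error log value. */
uint64_t err_log_addr(uint64_t log, unsigned int lsb, unsigned int msb)
{
	return log & bit_range_mask(lsb, msb);
}
```

With an assumed lowest address bit of 5, err_log_addr(log, 5, 38) drops any address bits above bit 38, while err_log_addr(log, 5, 45) preserves them, which is why a per-SoC upper bound is needed.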
Mainline: commit a264f71 ("EDAC/igen6: Make get_mchbar() helper function")
from: v6.8-rc1

Add a get_mchbar() helper function to retrieve the BAR address of the memory controller. No functional changes.

Intel-SIG: Upstream commit a264f71 ("EDAC/igen6: Make get_mchbar() helper function").

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
[ Qingdong Shi: amend commit log ]
Signed-off-by: Qingdong Shi <qingdong.shi@intel.com>
Mainline: commit 3c77090 ("EDAC/igen6: Add Intel Meteor Lake-PS SoCs support")
from: v6.8-rc1

Add several Intel Meteor Lake-PS SoC compute die IDs for EDAC support. These Meteor Lake-PS SoCs use a similar memory controller and IBECC as the Alder Lake-P SoC.

Intel-SIG: Upstream commit 3c77090 ("EDAC/igen6: Add Intel Meteor Lake-PS SoCs support").

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
[ Qingdong Shi: amend commit log ]
Signed-off-by: Qingdong Shi <qingdong.shi@intel.com>
Mainline: commit 6807434 ("EDAC/igen6: Add Intel Meteor Lake-P SoCs support")
from: v6.8-rc1

Add several Intel Meteor Lake-P SoC compute die IDs for EDAC support. These Meteor Lake-P SoCs use a similar memory controller and IBECC as the Alder Lake-P SoC.

Intel-SIG: Upstream commit 6807434 ("EDAC/igen6: Add Intel Meteor Lake-P SoCs support").

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
[ Qingdong Shi: amend commit log ]
Signed-off-by: Qingdong Shi <qingdong.shi@intel.com>
opsiff pushed a commit to opsiff/UOS-kernel that referenced this pull request on Jul 30, 2024

commit 667574e873b5f77a220b2a93329689f36fb56d5d upstream.

When trying to demote 1G hugetlb folios, a lockdep warning is observed:

  ============================================
  WARNING: possible recursive locking detected
  6.10.0-rc6-00452-ga4d0275fa660-dirty #79 Not tainted
  --------------------------------------------
  bash/710 is trying to acquire lock:
  ffffffff8f0a7850 (&h->resize_lock){+.+.}-{3:3}, at: demote_store+0x244/0x460

  but task is already holding lock:
  ffffffff8f0a6f48 (&h->resize_lock){+.+.}-{3:3}, at: demote_store+0xae/0x460

  other info that might help us debug this:
   Possible unsafe locking scenario:

         CPU0
         ----
    lock(&h->resize_lock);
    lock(&h->resize_lock);

   *** DEADLOCK ***

   May be due to missing lock nesting notation

  4 locks held by bash/710:
   #0: ffff8f118439c3f0 (sb_writers#5){.+.+}-{0:0}, at: ksys_write+0x64/0xe0
   #1: ffff8f11893b9e88 (&of->mutex#2){+.+.}-{3:3}, at: kernfs_fop_write_iter+0xf8/0x1d0
   #2: ffff8f1183dc4428 (kn->active#98){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x100/0x1d0
   #3: ffffffff8f0a6f48 (&h->resize_lock){+.+.}-{3:3}, at: demote_store+0xae/0x460

  stack backtrace:
  CPU: 3 PID: 710 Comm: bash Not tainted 6.10.0-rc6-00452-ga4d0275fa660-dirty #79
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
  Call Trace:
   <TASK>
   dump_stack_lvl+0x68/0xa0
   __lock_acquire+0x10f2/0x1ca0
   lock_acquire+0xbe/0x2d0
   __mutex_lock+0x6d/0x400
   demote_store+0x244/0x460
   kernfs_fop_write_iter+0x12c/0x1d0
   vfs_write+0x380/0x540
   ksys_write+0x64/0xe0
   do_syscall_64+0xb9/0x1d0
   entry_SYSCALL_64_after_hwframe+0x77/0x7f
  RIP: 0033:0x7fa61db14887
  RSP: 002b:00007ffc56c48358 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
  RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fa61db14887
  RDX: 0000000000000002 RSI: 000055a030050220 RDI: 0000000000000001
  RBP: 000055a030050220 R08: 00007fa61dbd1460 R09: 000000007fffffff
  R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000002
  R13: 00007fa61dc1b780 R14: 00007fa61dc17600 R15: 00007fa61dc16a00
   </TASK>

Lockdep considers this an AA deadlock because the different resize_lock mutexes reside in the same lockdep class, but this is a false positive. Place them in distinct classes to avoid these warnings.

Link: https://lkml.kernel.org/r/20240712031314.2570452-1-linmiaohe@huawei.com
Fixes: 8531fc6 ("hugetlb: add hugetlb demote page support")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: Muchun Song <muchun.song@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit b10c359e644235bc1c1899981d7203cbf5abf443)
opsiff pushed a commit to opsiff/UOS-kernel that referenced this pull request on Aug 1, 2024

commit 667574e873b5f77a220b2a93329689f36fb56d5d upstream. The commit message is identical to the one in the Jul 30, 2024 reference.
(cherry picked from commit 60c223ed1f6f47ad36da6b4032e3b45cad8fb41f)
opsiff pushed a commit to opsiff/UOS-kernel that referenced this pull request on Aug 4, 2024

commit 667574e873b5f77a220b2a93329689f36fb56d5d upstream. The commit message is identical to the one in the Jul 30, 2024 reference.
(cherry picked from commit 99a49b670ede46ad6f47eef2c93bb31f7db8dfd5)
Avenger-285714 pushed a commit that referenced this pull request on Aug 12, 2024

commit 667574e873b5f77a220b2a93329689f36fb56d5d upstream. The commit message is identical to the one in the Jul 30, 2024 reference.
(cherry picked from commit 99a49b670ede46ad6f47eef2c93bb31f7db8dfd5)
This PR includes the nine patches used for EDAC/igen6 support on Intel Meteor Lake-P SoCs.
Five patches have already been merged upstream by the Linux kernel community.
Upstream:
commit 7f5b45e, upstream 6807434 ("EDAC/igen6: Add Intel Meteor Lake-P SoCs support")
commit 4681a10, upstream 3c77090 ("EDAC/igen6: Add Intel Meteor Lake-PS SoCs support")
commit 72597e7, upstream a264f71 ("EDAC/igen6: Make get_mchbar() helper function")
commit cfa3e9a, upstream d23627a ("EDAC/igen6: Add Intel Raptor Lake-P SoCs support")
commit 8f077b8, upstream c4a5398 ("EDAC/igen6: Add Intel Alder Lake-N SoCs support")
Four patches have already been published in the public intel/linux-intel-lts repository on GitHub:
commit be12908, intel/linux-intel-lts@da1536bdff36 ("EDAC/igen6: Add registration APIs for In-Band ECC error notification")
commit 1673f29, intel/linux-intel-lts@39b1cbd8dc70 ("EDAC/ieh: Add I/O device EDAC support for Intel Tiger Lake-H SoC")
commit 15ecf04, intel/linux-intel-lts@9f721ed88580 ("EDAC/ieh: Add I/O device EDAC driver for Intel CPUs with IEH")
commit 22dbb4d, intel/linux-intel-lts@1afcf245df31 ("x86/mce: Add MCACOD code for generic I/O error")