add kernel coredump and analysis on sonic kernel #3276
Can you describe the test you have done?
Yes, 256M is correct. I cherry-picked this from dev-master; let me check why this happened.
------------------------------------------------------------------
From:Jipan Yang <notifications@github.com>
Sent At:2019 Aug. 2 (Fri.) 10:44
To:Azure/sonic-buildimage <sonic-buildimage@noreply.github.com>
Cc:Siyuan <siyuan.sun@alibaba-inc.com>; Author <author@noreply.github.com>
Subject:Re: [Azure/sonic-buildimage] apply kdump supported package and config on fsroot (#3276)
@jipanyang commented on this pull request.
In files/image_config/platform/rc.local:
@@ -318,6 +318,11 @@ if [ -f $FIRST_BOOT_FILE ]; then
# Initialize the SONiC's grub config
mv /host/grub.cfg /host/grub/grub.cfg
fi
+ sed -i 's/[^M] quiet/ crashkernel=768M quiet/' /host/grub/grub.cfg
Please check whether 768M is absolutely necessary, 256M could be more reasonable?
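For reference, a minimal sketch of how the reservation could be added with the size as a parameter, so that 256M versus 768M is a one-argument change. `add_crashkernel` and the sample kernel line are hypothetical and not part of this PR; the sketch also skips lines that already carry `crashkernel=`, which the one-liner in the diff does not do.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): insert crashkernel=<size> ahead of
# " quiet" on lines that do not already contain a crashkernel= parameter.
add_crashkernel() {
    cfg="$1"     # path to a grub.cfg-style file
    size="$2"    # reservation size, e.g. 256M
    # Write to a temporary file and move it back, so no in-place flag is needed.
    sed '/crashkernel=/!s/ quiet/ crashkernel='"$size"' quiet/' "$cfg" > "$cfg.tmp" &&
        mv "$cfg.tmp" "$cfg"
}

# Demo on a throwaway file; the kernel line below is made up for illustration.
demo_cfg="$(mktemp)"
printf '%s\n' 'linux /image/boot/vmlinuz root=/dev/sda3 rw quiet' > "$demo_cfg"
add_crashkernel "$demo_cfg" 256M
add_crashkernel "$demo_cfg" 256M    # second call changes nothing
cat "$demo_cfg"
rm -f "$demo_cfg"
```

Running the helper twice leaves a single `crashkernel=256M` on the line, which matters if rc.local can execute the first-boot path more than once.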
Can you describe the test you have done?
Just add a test section to the commit comments.
Can you please share the console output after you issued the commands?
Thanks
Sure, please see below: [1568872.292681] sysrq: SysRq : Trigger a crash
@sun-siyuan Thanks
Yes, reboot is the default action after a crash. If we need to keep the data flow running temporarily without a reboot, that could be done as well.
------------------------------------------------------------------
From:pavel-shirshov <notifications@github.com>
Sent At:2019 Aug. 5 (Mon.) 18:06
To:Azure/sonic-buildimage <sonic-buildimage@noreply.github.com>
Cc:Siyuan <siyuan.sun@alibaba-inc.com>; Mention <mention@noreply.github.com>
Subject:Re: [Azure/sonic-buildimage] apply kdump supported package and config on fsroot (#3276)
@sun-siyuan
Thank you for the output.
But after the crash the system rebooted itself?
Thanks
Thank you for the patch.
This patch looks good to me except for two things:
- 256M of memory would be reserved for kdump.
- Reboot would take longer after a kernel crash.
We probably need to make this feature an option in sonic-buildimage.
But if @lguohan is OK with it, we can keep it as it is. I'm ready to approve it.
Thanks
How do we decide to reserve 256M? I also agree with Pavel: we should make it optional.
Can you please make this feature optional with default no?
See comments from @lguohan
We were using 768M per the Red Hat recommendation, but that could waste running memory, so we tested with 256M, which works fine. I just updated the PR to make it optional.
Please check the new diff, which makes this optional and defaults to NO.
@pavel-shirshov please let me know whether you are OK with the new changes.
retest this please
Besides the build option, I think we should have a command to enable/disable this kernel crash dump feature, like config kdump enable --size=265M
c831d57 to 8d723c5
Is there any plan to update the PR based on the feedback?
This is addressed as well; the PR will be updated once I get all tests done.
------------------------------------------------------------------
From:lguohan <notifications@github.com>
Sent At:2019 Sep. 3 (Tue.) 12:31
To:Azure/sonic-buildimage <sonic-buildimage@noreply.github.com>
Cc:Siyuan <siyuan.sun@alibaba-inc.com>; Mention <mention@noreply.github.com>
Subject:Re: [Azure/sonic-buildimage] apply kdump supported package and config on fsroot (#3276)
@lguohan commented on this pull request.
In files/build_templates/sonic_debian_extension.j2:
@@ -331,6 +331,17 @@ sudo LANG=C DEBIAN_FRONTEND=noninteractive chroot $FILESYSTEM_ROOT apt-get purge
sudo LANG=C DEBIAN_FRONTEND=noninteractive chroot $FILESYSTEM_ROOT apt-get clean -y
sudo LANG=C DEBIAN_FRONTEND=noninteractive chroot $FILESYSTEM_ROOT apt-get autoremove -y
+# install kernel dump utility
+{%- if sonic_kdump_enable == "y" %}
+{% set coredump = [ "crash", "makedumpfile" ] -%}
+{% for pkg in coredump -%}
+sudo LANG=C DEBIAN_FRONTEND=noninteractive chroot $FILESYSTEM_ROOT apt-get install -y {{ pkg }}
+{% endfor -%}
+sudo LANG=C DEBIAN_FRONTEND=noninteractive chroot $FILESYSTEM_ROOT apt-get install -y kdump-tools || true
+sudo sed -i 's/\/MODULES=dep\//\/MODULES=most\//' $FILESYSTEM_ROOT/etc/kernel/postinst.d/kdump-tools
can you address this question?
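To make the effect of that last sed line easier to review, here is a small sketch of what the expression rewrites. The sample input is only an assumption about what the kdump-tools hook might contain (a nested sed whose replacement text is `MODULES=dep`); check the actual file under $FILESYSTEM_ROOT/etc/kernel/postinst.d/ before relying on it. `rewrite_modules` is a hypothetical wrapper name.

```shell
#!/bin/sh
# Hypothetical wrapper around the exact expression from the diff. The pattern
# includes the surrounding slashes, so it only matches the literal text
# "/MODULES=dep/" and leaves a bare "MODULES=dep" assignment untouched.
rewrite_modules() {
    sed 's/\/MODULES=dep\//\/MODULES=most\//'
}

# Assumed shape of a line inside the hook (illustration only).
printf '%s\n' "sed -e 's/MODULES=.*/MODULES=dep/' initramfs.conf" | rewrite_modules

# A plain assignment has no surrounding slashes, so it passes through unchanged.
printf '%s\n' 'MODULES=dep' | rewrite_modules
```

The same substitution reads more clearly with a different delimiter, for example `s|/MODULES=dep/|/MODULES=most/|`, which behaves identically.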
See comments.
The vsimage build failed with a no-space error; please fix and trigger a retest. WARNING: No swap limit support Step 1/18 : FROM debian:stretch
Summary: add kdump package to j2 template and config after
Test Plan: test with new image, no issue observed
Reviewers: P604087
Subscribers: P604087
Differential Revision: https://aone.alibaba-inc.com/code/D891921
correct crashkernel size to 256M, error introduced by cherry-pick
fix SONIC_ENABLE_KDUMP
…d build option to en/disable it
The test failed with the error below and needs a retest: Setting status of 451664d to FAILURE with url https://sonic-jenkins.westus2.cloudapp.azure.com/job/broadcom/job/buildimage-brcm-all-pr/1278/ and message: 'Build finished. No test results found.'
retest please
please retest
retest please
retest all please
retest this please
core dump added by broadcom
Summary: add kdump package to j2 template and config after
This PR adds kdump to capture the kernel crash core for further analysis with the crash tool.
The PR contains two parts:
1. Install the kdump tool chain in the host environment
2. Configure kdump both at boot, via grub.cfg, and at the system level
Tests done:
Test the build process: build sonic-broadcom.bin and sonic-aboot-broadcom.swi
-rwxr-xr-x 1 sun sun 562086493 Jul 28 21:28 sonic-aboot-broadcom.swi
-rw-r--r-- 1 sun sun 215882 Jul 28 21:28 sonic-aboot-broadcom.swi.log
-rwxr-xr-x 1 sun sun 569867542 Jul 28 00:52 sonic-broadcom.bin
-rw-r--r-- 1 sun sun 285585 Jul 28 00:52 sonic-broadcom.bin.log
Test the image: load sonic-broadcom.bin onto the switch
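One way the on-switch check could be scripted is sketched below. `crashkernel_from_cmdline` is a hypothetical helper, not something this PR adds, and the sample command line is made up; on a booted switch the argument would be the contents of /proc/cmdline.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first crashkernel= parameter
# found in a kernel command line, or nothing if none is present.
crashkernel_from_cmdline() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^crashkernel=//p' | head -n 1
}

# On the switch this would be: crashkernel_from_cmdline "$(cat /proc/cmdline)"
# The line below is an illustrative sample, not captured output.
crashkernel_from_cmdline 'BOOT_IMAGE=/image/boot/vmlinuz root=/dev/sda3 rw crashkernel=256M quiet'

# After confirming the reservation, a test crash can be triggered as root with
#   echo c > /proc/sysrq-trigger
# which takes the system down immediately, so it is left as a comment here.
```

An empty result means the running kernel was booted without a reservation, in which case triggering a crash would not produce a dump.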
Signed-off-by: siyuan sun siyuan.sun@alibaba-inc.com