[DellEMC] Watchdog support DellEMCS6100 #3187
sudo sed -i 's/run_watchdog=1/run_watchdog=0/' $FILESYSTEM_ROOT/etc/default/watchdog
sudo rm -rf $FILESYSTEM_ROOT/lib/systemd/system/wd_keepalive.service
sudo rm -rf $FILESYSTEM_ROOT/etc/init.d/wd_keepalive
- The watchdog device is enabled in watchdog.conf.
- The watchdog can be enabled only through the sonic_platform API, as needed.
- Removed wd_keepalive support.
# Enable watchdog with nowayout
rmmod iTCO_wdt
modprobe iTCO_wdt nowayout=1
- Added nowayout support.
- As a result, once the watchdog is started, it cannot be stopped.
When the device is closed, the watchdog is disabled, unless the "Magic
Close" feature is supported (see below). This is not always such a
good idea, since if there is a bug in the watchdog daemon and it
crashes the system will not reboot. Because of this, some of the
drivers support the configuration option "Disable watchdog shutdown on
close", CONFIG_WATCHDOG_NOWAYOUT. If it is set to Y when compiling
the kernel, there is no way of disabling the watchdog once it has been
started. So, if the watchdog daemon crashes, the system will reboot
after the timeout has passed. Watchdog devices also usually support
the nowayout module parameter so that this option can be controlled at
runtime.
https://www.kernel.org/doc/Documentation/watchdog/watchdog-api.txt
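The open, ping, and "Magic Close" sequence described in the kernel document can be sketched in Python as follows. This is only an illustration and is not code from this PR: the helper names `kick`, `close_watchdog`, and `run_once` are hypothetical, and the sketch assumes a driver that supports Magic Close. With nowayout=1, writing 'V' before closing is not expected to stop the watchdog.

```python
import os

# Magic character described in the kernel watchdog API document.
MAGIC_CLOSE = b"V"


def kick(fd: int) -> None:
    """Ping the watchdog by writing a byte other than 'V' to the device."""
    os.write(fd, b"\0")


def close_watchdog(fd: int, disarm: bool) -> None:
    """Close the device, writing 'V' first when a clean disarm is intended.

    With nowayout set, the driver is expected to keep the watchdog
    running even after the magic character has been written.
    """
    if disarm:
        os.write(fd, MAGIC_CLOSE)
    os.close(fd)


def run_once(path: str = "/dev/watchdog", disarm: bool = True) -> None:
    """Hypothetical helper: open the device, ping once, then close it."""
    fd = os.open(path, os.O_WRONLY)
    try:
        kick(fd)
    finally:
        close_watchdog(fd, disarm)
```

Note that opening the real /dev/watchdog starts the timer, so on hardware this should only be tried where a reboot is acceptable.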
The platform API provides a function to "disarm" the watchdog. I'm not sure how frequently (or even if) this will be called. However, with "nowayout" enabled, it appears that there is no way to disable the watchdog after it has started. Is this correct?
- Yes. We cannot stop the watchdog once it is enabled with the "nowayout" option.
- nowayout is needed because, if the user-space watchdog daemon crashes and happens to close the /dev/watchdog node properly, the watchdog may never kick in.
- The nowayout option is used to avoid this (slight) possibility.
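One way to confirm at runtime whether nowayout is in effect is to read the watchdog class attribute in sysfs. The sketch below is not part of this PR. It assumes the kernel was built with CONFIG_WATCHDOG_SYSFS and that the device is registered as watchdog0; the function name `read_nowayout` is hypothetical.

```python
from pathlib import Path
from typing import Optional


def read_nowayout(sysfs_dir: str = "/sys/class/watchdog/watchdog0") -> Optional[bool]:
    """Return the nowayout state, or None when the attribute is not exposed.

    Assumes the watchdog sysfs interface is available; the default
    directory is an assumption about how the device is numbered.
    """
    attr = Path(sysfs_dir) / "nowayout"
    try:
        return attr.read_text().strip() == "1"
    except OSError:
        # Attribute missing or unreadable (for example, sysfs support disabled).
        return None
```

A result of `None` only means the attribute could not be read; it says nothing about whether nowayout is set.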
self.write_config(
    self.WATCHDOG_DEFAULT_FILE,
    "run_watchdog=1",
    "run_watchdog=0")
This will disable the watchdog at next boot. However, this function is meant to disable the watchdog at runtime, in the event there may ever be a need. Is this not possible because of the "nowayout" feature enabled above?
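To make the boot-time nature of this change concrete, here is a minimal sketch of what a search-and-replace helper of this kind might do. The implementation of `write_config` is not shown in this conversation, so `replace_in_file` below is a hypothetical stand-in rather than the PR's code.

```python
def replace_in_file(path: str, old: str, new: str) -> bool:
    """Replace every occurrence of `old` with `new` in the file at `path`.

    Hypothetical stand-in for a write_config-style helper. Returns False
    and leaves the file untouched when `old` is not present.
    """
    with open(path, "r") as f:
        content = f.read()
    if old not in content:
        return False
    with open(path, "w") as f:
        f.write(content.replace(old, new))
    return True
```

Editing the defaults file this way only changes what is read the next time the watchdog service starts; it does not touch a watchdog device that is already armed.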
Yes, you are right. The trade-off is that with nowayout there is no way to stop the watchdog (even with the magic value).
@lguohan: Does this approach look good to you? How do you feel about the "nowayout" feature?
- What I did
- How I did it
- How to verify it
- Description for the changelog
- A picture of a cute animal (not mandatory but encouraged)
watchdot-test-script.zip