[Mellanox] Enhance FW upgrade mechanism (sonic-net#16090)
### Why I did it

1. Enhance the mechanism for collecting diagnosis information
   - If the `-v` option is given, pass additional diagnosis flags to mlxfwmanager
   - Collect all the output from mlxfwmanager and print it to syslog if the command fails
2. Abort syncd if waiting for the device or upgrading the firmware fails

Signed-off-by: Stephen Sun <stephens@nvidia.com>

### How I did it

### How to verify it

Regression and manual test
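
The diagnosis mechanism described under "Why I did it" can be sketched roughly as follows. This is a minimal illustration, not the script from this commit: `fake_fw_tool` is a hypothetical stand-in for mlxfwmanager, `run_with_diagnosis` is an invented helper name, and the `2>&1` redirection and printing to stdout instead of syslog are assumptions made to keep the sketch self-contained.

```shell
# Hypothetical stand-in for mlxfwmanager: it always fails and reports
# whether the debug variable reached it.
fake_fw_tool() {
    echo "FLASH_ACCESS_DEBUG=${FLASH_ACCESS_DEBUG:-unset}"
    echo "-E- Fail : device not ready"
    return 1
}

# Run a command with optional VAR=value prefixes, capture its output,
# and print that output only when the command fails.
run_with_diagnosis() {
    diag_flags="$1"
    shift
    # eval makes the shell parse the VAR=value words as assignments
    # instead of treating the first one as a command name.
    diag_output=$(eval "${diag_flags} $*" 2>&1)
    diag_status=$?
    if [ "${diag_status}" -ne 0 ]; then
        echo "${diag_output}"
    fi
    return "${diag_status}"
}

# Usage: the captured output is shown because the stand-in fails.
run_with_diagnosis "FLASH_ACCESS_DEBUG=1" fake_fw_tool || echo "command failed"
```

Passing an empty string as the first argument runs the command without any extra variables, which corresponds to invoking the script without `-v`.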
stephenxs authored Sep 4, 2023
1 parent 78587ce commit b5e8c16
Showing 2 changed files with 18 additions and 3 deletions.
6 changes: 5 additions & 1 deletion files/scripts/syncd.sh
@@ -41,7 +41,11 @@ function startplatform() {
 /usr/bin/flint -d $_MST_DEVICE --clear_semaphore
 fi

-/usr/bin/mlnx-fw-upgrade.sh
+/usr/bin/mlnx-fw-upgrade.sh -v
+if [[ "$?" -ne "${EXIT_SUCCESS}" ]]; then
+debug "Failed to upgrade fw. " "$?" "Restart syncd"
+exit 1
+fi
 /etc/init.d/sxdkernel restart
 debug "Firmware update procedure ended"
 fi
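
One detail in the hunk above is worth illustrating. In bash, `$?` always holds the status of the most recently executed command, so the `$?` passed to `debug` is expanded after the `[[ ... ]]` test has already run and would be expected to report the status of that test (0), not the status of the upgrade script. A minimal sketch of one way to preserve the original status, assuming a hypothetical `fake_upgrade` stand-in for `/usr/bin/mlnx-fw-upgrade.sh -v` and an invented helper name `report_upgrade`:

```shell
# Hypothetical stand-in for the upgrade script; it fails with status 3.
fake_upgrade() { return 3; }

# Save the exit status in a variable before any other command runs,
# then test and report the saved value.
report_upgrade() {
    upgrade_status=0
    "$@" || upgrade_status=$?
    if [ "${upgrade_status}" -ne 0 ]; then
        echo "Failed to upgrade fw. ${upgrade_status} Restart syncd"
        return 1
    fi
    echo "Firmware update procedure ended"
}

# Usage: report the saved status of the failing stand-in.
report_upgrade fake_upgrade || true
```

The message text mirrors the one in the diff; only the point at which the status is captured differs.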
15 changes: 13 additions & 2 deletions platform/mellanox/mlnx-fw-upgrade.j2
@@ -55,6 +55,7 @@ declare -rA FW_FILE_MAP=( \
 IMAGE_UPGRADE="${NO_PARAM}"
 SYSLOG_LOGGER="${NO_PARAM}"
 VERBOSE_LEVEL="${VERBOSE_MIN}"
+MFT_DIAGNOSIS_FLAGS=""

 function PrintHelp() {
 echo
@@ -82,6 +83,7 @@ function ParseArguments() {
 ;;
 -v|--verbose)
 VERBOSE_LEVEL="${VERBOSE_MAX}"
+MFT_DIAGNOSIS_FLAGS="FLASH_ACCESS_DEBUG=1 FW_COMPS_DEBUG=1"
 ;;
 -s|--syslog)
 SYSLOG_LOGGER="${YES_PARAM}"
@@ -165,8 +167,16 @@ function WaitForDevice() {
 while [[ ("${QUERY_RETRY_COUNT}" -lt "${QUERY_RETRY_COUNT_MAX}") && ("$?" -ne "${EXIT_SUCCESS}") ]]; do
 sleep 1s
 ((QUERY_RETRY_COUNT++))
-${QUERY_CMD} > /dev/null
+output=$(eval ${MFT_DIAGNOSIS_FLAGS} ${QUERY_CMD}) > /dev/null
 done
+
+ERROR_CODE="$?"
+if [[ "${ERROR_CODE}" != "${EXIT_SUCCESS}" ]]; then
+# Exit failure and print the detailed information
+echo "$output"
+failure_msg="${output#*Fail : }"
+ExitFailure "FW Query command: ${QUERY_CMD} failed to wait for device with error: ${failure_msg}"
+fi
 }

 function GetAsicType() {
@@ -224,7 +234,7 @@ function RunCmd() {

 function RunFwUpdateCmd() {
 local ERROR_CODE="${EXIT_SUCCESS}"
-local COMMAND="${BURN_CMD} $@"
+local COMMAND="${MFT_DIAGNOSIS_FLAGS} ${BURN_CMD} $@"

 if [[ "${VERBOSE_LEVEL}" -eq "${VERBOSE_MAX}" ]]; then
 output=$(eval "${COMMAND}")
@@ -234,6 +244,7 @@ function RunFwUpdateCmd() {

 ERROR_CODE="$?"
 if [[ "${ERROR_CODE}" != "${EXIT_SUCCESS}" ]]; then
+echo "${output}"
 failure_msg="${output#*Fail : }"
 ExitFailure "FW Update command: ${COMMAND} failed with error: ${failure_msg}"
 fi
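
Both failure paths in this file build `failure_msg` with the expansion `${output#*Fail : }`, which removes the shortest leading match of the pattern `*Fail : ` and keeps the rest. A small sketch of that behavior follows; the sample strings are invented for illustration and are not taken from real mlxfwmanager output, and `extract_failure_msg` is a hypothetical helper name.

```shell
# Strip everything up to and including the first "Fail : " marker.
# If the marker is absent, the expansion leaves the text unchanged.
extract_failure_msg() {
    printf '%s\n' "${1#*Fail : }"
}

# Usage with invented sample text.
extract_failure_msg "-E- Fail : device busy"
extract_failure_msg "no marker here"
```

Because an unmatched pattern leaves the string untouched, output that lacks the marker would presumably appear in full in the failure message.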