Add support to convert the VFAT file system on Arista switches to EXT4 #201

Merged
merged 1 commit into from
Jan 24, 2017
7 changes: 7 additions & 0 deletions build_debian.sh
@@ -113,6 +113,13 @@ sudo dpkg --root=$FILESYSTEM_ROOT -i target/debs/linux-image-3.16.0-4-amd64_*.de
## Update initramfs for booting with squashfs+aufs
cat files/initramfs-tools/modules | sudo tee -a $FILESYSTEM_ROOT/etc/initramfs-tools/modules > /dev/null

## Hook into initramfs: change fs type from vfat to ext4 on arista switches
sudo mkdir -p $FILESYSTEM_ROOT/etc/initramfs-tools/scripts/init-premount/
sudo cp files/initramfs-tools/arista-convertfs $FILESYSTEM_ROOT/etc/initramfs-tools/scripts/init-premount/arista-convertfs
sudo chmod +x $FILESYSTEM_ROOT/etc/initramfs-tools/scripts/init-premount/arista-convertfs
sudo cp files/initramfs-tools/mke2fs $FILESYSTEM_ROOT/etc/initramfs-tools/hooks/mke2fs
sudo chmod +x $FILESYSTEM_ROOT/etc/initramfs-tools/hooks/mke2fs

## Hook into initramfs: after partition mount and loop file mount
## 1. Prepare layered file system
## 2. Bind-mount docker working directory (docker aufs cannot work over aufs rootfs)
170 changes: 170 additions & 0 deletions files/initramfs-tools/arista-convertfs
@@ -0,0 +1,170 @@
#!/bin/sh

case $1 in
    prereqs)
        exit 0
        ;;
esac

set -e
# set -x
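# All sizes are in KiB, the default unit of free
total_mem=$(free | awk '/^Mem:/{print $2}')
# Reserve 17/20 (85%) of RAM for the backup tmpfs; require 18/20 (90%) to be free
tmpfs_size=$(( $total_mem / 20 * 17 ))
free_mem_thres=$(( $total_mem / 20 * 18 ))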
Collaborator

Are these parameters good for all Arista switches? What is the rule behind the numbers?

Contributor Author

@byu343 Jan 23, 2017

The script needs a memory-backed tmpfs to back up and restore the files on the flash during the conversion. The tmpfs should be as large as possible so that it can hold every file on the flash, but its size is limited by the free memory available.

In the script, tmpfs_size is the size of the tmpfs to create, and we expect the free memory to be at least free_mem_thres, which is tmpfs_size plus some headroom.

In our testing, the script with the current settings always manages to create the tmpfs and never fails the free-memory check (line 123). (The script runs at an early stage of booting, so nearly all of the memory should be free.) One special case: when tmpfs_size is smaller than the total size of the files on the flash (line 136), there is not enough space to back up all files, so booting continues without any change to the file system or partition table. (This will never happen on boxes with 2G flash and 4G memory.)
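
As a rough illustration of the sizing rule described above, the sketch below repeats the script's integer arithmetic for a hypothetical box with 4 GiB of RAM. The value 4194304 KiB is an assumption for the example (real `free` output reports somewhat less), and the helper name `calc_sizes` is ours, not part of the PR.

```shell
#!/bin/sh
# Sketch only: mirrors the arithmetic in arista-convertfs.
# calc_sizes is a hypothetical helper; total_mem is in KiB, as reported by free.
calc_sizes() {
    total_mem=$1
    tmpfs_size=$(( total_mem / 20 * 17 ))      # about 85% of RAM for the backup tmpfs
    free_mem_thres=$(( total_mem / 20 * 18 ))  # about 90% of RAM must be free
}

calc_sizes 4194304   # assumed: 4 GiB expressed in KiB
echo "tmpfs_size=${tmpfs_size}k free_mem_thres=${free_mem_thres}k"
```

With these assumed numbers the headroom between the two values is total_mem / 20, roughly 5% of RAM (about 205 MiB here). Because the division happens before the multiplication, both results are rounded down slightly.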

tmp_mnt='/mnt/ramdisk-convfs'
root_mnt='/mnt/root-convfs'
root_dev=''
flash_dev=''
block_flash=''
aboot_flag=''
backup_file=''

# Get the fullpath of flash device, e.g., /dev/sda
get_flash_dev() {
    for dev in $(ls /sys/block); do
        local is_mmc=$(echo "$dev" | grep 'mmcblk.*boot.*' | cat)
        if [ -n "$is_mmc" ]; then
            continue
        fi
        local devid=$(realpath "/sys/block/$dev/device")
        local is_device=$(echo "$devid" | grep '^/sys/devices/' | cat)
        local is_flash=$(echo "$devid" | grep "$block_flash" | cat)
        if [ -n "$is_device" -a -n "$is_flash" ]; then
            flash_dev="/dev/$dev"
            return 0
        fi
    done
    return 1
}

# Wait for root_dev to be ready
wait_for_root_dev() {
    local try_rounds=30
    while [ $try_rounds -gt 0 ]; do
        if [ -e "$root_dev" ]; then
            return 0
        fi
        sleep 1
        try_rounds=$(( $try_rounds - 1 ))
    done
    return 1
}

# Always run cleanup before exit
cleanup() {
    if grep -q "$root_mnt" /proc/mounts; then
        umount "$root_mnt"
    fi
    if grep -q "$tmp_mnt" /proc/mounts; then
        umount "$tmp_mnt"
    fi
    [ -e "$root_mnt" ] && rmdir "$root_mnt"
    [ -e "$tmp_mnt" ] && rmdir "$tmp_mnt"
}
trap cleanup EXIT

notification() {
    cat << EOF
A failure happened while modifying the root file system, which stopped the upgrade. Manual intervention is needed to fix the issue. Note that:
1) Files in the old root file system may have been lost and the old partition table may have been corrupted;
2) The files in the old root file system were copied to $tmp_mnt;
3) The old partition table was dumped to the file $tmp_mnt/$backup_file by sfdisk;
4) Quitting the current shell will lose all files mentioned above permanently.
EOF
}

run_cmd() {
    if ! eval "$1"; then
        echo "$2"
        notification
        sh
        exit 1
    fi
}

# Extract kernel parameters
set -- $(cat /proc/cmdline)
Collaborator

@qiluo-msft Jan 21, 2017

What does 'set --' mean? #Closed

Contributor Author

@byu343 Jan 23, 2017

set assigns the whitespace-separated words of the kernel command line to the positional parameters ($1, $2, ...), so that the name=value pairs can be picked out one by one.
The -- stops option parsing, for safety, so a word that starts with - is not treated as an option.
You can see the difference in the examples below.
grep --help | grep -q
grep --help | grep -- -q #Closed
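
To make the effect of `set --` concrete, here is a minimal sketch that parses a made-up command line the same way the script does. The sample string and its values (`sample-version`, `sample-path`) are assumptions for illustration, not taken from a real switch.

```shell
#!/bin/sh
# Sketch only: the command line below is a made-up sample.
cmdline='-q Aboot=sample-version block_flash=sample-path root=/dev/sda1'

# Without "--", the leading "-q" word would be parsed as an option to set.
# Word splitting of $cmdline is intentional here.
set -- $cmdline
nwords=$#

aboot_flag=''
block_flash=''
for x in "$@"; do
    case "$x" in
        block_flash=*) block_flash="${x#block_flash=}" ;;
        Aboot=*)       aboot_flag="${x#Aboot=}" ;;
    esac
done
echo "words=$nwords aboot_flag=$aboot_flag block_flash=$block_flash"
```

The `${x#prefix}` expansion strips the shortest matching prefix, which leaves just the value after the `=` sign.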

for x in "$@"; do
    case "$x" in
        block_flash=*)
            block_flash="${x#block_flash=}"
            ;;
        Aboot=*)
            aboot_flag="${x#Aboot=}"
            ;;
    esac
done
root_dev="$ROOT"

# Check that the Aboot flag is set and that root_dev is vfat
[ -z "$aboot_flag" ] && exit 0
if [ -z "$root_dev" ]; then
    echo "Error: root device name is not provided"
    exit 1
fi
if ! wait_for_root_dev; then
    echo "Error: timeout in waiting for $root_dev"
    exit 1
fi
blkid | grep "$root_dev.*vfat" -q || exit 0
Collaborator

@qiluo-msft Jan 21, 2017

Could you also filter on the specific partition label, or on something else that indicates an Arista-prepared partition? I think the 'grep' is too loose here. #Closed

Contributor Author

@byu343 Jan 23, 2017

This is actually done in our script.
The check of the Aboot=... kernel command-line parameter ensures that this is an Arista switch; we use this parameter to differentiate Arista boxes from other platforms.
Here we obtain root=... from Aboot and store it as root_dev in this script, which is the root flash device. #Closed
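
Separately from the Aboot check, if a tighter match than the `grep` were ever wanted, one possible approach is sketched below. It is not what the PR does: `is_vfat_dev` is a hypothetical helper, and the sample text is made up in the usual blkid output format.

```shell
#!/bin/sh
# Sketch only: is_vfat_dev is a hypothetical helper, not part of the PR.
# It reads blkid-style lines on stdin and succeeds only when the line for
# exactly the given device reports TYPE="vfat".
is_vfat_dev() {
    dev=$1
    while IFS= read -r line; do
        case "$line" in
            "$dev:"*' TYPE="vfat"'*) return 0 ;;
        esac
    done
    return 1
}

# Made-up sample in the usual blkid output format.
sample='/dev/sda1: UUID="0000-0000" TYPE="vfat"
/dev/sdb1: LABEL="vfat-old" TYPE="ext4"'

for d in /dev/sda1 /dev/sdb1; do
    if printf '%s\n' "$sample" | grep -q "$d.*vfat"; then loose=yes; else loose=no; fi
    if printf '%s\n' "$sample" | is_vfat_dev "$d"; then strict=yes; else strict=no; fi
    echo "$d loose=$loose strict=$strict"
done
```

In this sample the loose pattern also matches /dev/sdb1, because its label happens to contain "vfat", while the stricter match accepts only /dev/sda1.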



# Get flash dev name
if [ -z "$block_flash" ]; then
    echo "Error: flash device info is not provided"
    exit 1
fi
if ! get_flash_dev; then
    echo "Error: flash device is not found"
    exit 1
fi

# Check memory size for tmpfs
free_mem=$(free | awk '/^Mem:/{print $4}')
if [ "$free_mem" -lt "$free_mem_thres" ]; then
    echo "Error: memory is not enough"
    exit 1
fi

# Backup partition table
mkdir -p "$root_mnt"
mount "$root_dev" "$root_mnt"
backup_file=backup.$(date +%Y-%m-%d.%H-%M-%S)
sfdisk -d "$flash_dev" > "$root_mnt/$backup_file"

# Check total size of files in root
total_file_size=$(du -s "$root_mnt" | awk '{print $1}')
if [ "$total_file_size" -gt "$tmpfs_size" ]; then
    echo "Error: total file size is too large"
    exit 1
fi

# Create tmpfs, and copy files to tmpfs
mkdir -p "$tmp_mnt"
mount -t tmpfs -o size="${tmpfs_size}k" tmpfs "$tmp_mnt"
cp -a "$root_mnt/." "$tmp_mnt/"
umount "$root_mnt"

#### The lines below modify the root file system, so any failure drops to a shell for manual intervention.

# Create a new partition table (content in flash_dev will be deleted)
err_msg="Error: repartitioning $flash_dev failed"
cmd="echo ';' | sfdisk $flash_dev"
run_cmd "$cmd" "$err_msg"

sleep 5
err_msg="Error: timeout in waiting for $root_dev after repartition"
cmd="wait_for_root_dev"
run_cmd "$cmd" "$err_msg"

err_msg="Error: formatting to ext4 failed"
cmd="mke2fs -t ext4 -m2 -F -O '^huge_file' $root_dev"
run_cmd "$cmd" "$err_msg"

err_msg="Error: mounting $root_dev to $root_mnt failed"
cmd="mount -t ext4 $root_dev $root_mnt"
run_cmd "$cmd" "$err_msg"

err_msg="Error: copying files from $tmp_mnt to $root_mnt failed"
cmd="cp -a $tmp_mnt/. $root_mnt/"
run_cmd "$cmd" "$err_msg"

50 changes: 50 additions & 0 deletions files/initramfs-tools/mke2fs
@@ -0,0 +1,50 @@
#!/bin/sh
# Part of this code is adapted from initramfs-tools/hooks/fsck; initramfs-tools is licensed under GPL v2.

PREREQ=""

prereqs()
{
    echo "$PREREQ"
}

case $1 in
    prereqs)
        prereqs
        exit 0
        ;;
esac

. /usr/share/initramfs-tools/hook-functions

copy_exec /sbin/mke2fs
copy_exec /sbin/sfdisk
copy_exec /sbin/fdisk

fstypes="ext4"

for type in $fstypes; do
    prog="/sbin/mkfs.${type}"
    if [ -h "$prog" ]; then
        link=$(readlink -f "$prog")
        copy_exec "$link"
        ln -s "$link" "${DESTDIR}/$prog"
    elif [ -x "$prog" ] ; then
        copy_exec "$prog"
    else
        echo "Warning: /sbin/mkfs.${type} doesn't exist, can't install to initramfs, ignoring."
    fi
done

for type in $fstypes; do
    prog="/sbin/fsck.${type}"
    if [ -h "$prog" ]; then
        link=$(readlink -f "$prog")
        copy_exec "$link"
        ln -s "$link" "${DESTDIR}/$prog"
    elif [ -x "$prog" ] ; then
        copy_exec "$prog"
    else
        echo "Warning: /sbin/fsck.${type} doesn't exist, can't install to initramfs, ignoring."
    fi
done