add some retries for "kind create cluster" (#1523)
Co-authored-by: Yecheng Fu <cofyc.jackson@gmail.com>
sre-bot and cofyc committed Jan 10, 2020
1 parent a042ef9 commit c0582ba
Showing 2 changed files with 27 additions and 1 deletion.
7 changes: 6 additions & 1 deletion hack/e2e.sh
@@ -311,7 +311,12 @@ EOF
echo "error: no image for $KUBE_VERSION, exit"
exit 1
fi
-$KIND_BIN create cluster --config $KUBECONFIG --name $CLUSTER --image $image --config $tmpfile -v 4
+# Retry on error. Sometimes, kind will fail with the following error:
+#
+# OCI runtime create failed: container_linux.go:346: starting container process caused "process_linux.go:319: getting the final child's pid from pipe caused \"EOF\"": unknown
+#
+# TODO: this error is probably related to Docker or the Linux kernel; find the root cause.
+hack::wait_for_success 120 5 "$KIND_BIN create cluster --config $KUBECONFIG --name $CLUSTER --image $image --config $tmpfile -v 4"
# Allow pods to be scheduled on the control-plane node so that fewer resources are required.
# This is disabled because when hostNetwork is used, pd requires 2379/2380
# which may conflict with etcd on control-plane.
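Review note: the retried command is passed as a single string and re-parsed by `eval`, so any value containing whitespace (for example a `$tmpfile` path with a space) would be split into separate words. A minimal sketch of an argv-based alternative follows; `retry_argv` is a hypothetical name and is not part of this commit.

```shell
#!/usr/bin/env bash
# Hypothetical variant, NOT part of this commit: same retry loop as
# hack::wait_for_success, but the command is passed as separate
# arguments instead of a single string that is re-parsed by eval.
#
# Usage: retry_argv <wait_time> <sleep_time> <cmd> [args...]
function retry_argv() {
    local wait_time="$1"
    local sleep_time="$2"
    shift 2
    while [ "$wait_time" -gt 0 ]; do
        # "$@" expands to the command and its arguments exactly as given,
        # so arguments containing spaces stay intact.
        if "$@"; then
            return 0
        fi
        sleep "$sleep_time"
        wait_time=$((wait_time - sleep_time))
    done
    return 1
}

# Example: the argument with a space is passed through as one word.
retry_argv 2 1 printf '%s\n' "a path/with space"
```

With this shape, the call site would pass `"$KIND_BIN" create cluster ...` as individual quoted words rather than one quoted string.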
21 changes: 21 additions & 0 deletions hack/lib.sh
@@ -125,3 +125,24 @@ function hack::ensure_kind() {
function hack::version_ge() {
    [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

+# Usage:
+#
+#   hack::wait_for_success 120 5 "cmd arg1 arg2 ... argn"
+#
+# Returns 0 if the command succeeds before the wait time runs out, 1 otherwise.
+# From https://github.com/kubernetes/kubernetes/blob/v1.17.0/hack/lib/util.sh#L70
+function hack::wait_for_success() {
+    local wait_time="$1"
+    local sleep_time="$2"
+    local cmd="$3"
+    while [ "$wait_time" -gt 0 ]; do
+        if eval "$cmd"; then
+            return 0
+        else
+            sleep "$sleep_time"
+            wait_time=$((wait_time-sleep_time))
+        fi
+    done
+    return 1
+}
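
Review note: to illustrate how the helper behaves, here is a small self-contained sketch. The function body is copied from the diff; the `flaky` function and the `attempts` counter are hypothetical stand-ins for a command such as `kind create cluster` that fails intermittently.

```shell
#!/usr/bin/env bash
# Helper as added to hack/lib.sh in this commit.
function hack::wait_for_success() {
    local wait_time="$1"
    local sleep_time="$2"
    local cmd="$3"
    while [ "$wait_time" -gt 0 ]; do
        if eval "$cmd"; then
            return 0
        else
            sleep "$sleep_time"
            wait_time=$((wait_time-sleep_time))
        fi
    done
    return 1
}

# Hypothetical flaky command: fails on the first two calls, then succeeds.
# eval runs it in the current shell, so the counter update is visible here.
attempts=0
function flaky() {
    attempts=$((attempts + 1))
    [ "$attempts" -ge 3 ]
}

if hack::wait_for_success 5 1 "flaky"; then
    echo "succeeded after $attempts attempts"
else
    echo "gave up after $attempts attempts"
fi
```

Note that `wait_time` is reduced only by `sleep_time`, so it acts as an attempt budget rather than a wall-clock deadline: `120 5` allows up to 24 attempts, and the total elapsed time also includes however long each attempt itself takes.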
