Remove timeout for ipamd startup #874

Merged
mogren merged 3 commits into aws:master on Jun 24, 2020

Conversation

jaypipes
Contributor

@jaypipes jaypipes commented Mar 18, 2020

Note: Edited by @mogren

Disable the timeout when checking the aws-k8s-agent (ipamd) startup in the entrypoint.sh script.

Instead, entrypoint.sh tries to talk to ipamd once every second, indefinitely. This PR was reopened and changed because of #1028. We now rely on the liveness probe to restart the pod if ipamd never comes up.
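
A minimal sketch of the retry-forever behavior described above, assuming a bash-compatible shell. The function name `wait_for_ipamd` and the `PROBE_INTERVAL_SECONDS` variable are hypothetical, and the probe command is passed in as arguments instead of being hardcoded. This is not the actual entrypoint.sh code.

```shell
# Hypothetical helper: retry a health-check command until it succeeds.
# "$@" is the probe command, e.g. the grpc-health-probe call in the real image.
wait_for_ipamd() {
    local attempts=0
    # No timeout on purpose: the liveness probe is expected to restart the pod
    # if the agent never becomes healthy.
    until "$@" >/dev/null 2>&1; do
        attempts=$((attempts + 1))
        # PROBE_INTERVAL_SECONDS is an assumed knob; the PR describes a fixed 1 second.
        sleep "${PROBE_INTERVAL_SECONDS:-1}"
    done
    echo "ipamd ready after ${attempts} failed attempt(s)"
}
```

In the entrypoint this might be called as `wait_for_ipamd ./grpc-health-probe -addr 127.0.0.1:50051`, where the address is an assumption. The function has no failure exit path, so any overall deadline comes from the Kubernetes probe settings.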

Related: #625, #865, #872

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Adds a configurable timeout to the aws-k8s-agent (ipamd) startup in the
entrypoint.sh script. Increases the default timeout from ~30 seconds to
60 seconds.

Users can set the IPAMD_TIMEOUT_SECONDS environment variable to change
the timeout.
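
As an illustration of the original proposal, the timeout could be read from the environment roughly as follows. Only the `IPAMD_TIMEOUT_SECONDS` name and the 60-second default come from the description above. The helper name `ipamd_timeout_seconds` and the numeric validation are assumptions added for this sketch.

```shell
# Hypothetical helper: resolve the startup timeout from the environment.
ipamd_timeout_seconds() {
    # Fall back to the 60-second default when the variable is unset or empty.
    local timeout="${IPAMD_TIMEOUT_SECONDS:-60}"
    case "$timeout" in
        *[!0-9]*)
            # Assumed behavior: reject anything that is not a non-negative integer.
            echo "invalid IPAMD_TIMEOUT_SECONDS: ${timeout}" >&2
            return 1
            ;;
    esac
    echo "$timeout"
}
```

A caller could then write `timeout=$(ipamd_timeout_seconds) || exit 1` before entering its wait loop.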

Related: aws#625, aws#865 aws#872
@mogren
Contributor

mogren commented May 20, 2020

I think we should remove the readiness probe instead. Also, kube-proxy should definitely be up within 30 seconds.

@mogren mogren closed this May 20, 2020
@mogren mogren reopened this Jun 23, 2020
Contributor

@mogren mogren left a comment

A bit more complex, but at least now it's configurable...

@mogren mogren requested a review from haouc June 23, 2020 23:32
Contributor

@haouc haouc left a comment

Looks good to me. One suggestion: subtract sleep_time from time_left at the bottom of the while loop, after the ./grpc-health-probe call. If time_left <= 0, exit with 1. That way there is no need for two ./grpc-health-probe calls.
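
One possible reading of this suggestion, sketched with a single probe call in the loop condition. The `time_left` and `sleep_time` names come from the comment. The function name `wait_for_probe`, its argument order, and the use of `return` in place of `exit` are assumptions, and the reviewed diff itself is not shown here.

```shell
# Hypothetical helper: wait_for_probe TIME_LEFT SLEEP_TIME COMMAND [ARGS...]
wait_for_probe() {
    local time_left="$1"   # total seconds to keep trying
    local sleep_time="$2"  # seconds to wait between attempts
    shift 2                # remaining arguments are the probe command
    # The probe is invoked in exactly one place: the loop condition.
    while ! "$@" >/dev/null 2>&1; do
        sleep "$sleep_time"
        time_left=$((time_left - sleep_time))
        if [ "$time_left" -le 0 ]; then
            return 1       # budget exhausted; a caller would exit 1 here
        fi
    done
}
```

With this shape the final attempt and the timeout check share one code path, so the script does not need a second probe call after the loop.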

@anguslees
Contributor

anguslees commented Jun 24, 2020

Why would this need to be configurable, rather than just picking a better (bigger?) hardcoded value?

I think the code just aborts with an error when it hits this timeout, which is not useful. I think we could just try forever (no timeout) in entrypoint.sh, and leave any overall timeout up to the k8s livenessProbe settings.

@mogren
Contributor

mogren commented Jun 24, 2020

@anguslees I agree, I just reopened this old PR because of #1028. I'll change it to just keep retrying until the liveness probe kills it.

Since we have a liveness probe restarting the pod, we can rely on that to kill it.
Contributor

@anguslees anguslees left a comment

lgtm

@mogren mogren changed the title add configurable timeout for ipamd startup Remove timeout for ipamd startup Jun 24, 2020
@mogren mogren merged commit ad7df34 into aws:master Jun 24, 2020
bnapolitan added a commit to bnapolitan/amazon-vpc-cni-k8s that referenced this pull request Jul 1, 2020
commit d938e5e
Author: Jayanth Varavani <1111446+jayanthvn@users.noreply.github.com>
Date:   Wed Jul 1 01:19:14 2020 +0000

    Json o/p for logs from entrypoint.sh

commit 2d20308
Author: Nathan Prabhu <natprabh@amazon.com>
Date:   Mon Jun 29 18:06:22 2020 -0500

    bugfix: make metrics-helper docker logging statement multi-arch compatible

commit bf9ded3
Author: Claes Mogren <claes.mogren@gmail.com>
Date:   Sat Jun 27 14:51:35 2020 -0700

    Use install command instead of cp

commit e3b7dbb
Author: Gyuho Lee <leegyuho@amazon.com>
Date:   Mon Jun 29 09:40:02 2020 -0700

    scripts/lib: bump up tester to v1.4.0

    Signed-off-by: Gyuho Lee <leegyuho@amazon.com>

commit c369480
Author: Claes Mogren <claes.mogren@gmail.com>
Date:   Sun Jun 28 12:19:27 2020 -0700

    Some refresh cleanups

commit 8c266e9
Author: Claes Mogren <claes.mogren@gmail.com>
Date:   Sun Jun 28 18:37:46 2020 -0700

    Run staticcheck and clean up

commit 8dfc5b1
Author: Jayanth Varavani <1111446+jayanthvn@users.noreply.github.com>
Date:   Sun Jun 28 17:39:20 2020 -0700

    Fix integration test script for code pipeline (aws#1062)

    Co-authored-by: Claes Mogren <mogren@amazon.com>

commit 52306be
Author: Murcherla <nithu0115@gmail.com>
Date:   Wed Jun 24 23:37:24 2020 -0500

    minor nits, fast follow up to PR 903

commit 4ddd248
Author: Claes Mogren <mogren@amazon.com>
Date:   Sun Jun 14 23:20:22 2020 -0700

    Add bandwidth plugin

commit 6d35fda
Author: Robert Sheehy <gameboy1092@gmail.com>
Date:   Fri May 22 21:11:12 2020 -0500

    Chain interface to other CNI plugins

commit 30f98bd
Author: Penugonda <saiteja313@gmail.com>
Date:   Thu Jun 25 15:14:00 2020 -0400

    removed custom networking default vars, introspection var

commit aa8b818
Author: Penugonda <saiteja313@gmail.com>
Date:   Wed Jun 24 19:11:38 2020 -0400

    updated manifest configs with default env vars

commit a073d66
Author: Nithish Murcherla <nithu0115@gmail.com>
Date:   Wed Jun 24 16:51:38 2020 -0500

    refresh subnet/CIDR information every 30 seconds and update ip rules to map pods (aws#903)

    Co-authored-by: Claes Mogren <mogren@amazon.com>

commit a0da387
Author: Claes Mogren <mogren@amazon.com>
Date:   Wed Jun 24 12:30:45 2020 -0700

    Default to random-fully (aws#1048)

commit 9fea153
Author: Claes Mogren <mogren@amazon.com>
Date:   Sun Jun 14 22:37:10 2020 -0700

    Update probe settings

    * Reduce readiness probe startup delay
    * Increase liveness polling period
    * Reduce shutdown grace period to 10 seconds

commit ad7df34
Author: Jay Pipes <jaypipes@gmail.com>
Date:   Wed Jun 24 02:06:23 2020 -0400

    Remove timeout for ipamd startup (aws#874)

    * add configurable timeout for ipamd startup

    Adds a configurable timeout to the aws-k8s-agent (ipamd) startup in the
    entrypoint.sh script. Increases the default timeout from ~30 seconds to
    60 seconds.

    Users can set the IPAMD_TIMEOUT_SECONDS environment variable to change
    the timeout.

    Related: aws#625, aws#865 aws#872

    * This is a local gRPC call, so just try every 1 second indefinitely

    Since we have a liveness probe restarting the probe, we can rely on that to kill the pod.

    Co-authored-by: Claes Mogren <mogren@amazon.com>

commit 1af40d2
Author: Jayanth Varavani <1111446+jayanthvn@users.noreply.github.com>
Date:   Fri Jun 19 10:14:44 2020 -0700

    Changelog and config file changes for v1.6.3

commit 14d5135
Author: Ari Becker <ari-becker@users.noreply.github.com>
Date:   Wed Jun 17 09:39:21 2020 +0300

    Generated the different configurations

commit 00395cb
Author: Ari Becker <ari-becker@users.noreply.github.com>
Date:   Tue Jun 16 14:33:55 2020 +0300

    Fix discovery RBAC issues in Kubernetes 1.17

commit 7e224af
Author: Gyuho Lee <leegyuho@amazon.com>
Date:   Mon Jun 15 16:04:44 2020 -0700

    scripts/lib/aws: bump up tester to v1.3.9

    Includes improvements to log fetcher + MNG deletion when metrics server
    is installed.

    Signed-off-by: Gyuho Lee <leegyuho@amazon.com>

commit 36286ba
Author: Claes Mogren <mogren@amazon.com>
Date:   Mon Jun 15 07:56:59 2020 -0700

    Remove Printf and format test (aws#1027)

commit af54066
Author: Gyuho Lee <leegyuho@amazon.com>
Date:   Sat Jun 13 01:31:08 2020 -0700

    scripts/lib/aws: tester v1.3.6, enable color outputs (aws#1025)

    Includes various bug fixes + color output if $TERM is supported.
    Fallback to plain text output automatic.

    ref.
    https://github.com/aws/aws-k8s-tester/blob/master/CHANGELOG/CHANGELOG-1.3.md#v136-2020-06-12

    Signed-off-by: Gyuho Lee <leegyuho@amazon.com>

commit 6d52e1b
Author: jayanthvn <1111446+jayanthvn@users.noreply.github.com>
Date:   Fri Jun 12 16:26:33 2020 -0700

    added warning if delete on termination is set to false for the primar… (aws#1024)

    * Added a warning message if delete on termination is set to false for the primary ENI