From ebf005e38e566ab7096bf6819d9fbceb1c21da6b Mon Sep 17 00:00:00 2001 From: Roger Torrentsgeneros Date: Thu, 15 Apr 2021 19:20:27 +0200 Subject: [PATCH 01/17] feat: ability to emit Kubernetes events --- README.md | 30 ++-- cmd/node-termination-handler.go | 29 ++- .../aws-node-termination-handler/README.md | 7 +- .../templates/clusterrole.yaml | 13 ++ .../templates/daemonset.linux.yaml | 4 + .../templates/daemonset.windows.yaml | 4 + .../templates/deployment.yaml | 4 + .../templates/psp.yaml | 2 +- .../aws-node-termination-handler/values.yaml | 7 + go.mod | 1 + go.sum | 117 ++++++++++++ pkg/config/config.go | 36 ++-- pkg/observability/k8s-events.go | 166 ++++++++++++++++++ test/README.md | 8 +- test/e2e/rebalance-recommendation-drain-test | 3 - test/e2e/spot-interruption-test-events-on | 164 +++++++++++++++++ 16 files changed, 555 insertions(+), 40 deletions(-) create mode 100644 pkg/observability/k8s-events.go create mode 100755 test/e2e/spot-interruption-test-events-on diff --git a/README.md b/README.md index ab0899c3..4cc2b374 100644 --- a/README.md +++ b/README.md @@ -31,11 +31,11 @@ ## Project Summary -This project ensures that the Kubernetes control plane responds appropriately to events that can cause your EC2 instance to become unavailable, such as [EC2 maintenance events](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/monitoring-instances-status-check_sched.html), [EC2 Spot interruptions](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-interruptions.html), [ASG Scale-In](https://docs.aws.amazon.com/autoscaling/ec2/userguide/AutoScalingGroupLifecycle.html#as-lifecycle-scale-in), [ASG AZ Rebalance](https://docs.aws.amazon.com/autoscaling/ec2/userguide/auto-scaling-benefits.html#AutoScalingBehavior.InstanceUsage), and EC2 Instance Termination via the API or Console. If not handled, your application code may not stop gracefully, take longer to recover full availability, or accidentally schedule work to nodes that are going down. +This project ensures that the Kubernetes control plane responds appropriately to events that can cause your EC2 instance to become unavailable, such as [EC2 maintenance events](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/monitoring-instances-status-check_sched.html), [EC2 Spot interruptions](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-interruptions.html), [ASG Scale-In](https://docs.aws.amazon.com/autoscaling/ec2/userguide/AutoScalingGroupLifecycle.html#as-lifecycle-scale-in), [ASG AZ Rebalance](https://docs.aws.amazon.com/autoscaling/ec2/userguide/auto-scaling-benefits.html#AutoScalingBehavior.InstanceUsage), and EC2 Instance Termination via the API or Console. If not handled, your application code may not stop gracefully, take longer to recover full availability, or accidentally schedule work to nodes that are going down. -The aws-node-termination-handler (NTH) can operate in two different modes: Instance Metadata Service (IMDS) or the Queue Processor. +The aws-node-termination-handler (NTH) can operate in two different modes: Instance Metadata Service (IMDS) or the Queue Processor. -The aws-node-termination-handler **[Instance Metadata Service](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html) Monitor** will run a small pod on each host to perform monitoring of IMDS paths like `/spot` or `/events` and react accordingly to drain and/or cordon the corresponding node. 
+The aws-node-termination-handler **[Instance Metadata Service](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html) Monitor** will run a small pod on each host to perform monitoring of IMDS paths like `/spot` or `/events` and react accordingly to drain and/or cordon the corresponding node. The aws-node-termination-handler **Queue Processor** will monitor an SQS queue of events from Amazon EventBridge for ASG lifecycle events, EC2 status change events, and Spot Interruption Termination Notice events. When NTH detects an instance is going down, we use the Kubernetes API to cordon the node to ensure no new work is scheduled there, then drain it, removing any existing work. The termination handler **Queue Processor** requires AWS IAM permissions to monitor and manage the SQS queue and to query the EC2 API. The queue processor mode is currently in a beta preview, but we'd love your feedback on it! @@ -52,7 +52,7 @@ You can run the termination handler on any Kubernetes cluster running on AWS, in - Unit & Integration Tests ### Queue Processor -- Monitors an SQS Queue for: +- Monitors an SQS Queue for: - EC2 Spot Interruption Notifications - EC2 Instance Rebalance Recommendation - EC2 Auto-Scaling Group Termination Lifecycle Hooks to take care of ASG Scale-In, AZ-Rebalance, Unhealthy Instances, and more! @@ -82,10 +82,10 @@ IMDS Processor Mode allows for a fine-grained configuration of IMDS paths that a - `enableSpotInterruptionDraining` - `enableRebalanceMonitoring` - `enableScheduledEventDraining` - + The `enableSqsTerminationDraining` must be set to false for these configuration values to be considered. -The Queue Processor Mode does not allow for fine-grained configuration of which events are handled through helm configuration keys. Instead, you can modify your Amazon EventBridge rules to not send certain types of events to the SQS Queue so that NTH does not process those events. +The Queue Processor Mode does not allow for fine-grained configuration of which events are handled through helm configuration keys. Instead, you can modify your Amazon EventBridge rules to not send certain types of events to the SQS Queue so that NTH does not process those events. The `enableSqsTerminationDraining` flag turns on Queue Processor Mode. When Queue Processor Mode is enabled, IMDS mode cannot be active. NTH cannot respond to queue events AND monitor IMDS paths. Queue Processor Mode still queries for node information on startup, but this information is not required for normal operation, so it is safe to disable IMDS for the NTH pod. @@ -212,12 +212,12 @@ $ aws autoscaling create-or-update-tags \ The value of the key does not matter. -This functionality is helpful in accounts where there are ASGs that do not run kubernetes nodes or you do not want aws-node-termination-handler to manage their termination lifecycle. +This functionality is helpful in accounts where there are ASGs that do not run kubernetes nodes or you do not want aws-node-termination-handler to manage their termination lifecycle. However, if your account is dedicated to ASGs for your kubernetes cluster, then you can turn off the ASG tag check by setting the flag `--check-asg-tag-before-draining=false` or environment variable `CHECK_ASG_TAG_BEFORE_DRAINING=false`. -You can also control what resources NTH manages by adding the resource ARNs to your Amazon EventBridge rules. +You can also control what resources NTH manages by adding the resource ARNs to your Amazon EventBridge rules. 
-Take a look at the docs on how to create rules that only manage certain ASGs here: https://docs.aws.amazon.com/autoscaling/ec2/userguide/cloud-watch-events.html +Take a look at the docs on how to create rules that only manage certain ASGs here: https://docs.aws.amazon.com/autoscaling/ec2/userguide/cloud-watch-events.html See all the different events docs here: https://docs.aws.amazon.com/eventbridge/latest/userguide/event-types.html#auto-scaling-event-types @@ -233,7 +233,7 @@ $ QUEUE_POLICY=$(cat < /tmp/queue-attributes.json { "MessageRetentionPeriod": "300", @@ -258,7 +258,7 @@ $ cat << EOF > /tmp/queue-attributes.json } EOF -$ aws sqs create-queue --queue-name "${SQS_QUEUE_NAME}" --attributes file:///tmp/queue-attributes.json +$ aws sqs create-queue --queue-name "${SQS_QUEUE_NAME}" --attributes file:///tmp/queue-attributes.json ``` #### 4. Create an Amazon EventBridge Rule @@ -389,7 +389,7 @@ For a full list of releases and associated artifacts see our [releases page](htt Use with Kiam
-## Use with Kiam +## Use with Kiam If you are using IMDS mode which defaults to `hostNetworking: true`, or if you are using queue-processor mode, then this section does not apply. The configuration below only needs to be used if you are explicitly changing NTH IMDS mode to `hostNetworking: false` . @@ -428,7 +428,7 @@ For build instructions please consult [BUILD.md](./BUILD.md). ## Communication * If you've run into a bug or have a new feature request, please open an [issue](https://github.com/aws/aws-node-termination-handler/issues/new). * You can also chat with us in the [Kubernetes Slack](https://kubernetes.slack.com) in the `#provider-aws` channel -* Check out the open source [Amazon EC2 Spot Instances Integrations Roadmap](https://github.com/aws/ec2-spot-instances-integrations-roadmap) to see what we're working on and give us feedback! +* Check out the open source [Amazon EC2 Spot Instances Integrations Roadmap](https://github.com/aws/ec2-spot-instances-integrations-roadmap) to see what we're working on and give us feedback! ## Contributing Contributions are welcome! Please read our [guidelines](https://github.com/aws/aws-node-termination-handler/blob/main/CONTRIBUTING.md) and our [Code of Conduct](https://github.com/aws/aws-node-termination-handler/blob/main/CODE_OF_CONDUCT.md) diff --git a/cmd/node-termination-handler.go b/cmd/node-termination-handler.go index 02c247d5..7368b28b 100644 --- a/cmd/node-termination-handler.go +++ b/cmd/node-termination-handler.go @@ -99,6 +99,12 @@ func main() { log.Fatal().Err(err).Msg("Unable to instantiate probes service,") } + recorder, err := observability.InitK8sEventRecorder(nthConfig.EmitKubernetesEvents, nthConfig.KubernetesEventsAnnotations, nthConfig.NodeName) + if err != nil { + nthConfig.Print() + log.Fatal().Err(err).Msg("Unable to create Kubernetes event recorder,") + } + imds := ec2metadata.New(nthConfig.MetadataURL, nthConfig.MetadataTries) interruptionEventStore := interruptioneventstore.New(nthConfig) @@ -179,6 +185,7 @@ func main() { if err != nil { log.Warn().Str("event_type", monitor.Kind()).Err(err).Msg("There was a problem monitoring for events") metrics.ErrorEventsInc(monitor.Kind()) + recorder.Emit(observability.Warning, observability.MonitorErrReason, observability.MonitorErrMsgFmt, monitor.Kind()) if previousErr != nil && err.Error() == previousErr.Error() { duplicateErrCount++ } else { @@ -198,7 +205,7 @@ func main() { log.Info().Msg("Started watching for interruption events") log.Info().Msg("Kubernetes AWS Node Termination Handler has started successfully!") - go watchForCancellationEvents(cancelChan, interruptionEventStore, node, metrics) + go watchForCancellationEvents(cancelChan, interruptionEventStore, node, metrics, recorder) log.Info().Msg("Started watching for event cancellations") var wg sync.WaitGroup @@ -214,7 +221,8 @@ func main() { case interruptionEventStore.Workers <- 1: event.InProgress = true wg.Add(1) - go drainOrCordonIfNecessary(interruptionEventStore, event, *node, nthConfig, nodeMetadata, metrics, &wg) + recorder.Emit(observability.Normal, observability.GetReasonForKind(event.Kind), event.Description) + go drainOrCordonIfNecessary(interruptionEventStore, event, *node, nthConfig, nodeMetadata, metrics, recorder, &wg) default: log.Warn().Msg("all workers busy, waiting") break @@ -254,7 +262,7 @@ func watchForInterruptionEvents(interruptionChan <-chan monitor.InterruptionEven } } -func watchForCancellationEvents(cancelChan <-chan monitor.InterruptionEvent, interruptionEventStore 
*interruptioneventstore.Store, node *node.Node, metrics observability.Metrics) { +func watchForCancellationEvents(cancelChan <-chan monitor.InterruptionEvent, interruptionEventStore *interruptioneventstore.Store, node *node.Node, metrics observability.Metrics, recorder observability.K8sEventRecorder) { for { interruptionEvent := <-cancelChan nodeName := interruptionEvent.NodeName @@ -264,8 +272,10 @@ func watchForCancellationEvents(cancelChan <-chan monitor.InterruptionEvent, int err := node.Uncordon(nodeName) if err != nil { log.Err(err).Msg("Uncordoning the node failed") + recorder.Emit(observability.Warning, observability.UncordonErrReason, observability.UncordonErrMsgFmt, err.Error()) } metrics.NodeActionsInc("uncordon", nodeName, err) + recorder.Emit(observability.Normal, observability.UncordonReason, observability.UncordonMsg) node.RemoveNTHLabels(nodeName) node.RemoveNTHTaints(nodeName) @@ -275,7 +285,7 @@ func watchForCancellationEvents(cancelChan <-chan monitor.InterruptionEvent, int } } -func drainOrCordonIfNecessary(interruptionEventStore *interruptioneventstore.Store, drainEvent *monitor.InterruptionEvent, node node.Node, nthConfig config.Config, nodeMetadata ec2metadata.NodeMetadata, metrics observability.Metrics, wg *sync.WaitGroup) { +func drainOrCordonIfNecessary(interruptionEventStore *interruptioneventstore.Store, drainEvent *monitor.InterruptionEvent, node node.Node, nthConfig config.Config, nodeMetadata ec2metadata.NodeMetadata, metrics observability.Metrics, recorder observability.K8sEventRecorder, wg *sync.WaitGroup) { defer wg.Done() nodeName := drainEvent.NodeName nodeLabels, err := node.GetNodeLabels(nodeName) @@ -287,6 +297,9 @@ func drainOrCordonIfNecessary(interruptionEventStore *interruptioneventstore.Sto err := drainEvent.PreDrainTask(*drainEvent, node) if err != nil { log.Err(err).Msg("There was a problem executing the pre-drain task") + recorder.Emit(observability.Warning, observability.PreDrainErrReason, observability.PreDrainErrMsgFmt, err.Error()) + } else { + recorder.Emit(observability.Normal, observability.PreDrainReason, observability.PreDrainMsg) } metrics.NodeActionsInc("pre-drain", nodeName, err) } @@ -298,6 +311,7 @@ func drainOrCordonIfNecessary(interruptionEventStore *interruptioneventstore.Sto log.Err(err).Msgf("node '%s' not found in the cluster", nodeName) } else { log.Err(err).Msg("There was a problem while trying to cordon the node") + recorder.Emit(observability.Warning, observability.CordonErrReason, observability.CordonErrMsgFmt, err.Error()) os.Exit(1) } } else { @@ -312,6 +326,7 @@ func drainOrCordonIfNecessary(interruptionEventStore *interruptioneventstore.Sto log.Err(err).Msg("There was a problem while trying to log all pod names on the node") } metrics.NodeActionsInc("cordon", nodeName, err) + recorder.Emit(observability.Normal, observability.CordonReason, observability.CordonMsg) } } else { err := node.CordonAndDrain(nodeName) @@ -320,11 +335,14 @@ func drainOrCordonIfNecessary(interruptionEventStore *interruptioneventstore.Sto log.Err(err).Msgf("node '%s' not found in the cluster", nodeName) } else { log.Err(err).Msg("There was a problem while trying to cordon and drain the node") + metrics.NodeActionsInc("cordon-and-drain", nodeName, err) + recorder.Emit(observability.Warning, observability.CordonAndDrainErrReason, observability.CordonAndDrainErrMsgFmt, err.Error()) os.Exit(1) } } else { log.Info().Str("node_name", nodeName).Msg("Node successfully cordoned and drained") metrics.NodeActionsInc("cordon-and-drain", nodeName, err) + 
recorder.Emit(observability.Normal, observability.CordonAndDrainReason, observability.CordonAndDrainMsg)
 		}
 	}
@@ -336,6 +354,9 @@ func drainOrCordonIfNecessary(interruptionEventStore *interruptioneventstore.Sto
 		err := drainEvent.PostDrainTask(*drainEvent, node)
 		if err != nil {
 			log.Err(err).Msg("There was a problem executing the post-drain task")
+			recorder.Emit(observability.Warning, observability.PostDrainErrReason, observability.PostDrainErrMsgFmt, err.Error())
+		} else {
+			recorder.Emit(observability.Normal, observability.PostDrainReason, observability.PostDrainMsg)
 		}
 		metrics.NodeActionsInc("post-drain", nodeName, err)
 	}
diff --git a/config/helm/aws-node-termination-handler/README.md b/config/helm/aws-node-termination-handler/README.md
index 945b4f4b..fd4fc6bd 100644
--- a/config/helm/aws-node-termination-handler/README.md
+++ b/config/helm/aws-node-termination-handler/README.md
@@ -74,14 +74,16 @@ Parameter | Description | Default
 `logLevel` | Sets the log level (INFO, DEBUG, or ERROR) | `INFO`
 `enablePrometheusServer` | If true, start an http server exposing `/metrics` endpoint for prometheus. | `false`
 `prometheusServerPort` | Replaces the default HTTP port for exposing prometheus metrics. | `9092`
-`enableProbesServer` |If true, start an http server exposing `/healthz` endpoint for probes. | `false`
+`enableProbesServer` | If true, start an http server exposing `/healthz` endpoint for probes. | `false`
 `probesServerPort` | Replaces the default HTTP port for exposing probes endpoint. | `8080`
 `probesServerEndpoint` | Replaces the default endpoint for exposing probes endpoint. | `/healthz`
 `podMonitor.create` | if `true`, create a PodMonitor | `false`
 `podMonitor.interval` | Prometheus scrape interval | `30s`
 `podMonitor.sampleLimit` | Number of scraped samples accepted | `5000`
 `podMonitor.labels` | Additional PodMonitor metadata labels | `{}`
-`podMonitor.namespace` | override podMonitor Helm release namespace | `{{ .Release.Namespace }}`
+`podMonitor.namespace` | override podMonitor Helm release namespace | `{{ .Release.Namespace }}`
+`emitKubernetesEvents` | If true, Kubernetes events will be emitted when interruption events are received and when actions are taken on Kubernetes nodes | `false`
+`kubernetesEventsAnnotations` | A comma-separated list of key=value annotations to attach to all emitted Kubernetes events. Example: `first=annotation,sample.annotation/number=two` | None
 
 ### AWS Node Termination Handler - Queue-Processor Mode Configuration
 
@@ -153,6 +155,7 @@ Parameter | Description | Default
 `dryRun` | If true, only log if a node would be drained | `false`
 
 ## Metrics endpoint consideration
+
 NTH in IMDS mode runs as a DaemonSet w/ `host_networking=true` by default. If the prometheus server is enabled, nothing else will be able to bind to the configured port (by default `:9092`) in the root network namespace. Therefore, it will need to have a firewall/security group configured on the nodes to block access to the `/metrics` endpoint.
 
 You can switch NTH in IMDS mode to run w/ `host_networking=false`, but you will need to make sure that IMDSv1 is enabled or IMDSv2 IP hop count will need to be incremented to 2.
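The two chart parameters added above surface in the pod spec as the `EMIT_KUBERNETES_EVENTS` and `KUBERNETES_EVENTS_ANNOTATIONS` environment variables wired into the templates that follow. As a minimal illustration of turning the feature on (the annotation keys and values here are made up, mirroring the example in the table), a values override might look like:

```yaml
# Illustrative values override only; annotations are arbitrary key=value pairs.
emitKubernetesEvents: true
kubernetesEventsAnnotations: "first=annotation,sample.annotation/number=two"
```

The IMDSv2 hop-limit caveat above is covered in the instance metadata guide linked next: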
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-service.html diff --git a/config/helm/aws-node-termination-handler/templates/clusterrole.yaml b/config/helm/aws-node-termination-handler/templates/clusterrole.yaml index eaa19713..42f8c6cc 100644 --- a/config/helm/aws-node-termination-handler/templates/clusterrole.yaml +++ b/config/helm/aws-node-termination-handler/templates/clusterrole.yaml @@ -37,3 +37,16 @@ rules: - daemonsets verbs: - get +{{- if .Values.emitKubernetesEvents }} +- apiGroups: + - "" + resources: + - events + verbs: + - create + - get + - list + - patch + - update + - watch +{{- end }} diff --git a/config/helm/aws-node-termination-handler/templates/daemonset.linux.yaml b/config/helm/aws-node-termination-handler/templates/daemonset.linux.yaml index 8b0a9034..27a536dd 100644 --- a/config/helm/aws-node-termination-handler/templates/daemonset.linux.yaml +++ b/config/helm/aws-node-termination-handler/templates/daemonset.linux.yaml @@ -174,6 +174,10 @@ spec: value: {{ .Values.probesServerPort | quote }} - name: PROBES_SERVER_ENDPOINT value: {{ .Values.probesServerEndpoint | quote }} + - name: EMIT_KUBERNETES_EVENTS + value: {{ .Values.emitKubernetesEvents | quote }} + - name: KUBERNETES_EVENTS_ANNOTATIONS + value: {{ .Values.kubernetesEventsAnnotations | quote }} resources: {{- toYaml .Values.resources | nindent 12 }} {{- if .Values.enablePrometheusServer }} diff --git a/config/helm/aws-node-termination-handler/templates/daemonset.windows.yaml b/config/helm/aws-node-termination-handler/templates/daemonset.windows.yaml index 00ae3411..98df588e 100644 --- a/config/helm/aws-node-termination-handler/templates/daemonset.windows.yaml +++ b/config/helm/aws-node-termination-handler/templates/daemonset.windows.yaml @@ -148,6 +148,10 @@ spec: value: {{ .Values.probesServerPort | quote }} - name: PROBES_SERVER_ENDPOINT value: {{ .Values.probesServerEndpoint | quote }} + - name: EMIT_KUBERNETES_EVENTS + value: {{ .Values.emitKubernetesEvents | quote }} + - name: KUBERNETES_EVENTS_ANNOTATIONS + value: {{ .Values.kubernetesEventsAnnotations | quote }} resources: {{- toYaml .Values.resources | nindent 12 }} {{- if .Values.enablePrometheusServer }} diff --git a/config/helm/aws-node-termination-handler/templates/deployment.yaml b/config/helm/aws-node-termination-handler/templates/deployment.yaml index a734c6fa..bc681872 100644 --- a/config/helm/aws-node-termination-handler/templates/deployment.yaml +++ b/config/helm/aws-node-termination-handler/templates/deployment.yaml @@ -150,6 +150,10 @@ spec: value: {{ .Values.managedAsgTag | quote }} - name: WORKERS value: {{ .Values.workers | quote }} + - name: EMIT_KUBERNETES_EVENTS + value: {{ .Values.emitKubernetesEvents | quote }} + - name: KUBERNETES_EVENTS_ANNOTATIONS + value: {{ .Values.kubernetesEventsAnnotations | quote }} resources: {{- toYaml .Values.resources | nindent 12 }} {{- if .Values.enablePrometheusServer }} diff --git a/config/helm/aws-node-termination-handler/templates/psp.yaml b/config/helm/aws-node-termination-handler/templates/psp.yaml index 34fb6f24..8254951d 100644 --- a/config/helm/aws-node-termination-handler/templates/psp.yaml +++ b/config/helm/aws-node-termination-handler/templates/psp.yaml @@ -10,7 +10,7 @@ metadata: spec: privileged: false hostIPC: false - hostNetwork: {{ .Values.useHostNetwork }} + hostNetwork: {{ .Values.useHostNetwork }} hostPID: false {{- if and .Values.rbac.pspEnabled .Values.enablePrometheusServer }} hostPorts: diff --git 
a/config/helm/aws-node-termination-handler/values.yaml b/config/helm/aws-node-termination-handler/values.yaml index a5d5755a..1a655d9d 100644 --- a/config/helm/aws-node-termination-handler/values.yaml +++ b/config/helm/aws-node-termination-handler/values.yaml @@ -159,6 +159,13 @@ enableProbesServer: false probesServerPort: 8080 probesServerEndpoint: "/healthz" +# emitKubernetesEvents If true, Kubernetes events will be emitted when interruption events are received and when actions are taken on Kubernetes nodes +emitKubernetesEvents: false + +# kubernetesEventsAnnotations A comma-separated list of key=value annotations to attach to all emitted Kubernetes events +# Example: "first=annotation,sample.annotation/number=two" +kubernetesEventsAnnotations: "" + tolerations: - operator: "Exists" diff --git a/go.mod b/go.mod index e20ab753..d007fe02 100644 --- a/go.mod +++ b/go.mod @@ -4,6 +4,7 @@ go 1.15 require ( github.com/aws/aws-sdk-go v1.33.1 + github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da // indirect github.com/rs/zerolog v1.18.0 go.opentelemetry.io/contrib/instrumentation/runtime v0.6.1 go.opentelemetry.io/otel v0.6.0 diff --git a/go.sum b/go.sum index 37c1e09e..7b5615f0 100644 --- a/go.sum +++ b/go.sum @@ -1,19 +1,31 @@ cloud.google.com/go v0.26.0/go.mod h1:aQUYkXzVsufM+DwF1aE+0xfcU+56JwCaLick0ClmMTw= cloud.google.com/go v0.34.0/go.mod h1:aQUYkXzVsufM+DwF1aE+0xfcU+56JwCaLick0ClmMTw= +cloud.google.com/go v0.38.0 h1:ROfEUZz+Gh5pa62DJWXSaonyu3StP6EA6lPEXPI6mCo= cloud.google.com/go v0.38.0/go.mod h1:990N+gfupTy94rShfmMCWGDn0LpTmnzTp2qbd1dvSRU= +github.com/Azure/go-ansiterm v0.0.0-20170929234023-d6e3b3328b78 h1:w+iIsaOQNcT7OZ575w+acHgRric5iCyQh+xv+KJ4HB8= github.com/Azure/go-ansiterm v0.0.0-20170929234023-d6e3b3328b78/go.mod h1:LmzpDX56iTiv29bbRTIsUNlaFfuhWRQBWjQdVyAevI8= +github.com/Azure/go-autorest/autorest v0.9.0 h1:MRvx8gncNaXJqOoLmhNjUAKh33JJF8LyxPhomEtOsjs= github.com/Azure/go-autorest/autorest v0.9.0/go.mod h1:xyHB1BMZT0cuDHU7I0+g046+BFDTQ8rEZB0s4Yfa6bI= +github.com/Azure/go-autorest/autorest/adal v0.5.0 h1:q2gDruN08/guU9vAjuPWff0+QIrpH6ediguzdAzXAUU= github.com/Azure/go-autorest/autorest/adal v0.5.0/go.mod h1:8Z9fGy2MpX0PvDjB1pEgQTmVqjGhiHBW7RJJEciWzS0= +github.com/Azure/go-autorest/autorest/date v0.1.0 h1:YGrhWfrgtFs84+h0o46rJrlmsZtyZRg470CqAXTZaGM= github.com/Azure/go-autorest/autorest/date v0.1.0/go.mod h1:plvfp3oPSKwf2DNjlBjWF/7vwR+cUD/ELuzDCXwHUVA= github.com/Azure/go-autorest/autorest/mocks v0.1.0/go.mod h1:OTyCOPRA2IgIlWxVYxBee2F5Gr4kF2zd2J5cFRaIDN0= +github.com/Azure/go-autorest/autorest/mocks v0.2.0 h1:Ww5g4zThfD/6cLb4z6xxgeyDa7QDkizMkJKe0ysZXp0= github.com/Azure/go-autorest/autorest/mocks v0.2.0/go.mod h1:OTyCOPRA2IgIlWxVYxBee2F5Gr4kF2zd2J5cFRaIDN0= +github.com/Azure/go-autorest/logger v0.1.0 h1:ruG4BSDXONFRrZZJ2GUXDiUyVpayPmb1GnWeHDdaNKY= github.com/Azure/go-autorest/logger v0.1.0/go.mod h1:oExouG+K6PryycPJfVSxi/koC6LSNgds39diKLz7Vrc= +github.com/Azure/go-autorest/tracing v0.5.0 h1:TRn4WjSnkcSy5AEG3pnbtFSwNtwzjr4VYyQflFE619k= github.com/Azure/go-autorest/tracing v0.5.0/go.mod h1:r/s2XiOKccPW3HrqB+W0TQzfbtp2fGCgRFtBroKn4Dk= +github.com/BurntSushi/toml v0.3.1 h1:WXkYYl6Yr3qBf1K79EBnL4mak0OimBfB0XUf9Vl28OQ= github.com/BurntSushi/toml v0.3.1/go.mod h1:xHWCNGjB5oqiDr8zfno3MHue2Ht5sIBksp03qcyfWMU= +github.com/BurntSushi/xgb v0.0.0-20160522181843-27f122750802 h1:1BDTz0u9nC3//pOCMdNH+CiXJVYJh5UQNCOBG7jbELc= github.com/BurntSushi/xgb v0.0.0-20160522181843-27f122750802/go.mod h1:IVnqGOEym/WlBOVXweHU+Q+/VP0lqqI8lqeDx9IjBqo= github.com/DataDog/sketches-go 
v0.0.0-20190923095040-43f19ad77ff7 h1:qELHH0AWCvf98Yf+CNIJx9vOZOfHFDDzgDRYsnNk/vs= github.com/DataDog/sketches-go v0.0.0-20190923095040-43f19ad77ff7/go.mod h1:Q5DbzQ+3AkgGwymQO7aZFNP7ns2lZKGtvRBzRXfdi60= +github.com/MakeNowJust/heredoc v0.0.0-20170808103936-bb23615498cd h1:sjQovDkwrZp8u+gxLtPgKGjk5hCxuy2hrRejBTA9xFU= github.com/MakeNowJust/heredoc v0.0.0-20170808103936-bb23615498cd/go.mod h1:64YHyfSL2R96J44Nlwm39UHepQbyR5q10x7iYa1ks2E= +github.com/NYTimes/gziphandler v0.0.0-20170623195520-56545f4a5d46 h1:lsxEuwrXEAokXB9qhlbKWPpo3KMLZQ5WB5WLQRW1uq0= github.com/NYTimes/gziphandler v0.0.0-20170623195520-56545f4a5d46/go.mod h1:3wb06e3pkSAbeQ52E9H9iFoQsEEwGN64994WTCIhntQ= github.com/PuerkitoBio/purell v1.0.0/go.mod h1:c11w/QuzBsJSee3cPx9rAFu61PvFxuPbtSwDGJws/X0= github.com/PuerkitoBio/purell v1.1.1 h1:WEQqlqaGbrPkxLJWfBwQmfEAE1Z7ONdDLqrN38tNFfI= @@ -22,9 +34,12 @@ github.com/PuerkitoBio/urlesc v0.0.0-20160726150825-5bd2802263f2/go.mod h1:uGdko github.com/PuerkitoBio/urlesc v0.0.0-20170810143723-de5bf2ad4578 h1:d+Bc7a5rLufV/sSk/8dngufqelfh6jnri85riMAaF/M= github.com/PuerkitoBio/urlesc v0.0.0-20170810143723-de5bf2ad4578/go.mod h1:uGdkoq3SwY9Y+13GIhn11/XLaGBb4BfwItxLd5jeuXE= github.com/alecthomas/template v0.0.0-20160405071501-a0175ee3bccc/go.mod h1:LOuyumcjzFXgccqObfd/Ljyb9UuFJ6TxHnclSeseNhc= +github.com/alecthomas/template v0.0.0-20190718012654-fb15b899a751 h1:JYp7IbQjafoB+tBA3gMyHYHrpOtNuDiK/uB5uXxq5wM= github.com/alecthomas/template v0.0.0-20190718012654-fb15b899a751/go.mod h1:LOuyumcjzFXgccqObfd/Ljyb9UuFJ6TxHnclSeseNhc= github.com/alecthomas/units v0.0.0-20151022065526-2efee857e7cf/go.mod h1:ybxpYRFXyAe+OPACYpWeL0wqObRcbAqCMya13uyzqw0= +github.com/alecthomas/units v0.0.0-20190717042225-c3de453c63f4 h1:Hs82Z41s6SdL1CELW+XaDYmOH4hkBN4/N9og/AsOv7E= github.com/alecthomas/units v0.0.0-20190717042225-c3de453c63f4/go.mod h1:ybxpYRFXyAe+OPACYpWeL0wqObRcbAqCMya13uyzqw0= +github.com/armon/consul-api v0.0.0-20180202201655-eb2c6b5be1b6 h1:G1bPvciwNyF7IUmKXNt9Ak3m6u9DE1rF+RmtIkBpVdA= github.com/armon/consul-api v0.0.0-20180202201655-eb2c6b5be1b6/go.mod h1:grANhF5doyWs3UAsr3K4I6qtAmlQcZDesFNEHPZAzj8= github.com/aws/aws-sdk-go v1.33.1 h1:yz9XmNzPshz/lhfAZvLfMnIS9HPo8+boGRcWqDVX+T0= github.com/aws/aws-sdk-go v1.33.1/go.mod h1:5zCpMtNQVjRREroY7sYe8lOMRSxkhG6MZveU8YkpAk0= @@ -35,43 +50,67 @@ github.com/beorn7/perks v1.0.0 h1:HWo1m869IqiPhD389kmkxeTalrjNbbJTC8LXupb+sl0= github.com/beorn7/perks v1.0.0/go.mod h1:KWe93zE9D1o94FZ5RNwFwVgaQK1VOXiVxmqh+CedLV8= github.com/beorn7/perks v1.0.1 h1:VlbKKnNfV8bJzeqoa4cOKqO6bYr3WgKZxO8Z16+hsOM= github.com/beorn7/perks v1.0.1/go.mod h1:G2ZrVWU2WbWT9wwq4/hrbKbnv/1ERSJQ0ibhJ6rlkpw= +github.com/blang/semver v3.5.0+incompatible h1:CGxCgetQ64DKk7rdZ++Vfnb1+ogGNnB17OJKJXD2Cfs= github.com/blang/semver v3.5.0+incompatible/go.mod h1:kRBLl5iJ+tD4TcOOxsy/0fnwebNt5EWlYSAyrTnjyyk= +github.com/census-instrumentation/opencensus-proto v0.2.1 h1:glEXhBS5PSLLv4IXzLA5yPRVX4bilULVyxxbrfOtDAk= github.com/census-instrumentation/opencensus-proto v0.2.1/go.mod h1:f6KPmirojxKA12rnyqOA5BBL4O983OfeGPqjHWSTneU= github.com/cespare/xxhash/v2 v2.1.1 h1:6MnRN8NT7+YBpUIWxHtefFZOKTAPgGjpQSxqLNn0+qY= github.com/cespare/xxhash/v2 v2.1.1/go.mod h1:VGX0DQ3Q6kWi7AoAeZDth3/j3BFtOZR5XLFGgcrjCOs= +github.com/chai2010/gettext-go v0.0.0-20160711120539-c6fed771bfd5 h1:7aWHqerlJ41y6FOsEUvknqgXnGmJyJSbjhAWq5pO4F8= github.com/chai2010/gettext-go v0.0.0-20160711120539-c6fed771bfd5/go.mod h1:/iP1qXHoty45bqomnu2LM+VVyAEdWN+vtSHGlQgyxbw= +github.com/client9/misspell v0.3.4 h1:ta993UF76GwbvJcIo3Y68y/M3WxlpEHPWIGDkJYwzJI= 
github.com/client9/misspell v0.3.4/go.mod h1:qj6jICC3Q7zFZvVWo7KLAzC3yx5G7kyvSDkc90ppPyw= +github.com/coreos/etcd v3.3.10+incompatible h1:jFneRYjIvLMLhDLCzuTuU4rSJUjRplcJQ7pD7MnhC04= github.com/coreos/etcd v3.3.10+incompatible/go.mod h1:uF7uidLiAD3TWHmW31ZFd/JWoc32PjwdhPthX9715RE= +github.com/coreos/go-etcd v2.0.0+incompatible h1:bXhRBIXoTm9BYHS3gE0TtQuyNZyeEMux2sDi4oo5YOo= github.com/coreos/go-etcd v2.0.0+incompatible/go.mod h1:Jez6KQU2B/sWsbdaef3ED8NzMklzPG4d5KIOhIy30Tk= +github.com/coreos/go-semver v0.2.0 h1:3Jm3tLmsgAYcjC+4Up7hJrFBPr+n7rAqYeSw/SZazuY= github.com/coreos/go-semver v0.2.0/go.mod h1:nnelYz7RCh+5ahJtPPxZlU+153eP4D4r3EedlOD2RNk= +github.com/coreos/go-systemd v0.0.0-20190321100706-95778dfbb74e h1:Wf6HqHfScWJN9/ZjdUKyjop4mf3Qdd+1TvvltAvM3m8= github.com/coreos/go-systemd v0.0.0-20190321100706-95778dfbb74e/go.mod h1:F5haX7vjVVG0kc13fIWeqUViNPyEJxv/OmvnBo0Yme4= +github.com/cpuguy83/go-md2man v1.0.10 h1:BSKMNlYxDvnunlTymqtgONjNnaRV1sTpcovwwjF22jk= github.com/cpuguy83/go-md2man v1.0.10/go.mod h1:SmD6nW6nTyfqj6ABTjUi3V3JVMnlJmwcJI5acqYI6dE= github.com/davecgh/go-spew v0.0.0-20151105211317-5215b55f46b2/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c= github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= +github.com/daviddengcn/go-colortext v0.0.0-20160507010035-511bcaf42ccd h1:uVsMphB1eRx7xB1njzL3fuMdWRN8HtVzoUOItHMwv5c= github.com/daviddengcn/go-colortext v0.0.0-20160507010035-511bcaf42ccd/go.mod h1:dv4zxwHi5C/8AeI+4gX4dCWOIvNi7I6JCSX0HvlKPgE= +github.com/dgrijalva/jwt-go v3.2.0+incompatible h1:7qlOGliEKZXTDg6OTjfoBKDXWrumCAMpl/TFQ4/5kLM= github.com/dgrijalva/jwt-go v3.2.0+incompatible/go.mod h1:E3ru+11k8xSBh+hMPgOLZmtrrCbhqsmaPHjLKYnJCaQ= +github.com/docker/distribution v2.7.1+incompatible h1:a5mlkVzth6W5A4fOsS3D2EO5BUmsJpcB+cRlLU7cSug= github.com/docker/distribution v2.7.1+incompatible/go.mod h1:J2gT2udsDAN96Uj4KfcMRqY0/ypR+oyYUYmja8H+y+w= +github.com/docker/docker v0.7.3-0.20190327010347-be7ac8be2ae0 h1:w3NnFcKR5241cfmQU5ZZAsf0xcpId6mWOupTvJlUX2U= github.com/docker/docker v0.7.3-0.20190327010347-be7ac8be2ae0/go.mod h1:eEKB0N0r5NX/I1kEveEz05bcu8tLC/8azJZsviup8Sk= +github.com/docker/spdystream v0.0.0-20160310174837-449fdfce4d96 h1:cenwrSVm+Z7QLSV/BsnenAOcDXdX4cMv4wP0B/5QbPg= github.com/docker/spdystream v0.0.0-20160310174837-449fdfce4d96/go.mod h1:Qh8CwZgvJUkLughtfhJv5dyTYa91l1fOUCrgjqmcifM= +github.com/elazarl/goproxy v0.0.0-20170405201442-c4fc26588b6e h1:p1yVGRW3nmb85p1Sh1ZJSDm4A4iKLS5QNbvUHMgGu/M= github.com/elazarl/goproxy v0.0.0-20170405201442-c4fc26588b6e/go.mod h1:/Zj4wYkgs4iZTTu3o/KG3Itv/qCCa8VVMlb3i9OVuzc= github.com/emicklei/go-restful v0.0.0-20170410110728-ff4f55a20633/go.mod h1:otzb+WCGbkyDHkqmQmT5YD2WR4BBwUdeQoFo8l/7tVs= +github.com/emicklei/go-restful v2.9.5+incompatible h1:spTtZBk5DYEvbxMVutUuTyh1Ao2r4iyvLdACqsl/Ljk= github.com/emicklei/go-restful v2.9.5+incompatible/go.mod h1:otzb+WCGbkyDHkqmQmT5YD2WR4BBwUdeQoFo8l/7tVs= +github.com/envoyproxy/go-control-plane v0.9.1-0.20191026205805-5f8ba28d4473 h1:4cmBvAEBNJaGARUEs3/suWRyfyBfhf7I60WBZq+bv2w= github.com/envoyproxy/go-control-plane v0.9.1-0.20191026205805-5f8ba28d4473/go.mod h1:YTl/9mNaCwkRvm6d1a2C3ymFceY/DCBVvsKhRF0iEA4= +github.com/envoyproxy/protoc-gen-validate v0.1.0 h1:EQciDnbrYxy13PgWoY8AqoxGiPrpgBZ1R8UNe3ddc+A= github.com/envoyproxy/protoc-gen-validate v0.1.0/go.mod 
h1:iSmxcyjqTsJpI2R4NaDN7+kN2VEUnK/pcBlmesArF7c= github.com/evanphx/json-patch v4.2.0+incompatible h1:fUDGZCv/7iAN7u0puUVhvKCcsR6vRfwrJatElLBEf0I= github.com/evanphx/json-patch v4.2.0+incompatible/go.mod h1:50XU6AFN0ol/bzJsmQLiYLvXMP4fmwYFNcr97nuDLSk= +github.com/exponent-io/jsonpath v0.0.0-20151013193312-d6023ce2651d h1:105gxyaGwCFad8crR9dcMQWvV9Hvulu6hwUh4tWPJnM= github.com/exponent-io/jsonpath v0.0.0-20151013193312-d6023ce2651d/go.mod h1:ZZMPRZwes7CROmyNKgQzC3XPs6L/G2EJLHddWejkmf4= +github.com/fatih/camelcase v1.0.0 h1:hxNvNX/xYBp0ovncs8WyWZrOrpBNub/JfaMvbURyft8= github.com/fatih/camelcase v1.0.0/go.mod h1:yN2Sb0lFhZJUdVvtELVWefmrXpuZESvPmqwoZc+/fpc= github.com/fsnotify/fsnotify v1.4.7 h1:IXs+QLmnXW2CcXuY+8Mzv/fWEsPGWxqefPtCP5CnV9I= github.com/fsnotify/fsnotify v1.4.7/go.mod h1:jwhsz4b93w/PPRr/qN1Yymfu8t87LnFCMoQvtojpjFo= github.com/ghodss/yaml v0.0.0-20150909031657-73d445a93680/go.mod h1:4dBDuWmgqj2HViK6kFavaiC9ZROes6MMH2rRYeMEF04= +github.com/ghodss/yaml v0.0.0-20180820084758-c7ce16629ff4 h1:bRzFpEzvausOAt4va+I/22BZ1vXDtERngp0BNYDKej0= github.com/ghodss/yaml v0.0.0-20180820084758-c7ce16629ff4/go.mod h1:4dBDuWmgqj2HViK6kFavaiC9ZROes6MMH2rRYeMEF04= github.com/go-kit/kit v0.8.0/go.mod h1:xBxKIO96dXMWWy0MnWVtmwkA9/13aqxPnvrjFYMA2as= +github.com/go-kit/kit v0.9.0 h1:wDJmvq38kDhkVxi50ni9ykkdUr1PKgqKOoi01fa0Mdk= github.com/go-kit/kit v0.9.0/go.mod h1:xBxKIO96dXMWWy0MnWVtmwkA9/13aqxPnvrjFYMA2as= github.com/go-logfmt/logfmt v0.3.0/go.mod h1:Qt1PoO58o5twSAckw1HlFXLmHsOX5/0LbT9GBnD5lWE= +github.com/go-logfmt/logfmt v0.4.0 h1:MP4Eh7ZCb31lleYCFuwm0oe4/YGak+5l1vA2NOE80nA= github.com/go-logfmt/logfmt v0.4.0/go.mod h1:3RMwSq7FuexP4Kalkev3ejPJsZTpXXBr9+V4qmtdjCk= +github.com/go-logr/logr v0.1.0 h1:M1Tv3VzNlEHg6uyACnRdtrploV2P7wZqH8BoQMtz0cg= github.com/go-logr/logr v0.1.0/go.mod h1:ixOQHD9gLJUVQQ2ZOR7zLEifBX6tGkNJF4QyIY7sIas= github.com/go-openapi/jsonpointer v0.0.0-20160704185906-46af16f9f7b1/go.mod h1:+35s3my2LFTysnkMfxsJBAMHj/DoqoB9knIWoYG/Vk0= github.com/go-openapi/jsonpointer v0.19.2 h1:A9+F4Dc/MCNB5jibxf6rRvOvR/iFgQdyNx9eIhnGqq0= @@ -80,18 +119,26 @@ github.com/go-openapi/jsonreference v0.0.0-20160704190145-13c6e3589ad9/go.mod h1 github.com/go-openapi/jsonreference v0.19.2 h1:o20suLFB4Ri0tuzpWtyHlh7E7HnkqTNLq6aR6WVNS1w= github.com/go-openapi/jsonreference v0.19.2/go.mod h1:jMjeRr2HHw6nAVajTXJ4eiUwohSTlpa0o73RUL1owJc= github.com/go-openapi/spec v0.0.0-20160808142527-6aced65f8501/go.mod h1:J8+jY1nAiCcj+friV/PDoE1/3eeccG9LYBs0tYvLOWc= +github.com/go-openapi/spec v0.19.2 h1:SStNd1jRcYtfKCN7R0laGNs80WYYvn5CbBjM2sOmCrE= github.com/go-openapi/spec v0.19.2/go.mod h1:sCxk3jxKgioEJikev4fgkNmwS+3kuYdJtcsZsD5zxMY= github.com/go-openapi/swag v0.0.0-20160704191624-1d0bd113de87/go.mod h1:DXUve3Dpr1UfpPtxFw+EFuQ41HhCWZfha5jSVRG7C7I= github.com/go-openapi/swag v0.19.2 h1:jvO6bCMBEilGwMfHhrd61zIID4oIFdwb76V17SM88dE= github.com/go-openapi/swag v0.19.2/go.mod h1:POnQmlKehdgb5mhVOsnJFsivZCEZ/vjK9gh66Z9tfKk= +github.com/go-sql-driver/mysql v1.5.0 h1:ozyZYNQW3x3HtqT1jira07DN2PArx2v7/mN66gGcHOs= github.com/go-sql-driver/mysql v1.5.0/go.mod h1:DCzpHaOWr8IXmIStZouvnhqoel9Qv2LBy8hT2VhHyBg= +github.com/go-stack/stack v1.8.0 h1:5SgMzNM5HxrEjV0ww2lTmX6E2Izsfxas4+YHWRs3Lsk= github.com/go-stack/stack v1.8.0/go.mod h1:v0f6uXyyMGvRgIKkXu+yp6POWl0qKG85gN/melR3HDY= github.com/gogo/protobuf v1.1.1/go.mod h1:r8qH/GZQm5c6nD/R0oafs1akxWv10x8SbQlK7atdtwQ= github.com/gogo/protobuf v1.2.2-0.20190723190241-65acae22fc9d h1:3PaI8p3seN09VjbTYC/QWlUZdZ1qS1zGjy7LH2Wt07I= github.com/gogo/protobuf v1.2.2-0.20190723190241-65acae22fc9d/go.mod 
h1:SlYgWuQ5SjCEi6WLHjHCa1yvBfUnHcTbrrZtXPKa29o= +github.com/golang/glog v0.0.0-20160126235308-23def4e6c14b h1:VKtxabqXZkF25pY9ekfRL6a582T4P37/31XEstQ5p58= github.com/golang/glog v0.0.0-20160126235308-23def4e6c14b/go.mod h1:SBH7ygxi8pfUlaOkMMuAQtPIUF8ecWP5IEl/CR7VP2Q= +github.com/golang/groupcache v0.0.0-20160516000752-02826c3e7903 h1:LbsanbbD6LieFkXbj9YNNBupiGHJgFeLpO0j0Fza1h8= github.com/golang/groupcache v0.0.0-20160516000752-02826c3e7903/go.mod h1:cIg4eruTrX1D+g88fzRXU5OdNfaM+9IcxsU14FzY7Hc= +github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da h1:oI5xCqsCo564l8iNU+DwB5epxmsaqB+rhGL0m5jtYqE= +github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da/go.mod h1:cIg4eruTrX1D+g88fzRXU5OdNfaM+9IcxsU14FzY7Hc= github.com/golang/mock v1.1.1/go.mod h1:oTYuIxOrZwtPieC+H1uAHpcLFnEyAGVDL/k47Jfbm0A= +github.com/golang/mock v1.2.0 h1:28o5sBqPkBsMGnC6b4MvE2TzSr5/AT4c/1fLqVGIwlk= github.com/golang/mock v1.2.0/go.mod h1:oTYuIxOrZwtPieC+H1uAHpcLFnEyAGVDL/k47Jfbm0A= github.com/golang/protobuf v0.0.0-20161109072736-4bd1920723d7/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U= github.com/golang/protobuf v1.2.0 h1:P3YflyNX/ehuJFLhxviNdFxQPkGK5cDcApsge1SqnvM= @@ -102,9 +149,13 @@ github.com/golang/protobuf v1.3.2 h1:6nsPYzhq5kReh6QImI3k5qWzO4PEbvbIW2cwSfR/6xs github.com/golang/protobuf v1.3.2/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U= github.com/golang/protobuf v1.3.4 h1:87PNWwrRvUSnqS4dlcBU/ftvOIBep4sYuBLlh6rX2wk= github.com/golang/protobuf v1.3.4/go.mod h1:vzj43D7+SQXF/4pzW/hwtAqwc6iTitCiVSaWz5lYuqw= +github.com/golangplus/bytes v0.0.0-20160111154220-45c989fe5450 h1:7xqw01UYS+KCI25bMrPxwNYkSns2Db1ziQPpVq99FpE= github.com/golangplus/bytes v0.0.0-20160111154220-45c989fe5450/go.mod h1:Bk6SMAONeMXrxql8uvOKuAZSu8aM5RUGv+1C6IJaEho= +github.com/golangplus/fmt v0.0.0-20150411045040-2a5d6d7d2995 h1:f5gsjBiF9tRRVomCvrkGMMWI8W1f2OBFar2c5oakAP0= github.com/golangplus/fmt v0.0.0-20150411045040-2a5d6d7d2995/go.mod h1:lJgMEyOkYFkPcDKwRXegd+iM6E7matEszMG5HhwytU8= +github.com/golangplus/testing v0.0.0-20180327235837-af21d9c3145e h1:KhcknUwkWHKZPbFy2P7jH5LKJ3La+0ZeknkkmrSgqb0= github.com/golangplus/testing v0.0.0-20180327235837-af21d9c3145e/go.mod h1:0AA//k/eakGydO4jKRoRL2j92ZKSzTgj9tclaCrvXHk= +github.com/google/btree v0.0.0-20180813153112-4030bb1f1f0c h1:964Od4U6p2jUkFxvCydnIczKteheJEzHRToSGK3Bnlw= github.com/google/btree v0.0.0-20180813153112-4030bb1f1f0c/go.mod h1:lNA+9X1NB3Zf8V7Ke586lFgjr2dZNuvo3lPJSGZ5JPQ= github.com/google/go-cmp v0.2.0/go.mod h1:oXzfMopK8JAjlY9xF4vHSVASa0yLyX7SntLO5aqRK0M= github.com/google/go-cmp v0.3.0 h1:crn/baboCvb5fXaQ0IJ1SGTsTVrWpDsCWC8EGETZijY= @@ -115,25 +166,34 @@ github.com/google/go-cmp v0.4.0/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/ github.com/google/gofuzz v0.0.0-20161122191042-44d81051d367/go.mod h1:HP5RmnzzSNb993RKQDq4+1A4ia9nllfqcQFTQJedwGI= github.com/google/gofuzz v1.0.0 h1:A8PeW59pxE9IoFRqBp37U+mSNaQoZ46F1f0f863XSXw= github.com/google/gofuzz v1.0.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg= +github.com/google/martian v2.1.0+incompatible h1:/CP5g8u/VJHijgedC/Legn3BAbAaWPgecwXBIDzw5no= github.com/google/martian v2.1.0+incompatible/go.mod h1:9I4somxYTbIHy5NJKHRl3wXiIaQGbYVAs8BPL6v8lEs= +github.com/google/pprof v0.0.0-20181206194817-3ea8567a2e57 h1:eqyIo2HjKhKe/mJzTG8n4VqvLXIOEG+SLdDqX7xGtkY= github.com/google/pprof v0.0.0-20181206194817-3ea8567a2e57/go.mod h1:zfwlbNMJ+OItoe0UupaVj+oy1omPYYDuagoSzA8v9mc= github.com/google/uuid v1.1.1 h1:Gkbcsh/GbpXz7lPftLA3P6TYMwjCLYm83jiFQZF/3gY= github.com/google/uuid v1.1.1/go.mod 
h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo= +github.com/googleapis/gax-go/v2 v2.0.4 h1:hU4mGcQI4DaAYW+IbTun+2qEZVFxK0ySjQLTbS0VQKc= github.com/googleapis/gax-go/v2 v2.0.4/go.mod h1:0Wqv26UfaUD9n4G6kQubkQ+KchISgw+vpHVxEJEs9eg= github.com/googleapis/gnostic v0.0.0-20170729233727-0c5108395e2d h1:7XGaL1e6bYS1yIonGp9761ExpPPV1ui0SAC59Yube9k= github.com/googleapis/gnostic v0.0.0-20170729233727-0c5108395e2d/go.mod h1:sJBsCZ4ayReDTBIg8b9dl28c5xFWyhBTVRp3pOg5EKY= +github.com/gophercloud/gophercloud v0.1.0 h1:P/nh25+rzXouhytV2pUHBb65fnds26Ghl8/391+sT5o= github.com/gophercloud/gophercloud v0.1.0/go.mod h1:vxM41WHh5uqHVBMZHzuwNOHh8XEoIEcSTewFxm1c5g8= +github.com/gregjones/httpcache v0.0.0-20170728041850-787624de3eb7 h1:6TSoaYExHper8PYsJu23GWVNOyYRCSnIFyxKgLSZ54w= github.com/gregjones/httpcache v0.0.0-20170728041850-787624de3eb7/go.mod h1:FecbI9+v66THATjSRHfNgh1IVFe/9kFxbXtjV0ctIMA= github.com/hashicorp/golang-lru v0.5.0/go.mod h1:/m3WP610KZHVQ1SGc6re/UDhFvYD7pJ4Ao+sR/qLZy8= github.com/hashicorp/golang-lru v0.5.1 h1:0hERBMJE1eitiLkihrMvRVBYAkpHzc/J3QdDN+dAcgU= github.com/hashicorp/golang-lru v0.5.1/go.mod h1:/m3WP610KZHVQ1SGc6re/UDhFvYD7pJ4Ao+sR/qLZy8= +github.com/hashicorp/hcl v1.0.0 h1:0Anlzjpi4vEasTeNFn2mLJgTSwt0+6sfsiTG8qcWGx4= github.com/hashicorp/hcl v1.0.0/go.mod h1:E5yfLk+7swimpb2L/Alb/PJmXilQ/rhwaUYs4T20WEQ= github.com/hpcloud/tail v1.0.0 h1:nfCOvKYfkgYP8hkirhJocXT2+zOD8yUNjXaWfTlyFKI= github.com/hpcloud/tail v1.0.0/go.mod h1:ab1qPbhIpdTxEkNHXyeSf5vhxWSCs/tWer42PpOxQnU= +github.com/imdario/mergo v0.3.5 h1:JboBksRwiiAJWvIYJVo46AfV+IAIKZpfrSzVKj42R4Q= github.com/imdario/mergo v0.3.5/go.mod h1:2EnlNZ0deacrJVfApfmtdGgDfMuh/nq6Ok1EcJh5FfA= +github.com/inconshreveable/mousetrap v1.0.0 h1:Z8tu5sraLXCXIcARxBp/8cbvlwVa7Z1NHg9XEKhtSvM= github.com/inconshreveable/mousetrap v1.0.0/go.mod h1:PxqpIevigyE2G7u3NXJIT2ANytuPF1OarO4DADm73n8= github.com/jmespath/go-jmespath v0.3.0 h1:OS12ieG61fsCg5+qLJ+SsW9NicxNkg3b25OyT2yCeUc= github.com/jmespath/go-jmespath v0.3.0/go.mod h1:9QtRXoHjLGCJ5IBSaohpXITPlowMeeYCZ7fLUTSywik= +github.com/jonboulle/clockwork v0.1.0 h1:VKV+ZcuP6l3yW9doeqz6ziZGgcynBVQO+obU0+0hcPo= github.com/jonboulle/clockwork v0.1.0/go.mod h1:Ii8DK3G1RaLaWxj9trq07+26W01tbo22gdxWY5EU2bo= github.com/json-iterator/go v0.0.0-20180612202835-f2b4162afba3/go.mod h1:+SdeFBvtyEkXs7REEP0seUULqWtbJapLOCVDaaPEHmU= github.com/json-iterator/go v1.1.6 h1:MrUvLMLTMxbqFJ9kzlvat/rYZqZnW3u4wkLzWTaFwKs= @@ -141,27 +201,41 @@ github.com/json-iterator/go v1.1.6/go.mod h1:+SdeFBvtyEkXs7REEP0seUULqWtbJapLOCV github.com/json-iterator/go v1.1.7/go.mod h1:KdQUCv79m/52Kvf8AW2vK1V8akMuk1QjK/uOdHXbAo4= github.com/json-iterator/go v1.1.9 h1:9yzud/Ht36ygwatGx56VwCZtlI/2AD15T1X2sjSuGns= github.com/json-iterator/go v1.1.9/go.mod h1:KdQUCv79m/52Kvf8AW2vK1V8akMuk1QjK/uOdHXbAo4= +github.com/jstemmer/go-junit-report v0.0.0-20190106144839-af01ea7f8024 h1:rBMNdlhTLzJjJSDIjNEXX1Pz3Hmwmz91v+zycvx9PJc= github.com/jstemmer/go-junit-report v0.0.0-20190106144839-af01ea7f8024/go.mod h1:6v2b51hI/fHJwM22ozAgKL4VKDeJcHhJFhtBdhmNjmU= +github.com/julienschmidt/httprouter v1.2.0 h1:TDTW5Yz1mjftljbcKqRcrYhd4XeOoI98t+9HbQbYf7g= github.com/julienschmidt/httprouter v1.2.0/go.mod h1:SYymIcj16QtmaHHD7aYtjjsJG7VTCxuUUipMqKk8s4w= +github.com/kisielk/errcheck v1.2.0 h1:reN85Pxc5larApoH1keMBiu2GWtPqXQ1nc9gx+jOU+E= github.com/kisielk/errcheck v1.2.0/go.mod h1:/BMXB+zMLi60iA8Vv6Ksmxu/1UDYcXs4uQLJ+jE2L00= +github.com/kisielk/gotool v1.0.0 h1:AV2c/EiW3KqPNT9ZKl07ehoAGi4C5/01Cfbblndcapg= github.com/kisielk/gotool v1.0.0/go.mod 
h1:XhKaO+MFFWcvkIS/tQcRk01m1F5IRFswLeQ+oQHNcck= +github.com/konsorten/go-windows-terminal-sequences v1.0.1 h1:mweAR1A6xJ3oS2pRaGiHgQ4OO8tzTaLawm8vnODuwDk= github.com/konsorten/go-windows-terminal-sequences v1.0.1/go.mod h1:T0+1ngSBFLxvqU3pZ+m/2kptfBszLMUkC4ZK/EgS/cQ= +github.com/kr/logfmt v0.0.0-20140226030751-b84e30acd515 h1:T+h1c/A9Gawja4Y9mFVWj2vyii2bbUNDw3kt9VxK2EY= github.com/kr/logfmt v0.0.0-20140226030751-b84e30acd515/go.mod h1:+0opPa2QZZtGFBFZlji/RkVcI2GknAs/DXo4wKdlNEc= github.com/kr/pretty v0.1.0 h1:L/CwN0zerZDmRFUapSPitk6f+Q3+0za1rQkzVuMiMFI= github.com/kr/pretty v0.1.0/go.mod h1:dAy3ld7l9f0ibDNOQOHHMYYIIbhfbHSm3C4ZsoJORNo= github.com/kr/pty v1.1.1/go.mod h1:pFQYn66WHrOpPYNljwOMqo10TkYh1fy3cYio2l3bCsQ= +github.com/kr/pty v1.1.5 h1:hyz3dwM5QLc1Rfoz4FuWJQG5BN7tc6K1MndAUnGpQr4= github.com/kr/pty v1.1.5/go.mod h1:9r2w37qlBe7rQ6e1fg1S/9xpWHSnaqNdHD3WcMdbPDA= github.com/kr/text v0.1.0 h1:45sCR5RtlFHMR4UwH9sdQ5TC8v0qDQCHnXt+kaKSTVE= github.com/kr/text v0.1.0/go.mod h1:4Jbv+DJW3UT/LiOwJeYQe1efqtUx/iVham/4vfdArNI= +github.com/liggitt/tabwriter v0.0.0-20181228230101-89fcab3d43de h1:9TO3cAIGXtEhnIaL+V+BEER86oLrvS+kWobKpbJuye0= github.com/liggitt/tabwriter v0.0.0-20181228230101-89fcab3d43de/go.mod h1:zAbeS9B/r2mtpb6U+EI2rYA5OAXxsYw6wTamcNW+zcE= +github.com/lithammer/dedent v1.1.0 h1:VNzHMVCBNG1j0fh3OrsFRkVUwStdDArbgBWoPAffktY= github.com/lithammer/dedent v1.1.0/go.mod h1:jrXYCQtgg0nJiN+StA2KgR7w6CiQNv9Fd/Z9BP0jIOc= +github.com/magiconair/properties v1.8.0 h1:LLgXmsheXeRoUOBOjtwPQCWIYqM/LU1ayDtDePerRcY= github.com/magiconair/properties v1.8.0/go.mod h1:PppfXfuXeibc/6YijjN8zIbojt8czPbwD3XqdrwzmxQ= github.com/mailru/easyjson v0.0.0-20160728113105-d5b7844b561a/go.mod h1:C1wdFJiN94OJF2b5HbByQZoLdCWB1Yqtg26g4irojpc= +github.com/mailru/easyjson v0.0.0-20190614124828-94de47d64c63 h1:nTT4s92Dgz2HlrB2NaMgvlfqHH39OgMhA7z3PK7PGD4= github.com/mailru/easyjson v0.0.0-20190614124828-94de47d64c63/go.mod h1:C1wdFJiN94OJF2b5HbByQZoLdCWB1Yqtg26g4irojpc= github.com/matttproud/golang_protobuf_extensions v1.0.1 h1:4hp9jkHxhMHkqkrB3Ix0jegS5sx/RkqARlsWZ6pIwiU= github.com/matttproud/golang_protobuf_extensions v1.0.1/go.mod h1:D8He9yQNgCq6Z5Ld7szi9bcBfOoFv/3dc6xSMkL2PC0= +github.com/mitchellh/go-homedir v1.1.0 h1:lukF9ziXFxDFPkA1vsr5zpc1XuPDn/wFntq5mG+4E0Y= github.com/mitchellh/go-homedir v1.1.0/go.mod h1:SfyaCUpYCn1Vlf4IUYiD9fPX4A5wJrkLzIz1N1q0pr0= +github.com/mitchellh/go-wordwrap v1.0.0 h1:6GlHJ/LTGMrIJbwgdqdl2eEH8o+Exx/0m8ir9Gns0u4= github.com/mitchellh/go-wordwrap v1.0.0/go.mod h1:ZXFpozHsX6DPmq2I0TCekCxypsnAUbP2oI0UX1GXzOo= +github.com/mitchellh/mapstructure v1.1.2 h1:fmNYVwqnSfB9mZU6OS2O6GsXM+wcskZDuKQzvN1EDeE= github.com/mitchellh/mapstructure v1.1.2/go.mod h1:FVVH3fgwuzCH5S8UJGiWEs2h04kUh9fWfEaFds41c1Y= github.com/modern-go/concurrent v0.0.0-20180228061459-e0a39a4cb421/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q= github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd h1:TRLaZ9cD/w8PVh93nsPXa1VrQ6jlwL5oN8l14QlcNfg= @@ -170,8 +244,11 @@ github.com/modern-go/reflect2 v0.0.0-20180320133207-05fbef0ca5da/go.mod h1:bx2lN github.com/modern-go/reflect2 v0.0.0-20180701023420-4b7aa43c6742/go.mod h1:bx2lNnkwVCuqBIxFjflWJWanXIb3RllmbCylyMrvgv0= github.com/modern-go/reflect2 v1.0.1 h1:9f412s+6RmYXLWZSEzVVgPGK7C2PphHj5RJrvfx9AWI= github.com/modern-go/reflect2 v1.0.1/go.mod h1:bx2lNnkwVCuqBIxFjflWJWanXIb3RllmbCylyMrvgv0= +github.com/munnerz/goautoneg v0.0.0-20120707110453-a547fc61f48d h1:7PxY7LVfSZm7PEeBTyK1rj1gABdCO2mbri6GKO1cMDs= github.com/munnerz/goautoneg v0.0.0-20120707110453-a547fc61f48d/go.mod 
h1:+n7T8mK8HuQTcFwEeznm/DIxMOiR9yIdICNftLE1DvQ= +github.com/mwitkow/go-conntrack v0.0.0-20161129095857-cc309e4a2223 h1:F9x/1yl3T2AeKLr2AMdilSD8+f9bvMnNN8VS5iDtovc= github.com/mwitkow/go-conntrack v0.0.0-20161129095857-cc309e4a2223/go.mod h1:qRWi+5nqEBWmkhHvq77mSJWrCKwh8bxhgT7d/eI7P4U= +github.com/mxk/go-flowrate v0.0.0-20140419014527-cca7078d478f h1:y5//uYreIhSUg3J1GEMiLbxo1LJaP8RfCpH6pymGZus= github.com/mxk/go-flowrate v0.0.0-20140419014527-cca7078d478f/go.mod h1:ZdcZmHo+o7JKHSa8/e818NopupXU1YMK5fe1lsApnBw= github.com/onsi/ginkgo v0.0.0-20170829012221-11459a886d9c/go.mod h1:lLunBs/Ym6LB5Z9jYTR76FiuTmxDTDusOGeTQH+WWjE= github.com/onsi/ginkgo v1.6.0/go.mod h1:lLunBs/Ym6LB5Z9jYTR76FiuTmxDTDusOGeTQH+WWjE= @@ -180,13 +257,18 @@ github.com/onsi/ginkgo v1.10.1/go.mod h1:lLunBs/Ym6LB5Z9jYTR76FiuTmxDTDusOGeTQH+ github.com/onsi/gomega v0.0.0-20170829124025-dcabb60a477c/go.mod h1:C1qb7wdrVGGVU+Z6iS04AVkA3Q65CEZX59MT0QO5uiA= github.com/onsi/gomega v1.7.0 h1:XPnZz8VVBHjVsy1vzJmRwIcSwiUO+JFfrv/xGiigmME= github.com/onsi/gomega v1.7.0/go.mod h1:ex+gbHU/CVuBBDIJjb2X0qEXbFg53c61hWP/1CpauHY= +github.com/opencontainers/go-digest v1.0.0-rc1 h1:WzifXhOVOEOuFYOJAW6aQqW0TooG2iki3E3Ii+WN7gQ= github.com/opencontainers/go-digest v1.0.0-rc1/go.mod h1:cMLVZDEM3+U2I4VmLI6N8jQYUd2OVphdqWwCJHrFt2s= +github.com/opentracing/opentracing-go v1.1.1-0.20190913142402-a7454ce5950e h1:fI6mGTyggeIYVmGhf80XFHxTupjOexbCppgTNDkv9AA= github.com/opentracing/opentracing-go v1.1.1-0.20190913142402-a7454ce5950e/go.mod h1:UkNAQd3GIcIGf0SeVgPpRdFStlNbqXla1AfSYxPUl2o= +github.com/pelletier/go-toml v1.2.0 h1:T5zMGML61Wp+FlcbWjRDT7yAxhJNAiPPLOFECq181zc= github.com/pelletier/go-toml v1.2.0/go.mod h1:5z9KED0ma1S8pY6P1sdut58dfprrGBbd/94hg7ilaic= +github.com/peterbourgon/diskv v2.0.1+incompatible h1:UBdAOUP5p4RWqPBg048CAvpKN+vxiaj6gdUUzhl4XmI= github.com/peterbourgon/diskv v2.0.1+incompatible/go.mod h1:uqqh8zWWbv1HBMNONnaR/tNboyR3/BZd58JJSHlUSCU= github.com/pkg/errors v0.8.0/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0= github.com/pkg/errors v0.8.1 h1:iURUrRGxPUNPdy5/HRSm+Yj6okJ6UtLINN0Q9M4+h3I= github.com/pkg/errors v0.8.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0= +github.com/pkg/errors v0.9.1 h1:FEBLx1zS214owpjy7qsBeixbURkuhQAwrK5UwLGTwt4= github.com/pkg/errors v0.9.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0= github.com/pmezard/go-difflib v0.0.0-20151028094244-d8ed2627bdf0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4= github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM= @@ -213,26 +295,34 @@ github.com/prometheus/procfs v0.0.2/go.mod h1:TjEm7ze935MbeOT/UhFTIMYKhuLP4wbCsT github.com/prometheus/procfs v0.0.8/go.mod h1:7Qr8sr6344vo1JqZ6HhLceV9o3AJ1Ff+GxbHq6oeK9A= github.com/prometheus/procfs v0.0.10 h1:QJQN3jYQhkamO4mhfUWqdDH2asK7ONOI9MTWjyAxNKM= github.com/prometheus/procfs v0.0.10/go.mod h1:7Qr8sr6344vo1JqZ6HhLceV9o3AJ1Ff+GxbHq6oeK9A= +github.com/remyoudompheng/bigfft v0.0.0-20170806203942-52369c62f446 h1:/NRJ5vAYoqz+7sG51ubIDHXeWO8DlTSrToPu6q11ziA= github.com/remyoudompheng/bigfft v0.0.0-20170806203942-52369c62f446/go.mod h1:uYEyJGbgTkfkS4+E/PavXkNJcbFIpEtjt2B0KDQ5+9M= +github.com/rs/xid v1.2.1 h1:mhH9Nq+C1fY2l1XIpgxIiUOfNpRBYH1kKcr+qfKgjRc= github.com/rs/xid v1.2.1/go.mod h1:+uKXf+4Djp6Md1KODXJxgGQPKngRmWyn10oCKFzNHOQ= github.com/rs/zerolog v1.18.0 h1:CbAm3kP2Tptby1i9sYy2MGRg0uxIN9cyDb59Ys7W8z8= github.com/rs/zerolog v1.18.0/go.mod h1:9nvC1axdVrAHcu/s9taAVfBuIdTZLVQmKQyvrUjF5+I= +github.com/russross/blackfriday v1.5.2 h1:HyvC0ARfnZBqnXwABFeSZHpKvJHJJfPz81GNueLj0oo= 
github.com/russross/blackfriday v1.5.2/go.mod h1:JO/DiYxRf+HjHt06OyowR9PTA263kcR/rfWxYHBV53g= github.com/sirupsen/logrus v1.2.0/go.mod h1:LxeOpSwHxABJmUn/MG1IvRgCAasNZTLOkJPxbbu5VWo= github.com/sirupsen/logrus v1.4.2 h1:SPIRibHv4MatM3XXNO2BJeFLZwZ2LvZgfQ5+UNI2im4= github.com/sirupsen/logrus v1.4.2/go.mod h1:tLMulIdttU9McNUspp0xgXVQah82FyeX6MwdIuYE2rE= github.com/spf13/afero v1.1.2/go.mod h1:j4pytiNVoe2o6bmDsKpLACNPDBIoEAkihy7loJ1B0CQ= +github.com/spf13/afero v1.2.2 h1:5jhuqJyZCZf2JRofRvN/nIFgIWNzPa3/Vz8mYylgbWc= github.com/spf13/afero v1.2.2/go.mod h1:9ZxEEn6pIJ8Rxe320qSDBk6AsU0r9pR7Q4OcevTdifk= +github.com/spf13/cast v1.3.0 h1:oget//CVOEoFewqQxwr0Ej5yjygnqGkvggSE/gB35Q8= github.com/spf13/cast v1.3.0/go.mod h1:Qx5cxh0v+4UWYiBimWS+eyWzqEqokIECu5etghLkUJE= github.com/spf13/cobra v0.0.5 h1:f0B+LkLX6DtmRH1isoNA9VTtNUK9K8xYd28JNNfOv/s= github.com/spf13/cobra v0.0.5/go.mod h1:3K3wKZymM7VvHMDS9+Akkh4K60UwM26emMESw8tLCHU= +github.com/spf13/jwalterweatherman v1.0.0 h1:XHEdyB+EcvlqZamSM4ZOMGlc93t6AcsBEu9Gc1vn7yk= github.com/spf13/jwalterweatherman v1.0.0/go.mod h1:cQK4TGJAtQXfYWX+Ddv3mKDzgVb68N+wFjFa4jdeBTo= github.com/spf13/pflag v0.0.0-20170130214245-9ff6c6923cff/go.mod h1:DYY7MBk1bdzusC3SYhjObp+wFpr4gzcvqqNjLnInEg4= github.com/spf13/pflag v1.0.3 h1:zPAT6CGy6wXeQ7NtTnaTerfKOsV6V6F8agHXFiazDkg= github.com/spf13/pflag v1.0.3/go.mod h1:DYY7MBk1bdzusC3SYhjObp+wFpr4gzcvqqNjLnInEg4= +github.com/spf13/viper v1.3.2 h1:VUFqw5KcqRf7i70GOzW7N+Q7+gxVBkSSqiXB12+JQ4M= github.com/spf13/viper v1.3.2/go.mod h1:ZiWeW+zYFKm7srdB9IoDzzZXaJaI5eL9QjNiN/DMA2s= github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME= github.com/stretchr/objx v0.1.1/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME= +github.com/stretchr/objx v0.2.0 h1:Hbg2NidpLE8veEBkEZTL3CvlkUIVzuU9jDplZO54c48= github.com/stretchr/objx v0.2.0/go.mod h1:qt09Ya8vawLte6SNmTgCsAVtYtaKzEcn8ATUoHMkEqE= github.com/stretchr/testify v0.0.0-20151208002404-e3a8ff8ce365/go.mod h1:a8OnRcib4nhh0OaRAV+Yts87kKdq0PP7pXfy6kDkUVs= github.com/stretchr/testify v1.2.2/go.mod h1:a8OnRcib4nhh0OaRAV+Yts87kKdq0PP7pXfy6kDkUVs= @@ -242,9 +332,13 @@ github.com/stretchr/testify v1.4.0 h1:2E4SXV/wtOkTonXsotYi4li6zVWxYlZuYNCXe9XRJy github.com/stretchr/testify v1.4.0/go.mod h1:j7eGeouHqKxXV5pUuKE4zz7dFj8WfuZ+81PSLYec5m4= github.com/stretchr/testify v1.5.1 h1:nOGnQDM7FYENwehXlg/kFVnos3rEvtKTjRvOWSzb6H4= github.com/stretchr/testify v1.5.1/go.mod h1:5W2xD1RspED5o8YsWQXVCued0rvSQ+mT+I5cxcmMvtA= +github.com/ugorji/go/codec v0.0.0-20181204163529-d75b2dcb6bc8 h1:3SVOIvH7Ae1KRYyQWRjXWJEA9sS/c/pjvH++55Gr648= github.com/ugorji/go/codec v0.0.0-20181204163529-d75b2dcb6bc8/go.mod h1:VFNgLljTbGfSG7qAOspJ7OScBnGdDN/yBr0sguwnwf0= +github.com/xordataexchange/crypt v0.0.3-0.20170626215501-b2862e3d0a77 h1:ESFSdwYZvkeru3RtdrYueztKhOBCSAAzS4Gf+k0tEow= github.com/xordataexchange/crypt v0.0.3-0.20170626215501-b2862e3d0a77/go.mod h1:aYKd//L2LvnjZzWKhF00oedf4jCCReLcmhLdhm1A27Q= +github.com/zenazn/goji v0.9.0 h1:RSQQAbXGArQ0dIDEq+PI6WqN6if+5KHu6x2Cx/GXLTQ= github.com/zenazn/goji v0.9.0/go.mod h1:7S9M489iMyHBNxwZnk9/EHS098H4/F6TATF2mIxtB1Q= +go.opencensus.io v0.21.0 h1:mU6zScU4U1YAFPHEHYk+3JC4SY7JxgkqS10ZOSyksNg= go.opencensus.io v0.21.0/go.mod h1:mSImk1erAIZhrmZN+AvHh14ztQfjbGwt4TtuofqLduU= go.opentelemetry.io/contrib/instrumentation/runtime v0.6.1 h1:RqzQbnsxKLWaUaPGJm2nnkQF7dr1B98oYCrnkO9HOxM= go.opentelemetry.io/contrib/instrumentation/runtime v0.6.1/go.mod h1:LzbC4WCSSbnZPWOF4bVCE8wyazGzXw+luY9Fowzpkoo= @@ -261,12 +355,16 @@ golang.org/x/crypto 
v0.0.0-20190611184440-5c40567a22f8 h1:1wopBVtVdWnn03fZelqdXT golang.org/x/crypto v0.0.0-20190611184440-5c40567a22f8/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI= golang.org/x/exp v0.0.0-20190121172915-509febef88a4/go.mod h1:CJ0aWSM057203Lf6IL+f9T1iT9GByDxfZKAQTCR3kQA= golang.org/x/exp v0.0.0-20190125153040-c74c464bbbf2/go.mod h1:CJ0aWSM057203Lf6IL+f9T1iT9GByDxfZKAQTCR3kQA= +golang.org/x/exp v0.0.0-20190312203227-4b39c73a6495 h1:I6A9Ag9FpEKOjcKrRNjQkPHawoXIhKyTGfvvjFAiiAk= golang.org/x/exp v0.0.0-20190312203227-4b39c73a6495/go.mod h1:ZjyILWgesfNpC6sMxTJOJm9Kp84zZh5NQWvqDGG3Qr8= +golang.org/x/image v0.0.0-20190227222117-0694c2d4d067 h1:KYGJGHOQy8oSi1fDlSpcZF0+juKwk/hEMv5SiwHogR0= golang.org/x/image v0.0.0-20190227222117-0694c2d4d067/go.mod h1:kZ7UVZpmo3dzQBMxlp+ypCbDeSB+sBbTgSJuh5dn5js= golang.org/x/lint v0.0.0-20181026193005-c67002cb31c3/go.mod h1:UVdnD1Gm6xHRNCYTkRU2/jEulfH38KcIWyp/GAMgvoE= golang.org/x/lint v0.0.0-20190227174305-5b3e6a55c961/go.mod h1:wehouNa3lNwaWXcvxsM5YxQ5yQlVC4a0KAMCusXpPoU= golang.org/x/lint v0.0.0-20190301231843-5614ed5bae6f/go.mod h1:UVdnD1Gm6xHRNCYTkRU2/jEulfH38KcIWyp/GAMgvoE= +golang.org/x/lint v0.0.0-20190313153728-d0100b6bd8b3 h1:XQyxROzUlZH+WIQwySDgnISgOivlhjIEwaQaJEJrrN0= golang.org/x/lint v0.0.0-20190313153728-d0100b6bd8b3/go.mod h1:6SW0HCj/g11FgYtHlgUYUwCkIfeOF89ocIRzGO/8vkc= +golang.org/x/mobile v0.0.0-20190312151609-d3739f865fa6 h1:Tus/Y4w3V77xDsGwKUC8a/QrV7jScpU557J77lFffNs= golang.org/x/mobile v0.0.0-20190312151609-d3739f865fa6/go.mod h1:z+o9i4GpDbdi3rU15maQ/Ox0txvL9dWGYEHz965HBQE= golang.org/x/net v0.0.0-20170114055629-f2499483f923/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4= golang.org/x/net v0.0.0-20180724234803-3673e40ba225/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4= @@ -293,6 +391,7 @@ golang.org/x/sync v0.0.0-20181108010431-42b317875d0f/go.mod h1:RxMgew5VJxzue5/jJ golang.org/x/sync v0.0.0-20181221193216-37e7f081c4d4/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= golang.org/x/sync v0.0.0-20190227155943-e225da77a7e6/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= golang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= +golang.org/x/sync v0.0.0-20190911185100-cd5d95a43a6e h1:vcxGaoTs7kV8m5Np9uUNQin4BrLOthgV7252N8V+FwY= golang.org/x/sync v0.0.0-20190911185100-cd5d95a43a6e/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= golang.org/x/sys v0.0.0-20170830134202-bb24a47a89ea/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= golang.org/x/sys v0.0.0-20180830151530-49385e6e1522/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= @@ -331,13 +430,17 @@ golang.org/x/tools v0.0.0-20190312170243-e65039ee4138/go.mod h1:LCzVGOaR6xXOjkQ3 golang.org/x/tools v0.0.0-20190524140312-2c0ae7006135/go.mod h1:RgjU9mgBXZiqYHBnxXauZ1Gv1EHHAz9KjViQ78xBX0Q= golang.org/x/tools v0.0.0-20190614205625-5aca471b1d59/go.mod h1:/rFqwRUd4F7ZHNgwSSTFct+R/Kf4OFW1sUzUTQQTgfc= golang.org/x/tools v0.0.0-20190621195816-6e04913cbbac/go.mod h1:/rFqwRUd4F7ZHNgwSSTFct+R/Kf4OFW1sUzUTQQTgfc= +golang.org/x/tools v0.0.0-20190828213141-aed303cbaa74 h1:4cFkmztxtMslUX2SctSl+blCyXfpzhGOy9LhKAqSMA4= golang.org/x/tools v0.0.0-20190828213141-aed303cbaa74/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo= golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543 h1:E7g+9GITq07hpfrRu66IVDexMakfv52eLZ2CXBWiKr4= golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod 
h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= +gonum.org/v1/gonum v0.0.0-20190331200053-3d26580ed485 h1:OB/uP/Puiu5vS5QMRPrXCDWUPb+kt8f1KW8oQzFejQw= gonum.org/v1/gonum v0.0.0-20190331200053-3d26580ed485/go.mod h1:2ltnJ7xHfj0zHS40VVPYEAAMTa3ZGguvHGBSJeRWqE0= gonum.org/v1/netlib v0.0.0-20190313105609-8cb42192e0e0/go.mod h1:wa6Ws7BG/ESfp6dHfk7C6KdzKA7wR7u/rKwOGE66zvw= +gonum.org/v1/netlib v0.0.0-20190331212654-76723241ea4e h1:jRyg0XfpwWlhEV8mDfdNGBeSJM2fuyh9Yjrnd8kF2Ts= gonum.org/v1/netlib v0.0.0-20190331212654-76723241ea4e/go.mod h1:kS+toOQn6AQKjmKJ7gzohV1XkqsFehRA2FbsbkopSuQ= +google.golang.org/api v0.4.0 h1:KKgc1aqhV8wDPbDzlDtpvyjZFY3vjz85FP7p4wcQUyI= google.golang.org/api v0.4.0/go.mod h1:8k5glujaEP+g9n7WNsDg8QP6cUVNI86fCNMcbazEtwE= google.golang.org/appengine v1.1.0/go.mod h1:EbEs0AVv82hx2wNQdGPgUI5lhzA/G0D9YwlJXL52JkM= google.golang.org/appengine v1.4.0/go.mod h1:xpcJRLb0r/rnEns0DIKYYv+WjYCduHsrkT7/EB5XEv4= @@ -353,6 +456,7 @@ google.golang.org/grpc v1.19.0/go.mod h1:mqu4LbDTu4XGKhr4mRzUsmM4RtVoemTSY81AxZi google.golang.org/grpc v1.23.0/go.mod h1:Y5yQAOtifL1yxbo5wqy6BxZv8vAUGQwXBOALyacEbxg= google.golang.org/grpc v1.27.1 h1:zvIju4sqAGvwKspUQOhwnpcqSbzi7/H6QomNNjTL4sk= google.golang.org/grpc v1.27.1/go.mod h1:qbnxyOmOxrQa7FizSgH+ReBfzJrCY1pSN7KXBS8abTk= +gopkg.in/alecthomas/kingpin.v2 v2.2.6 h1:jMFz6MfLP0/4fUyZle81rXUoxOBFi19VUFKVDOQfozc= gopkg.in/alecthomas/kingpin.v2 v2.2.6/go.mod h1:FMv+mEhP44yOT+4EoQTLFTRgOQ1FBLkstjWtayDeSgw= gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127 h1:qIbj1fsPNlZgppZ+VLlY7N33q108Sa+fhmuc+sWQYwY= @@ -373,9 +477,11 @@ gopkg.in/yaml.v2 v2.2.4/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI= gopkg.in/yaml.v2 v2.2.5/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI= gopkg.in/yaml.v2 v2.2.7 h1:VUgggvou5XRW9mHwD/yXxIYSMtY0zoKQf/v226p2nyo= gopkg.in/yaml.v2 v2.2.7/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI= +gotest.tools v2.2.0+incompatible h1:VsBPFP1AI068pPrMxtb/S8Zkgf9xEmTLJjfM+P5UIEo= gotest.tools v2.2.0+incompatible/go.mod h1:DsYFclhRJ6vuDpmuTbkuFWG+y2sxOXAzmJt81HFBacw= honnef.co/go/tools v0.0.0-20190102054323-c2f93a96b099/go.mod h1:rf3lG4BRIbNafJWhAfAdb/ePZxsR/4RtNHQocxwk9r4= honnef.co/go/tools v0.0.0-20190106161140-3f1c8253044a/go.mod h1:rf3lG4BRIbNafJWhAfAdb/ePZxsR/4RtNHQocxwk9r4= +honnef.co/go/tools v0.0.0-20190523083050-ea95bdfd59fc h1:/hemPrYIhOhy8zYrNj+069zDB68us2sMGsfkFJO0iZs= honnef.co/go/tools v0.0.0-20190523083050-ea95bdfd59fc/go.mod h1:rf3lG4BRIbNafJWhAfAdb/ePZxsR/4RtNHQocxwk9r4= k8s.io/api v0.0.0-20191010143144-fbf594f18f80 h1:ea1M6YTpnYsiQ7jLIzUHJLBa1Md7VF5+RCQvzSzAfVw= k8s.io/api v0.0.0-20191010143144-fbf594f18f80/go.mod h1:X3kixOyiuC4u4LU6y2BxLg5tsvw+hrMhstfga7LZ4Gw= @@ -383,12 +489,16 @@ k8s.io/apimachinery v0.0.0-20191006235458-f9f2f3f8ab02/go.mod h1:92mWDd8Ji2sw215 k8s.io/apimachinery v0.0.0-20191014065749-fb3eea214746/go.mod h1:92mWDd8Ji2sw2157KIgino5wCxffA8KSvhW2oY4ypdw= k8s.io/apimachinery v0.0.0-20191016060620-86f2f1b9c076 h1:8L9u97FbRRynWV0wgPJSFeh88zw8/x4hwnZ+lLg7X5s= k8s.io/apimachinery v0.0.0-20191016060620-86f2f1b9c076/go.mod h1:92mWDd8Ji2sw2157KIgino5wCxffA8KSvhW2oY4ypdw= +k8s.io/cli-runtime v0.0.0-20191016113839-5e0efc75cd33 h1:s8eC5DEF6orlWp1sSSjA78WKQcOBEeogIT6LiNcx7+k= k8s.io/cli-runtime v0.0.0-20191016113839-5e0efc75cd33/go.mod h1:t0eCH0uXM/ai4gBeT0RS+cojWdhCJtVS7yQ3dA8OS1s= k8s.io/client-go v0.0.0-20191014070654-bd505ee787b2 h1:nf5MSSvA0bhvqyPVBMEuVakFvMBDpCtnwrADN9IE0DY= k8s.io/client-go 
v0.0.0-20191014070654-bd505ee787b2/go.mod h1:ltlqKktkJqApstv/eMdrdx6j1zT39NpwqlWAH747iSM= +k8s.io/code-generator v0.0.0-20191003035328-700b1226c0bd h1:5WjZ3cIbClYC5mJf+H/ODCo36y8rRqtZRxol4Ujln8c= k8s.io/code-generator v0.0.0-20191003035328-700b1226c0bd/go.mod h1:HC9p4y3SBN+txSs8x57qmNPXFZ/CxdCHiDTNnocCSEw= +k8s.io/component-base v0.0.0-20191016230640-d338b9159fb6 h1:3dj2owWAuiCd+PHDh8B8erG9k2smkDnXztFhbQy3O5I= k8s.io/component-base v0.0.0-20191016230640-d338b9159fb6/go.mod h1:L2lcIF6P6N33EyqL0ntnoBvJ6t724ev4LzCc0yjn26g= k8s.io/gengo v0.0.0-20190128074634-0689ccc1d7d6/go.mod h1:ezvh/TsK7cY6rbqRK0oQQ8IAqLxYwwyPxAX1Pzy0ii0= +k8s.io/gengo v0.0.0-20190822140433-26a664648505 h1:ZY6yclUKVbZ+SdWnkfY+Je5vrMpKOxmGeKRbsXVmqYM= k8s.io/gengo v0.0.0-20190822140433-26a664648505/go.mod h1:ezvh/TsK7cY6rbqRK0oQQ8IAqLxYwwyPxAX1Pzy0ii0= k8s.io/klog v0.0.0-20181102134211-b9b56d5dfc92/go.mod h1:Gq+BEi5rUBO/HRz0bTSXDUcqjScdoY3a9IHpCEIOOfk= k8s.io/klog v0.3.0/go.mod h1:Gq+BEi5rUBO/HRz0bTSXDUcqjScdoY3a9IHpCEIOOfk= @@ -398,16 +508,23 @@ k8s.io/kube-openapi v0.0.0-20190816220812-743ec37842bf h1:EYm5AW/UUDbnmnI+gK0TJD k8s.io/kube-openapi v0.0.0-20190816220812-743ec37842bf/go.mod h1:1TqjTSzOxsLGIKfj0lK8EeCP7K1iUG65v09OM0/WG5E= k8s.io/kubectl v0.0.0-20191016234702-5d0b8f240400 h1:IeGIAmvzKqd9LLSWCXXFKbenbJCu9faeIulMgYz77YY= k8s.io/kubectl v0.0.0-20191016234702-5d0b8f240400/go.mod h1:939lYu2PWLrk5t7xWN3amNIvJlJCa7KQfPZ2+cLrnJA= +k8s.io/metrics v0.0.0-20191014074242-8b0351268f72 h1:n1vuALz3bRoMLogQDbTRxN6rCWUWIdTfPU0qf3Y1duo= k8s.io/metrics v0.0.0-20191014074242-8b0351268f72/go.mod h1:ie2c8bq97BFtf7noiNVVJmLhEjShRhE4KBVFxeZCSjs= k8s.io/utils v0.0.0-20191010214722-8d271d903fe4 h1:Gi+/O1saihwDqnlmC8Vhv1M5Sp4+rbOmK9TbsLn8ZEA= k8s.io/utils v0.0.0-20191010214722-8d271d903fe4/go.mod h1:sZAwmy6armz5eXlNoLmJcl4F1QuKu7sr+mFQ0byX7Ew= +modernc.org/cc v1.0.0 h1:nPibNuDEx6tvYrUAtvDTTw98rx5juGsa5zuDnKwEEQQ= modernc.org/cc v1.0.0/go.mod h1:1Sk4//wdnYJiUIxnW8ddKpaOJCF37yAdqYnkxUpaYxw= +modernc.org/golex v1.0.0 h1:wWpDlbK8ejRfSyi0frMyhilD3JBvtcx2AdGDnU+JtsE= modernc.org/golex v1.0.0/go.mod h1:b/QX9oBD/LhixY6NDh+IdGv17hgB+51fET1i2kPSmvk= +modernc.org/mathutil v1.0.0 h1:93vKjrJopTPrtTNpZ8XIovER7iCIH1QU7wNbOQXC60I= modernc.org/mathutil v1.0.0/go.mod h1:wU0vUrJsVWBZ4P6e7xtFJEhFSNsfRLJ8H458uRjg03k= +modernc.org/strutil v1.0.0 h1:XVFtQwFVwc02Wk+0L/Z/zDDXO81r5Lhe6iMKmGX3KhE= modernc.org/strutil v1.0.0/go.mod h1:lstksw84oURvj9y3tn8lGvRxyRC1S2+g5uuIzNfIOBs= +modernc.org/xc v1.0.0 h1:7ccXrupWZIS3twbUGrtKmHS2DXY6xegFua+6O3xgAFU= modernc.org/xc v1.0.0/go.mod h1:mRNCo0bvLjGhHO9WsyuKVU4q0ceiDDDoEeWDJHrNx8I= sigs.k8s.io/kustomize v2.0.3+incompatible h1:JUufWFNlI44MdtnjUqVnvh29rR37PQFzPbLXqhyOyX0= sigs.k8s.io/kustomize v2.0.3+incompatible/go.mod h1:MkjgH3RdOWrievjo6c9T245dYlB5QeXV4WCbnt/PEpU= +sigs.k8s.io/structured-merge-diff v0.0.0-20190525122527-15d366b2352e h1:4Z09Hglb792X0kfOBBJUPFEyvVfQWrYT/l8h5EKA6JQ= sigs.k8s.io/structured-merge-diff v0.0.0-20190525122527-15d366b2352e/go.mod h1:wWxsB5ozmmv/SG7nM11ayaAW51xMvak/t1r0CSlcokI= sigs.k8s.io/yaml v1.1.0 h1:4A07+ZFc2wgJwo8YNlQpr1rVlgUDlxXHhPJciaPY5gs= sigs.k8s.io/yaml v1.1.0/go.mod h1:UJmg0vDUVViEyp3mgSv9WPwZCDxu4rQW1olrI1uml+o= diff --git a/pkg/config/config.go b/pkg/config/config.go index 3fa7aef0..df2b2fbe 100644 --- a/pkg/config/config.go +++ b/pkg/config/config.go @@ -82,17 +82,21 @@ const ( prometheusPortDefault = 9092 prometheusPortConfigKey = "PROMETHEUS_SERVER_PORT" // probes - enableProbesDefault = false - enableProbesConfigKey = "ENABLE_PROBES_SERVER" - probesPortDefault = 8080 - 
probesPortConfigKey = "PROBES_SERVER_PORT" - probesEndpointDefault = "/healthz" - probesEndpointConfigKey = "PROBES_SERVER_ENDPOINT" - region = "" - awsRegionConfigKey = "AWS_REGION" - awsEndpointConfigKey = "AWS_ENDPOINT" - queueURL = "" - queueURLConfigKey = "QUEUE_URL" + enableProbesDefault = false + enableProbesConfigKey = "ENABLE_PROBES_SERVER" + probesPortDefault = 8080 + probesPortConfigKey = "PROBES_SERVER_PORT" + probesEndpointDefault = "/healthz" + probesEndpointConfigKey = "PROBES_SERVER_ENDPOINT" + emitKubernetesEventsConfigKey = "EMIT_KUBERNETES_EVENTS" + emitKubernetesEventsDefault = false + kubernetesEventsAnnotationsConfigKey = "KUBERNETES_EVENTS_ANNOTATIONS" + kubernetesEventsAnnotationsDefault = "" + region = "" + awsRegionConfigKey = "AWS_REGION" + awsEndpointConfigKey = "AWS_ENDPOINT" + queueURL = "" + queueURLConfigKey = "QUEUE_URL" ) //Config arguments set via CLI, environment variables, or defaults @@ -129,6 +133,8 @@ type Config struct { EnableProbes bool ProbesPort int ProbesEndpoint string + EmitKubernetesEvents bool + KubernetesEventsAnnotations string AWSRegion string AWSEndpoint string QueueURL string @@ -180,6 +186,8 @@ func ParseCliArgs() (config Config, err error) { flag.BoolVar(&config.EnableProbes, "enable-probes-server", getBoolEnv(enableProbesConfigKey, enableProbesDefault), "If true, a http server is used for exposing probes in /healthz endpoint.") flag.IntVar(&config.ProbesPort, "probes-server-port", getIntEnv(probesPortConfigKey, probesPortDefault), "The port for running the probes http server.") flag.StringVar(&config.ProbesEndpoint, "probes-server-endpoint", getEnv(probesEndpointConfigKey, probesEndpointDefault), "If specified, use this endpoint to make liveness probe") + flag.BoolVar(&config.EmitKubernetesEvents, "emit-kubernetes-events", getBoolEnv(emitKubernetesEventsConfigKey, emitKubernetesEventsDefault), "If true, Kubernetes events will be emitted when interruption events are received and when actions are taken on Kubernetes nodes") + flag.StringVar(&config.KubernetesEventsAnnotations, "kubernetes-events-annotations", getEnv(kubernetesEventsAnnotationsConfigKey, ""), "A comma-separated list of key=value annotations to attach to all emitted Kubernetes events. Example: --kubernetes-events-annotations first=annotation,sample.annotation/number=two") flag.StringVar(&config.AWSRegion, "aws-region", getEnv(awsRegionConfigKey, ""), "If specified, use the AWS region for AWS API calls") flag.StringVar(&config.AWSEndpoint, "aws-endpoint", getEnv(awsEndpointConfigKey, ""), "[testing] If specified, use the AWS endpoint to make API calls") flag.StringVar(&config.QueueURL, "queue-url", getEnv(queueURLConfigKey, ""), "Listens for messages on the specified SQS queue URL") @@ -269,6 +277,8 @@ func (c Config) PrintJsonConfigArgs() { Str("uptime_from_file", c.UptimeFromFile). Bool("enable_prometheus_server", c.EnablePrometheus). Int("prometheus_server_port", c.PrometheusPort). + Bool("emit_kubernetes_events", c.EmitKubernetesEvents). + Str("kubernetes_events_annotations", c.KubernetesEventsAnnotations). Str("aws_region", c.AWSRegion). Str("aws_endpoint", c.AWSEndpoint). Str("queue_url", c.QueueURL). 
@@ -312,6 +322,8 @@ func (c Config) PrintHumanConfigArgs() { "\tuptime-from-file: %s,\n"+ "\tenable-prometheus-server: %t,\n"+ "\tprometheus-server-port: %d,\n"+ + "\temit-kubernetes-events: %t,\n"+ + "\tkubernetes-events-annotations: %s,\n"+ "\taws-region: %s,\n"+ "\tqueue-url: %s,\n"+ "\tcheck-asg-tag-before-draining: %t,\n"+ @@ -343,6 +355,8 @@ func (c Config) PrintHumanConfigArgs() { c.UptimeFromFile, c.EnablePrometheus, c.PrometheusPort, + c.EmitKubernetesEvents, + c.KubernetesEventsAnnotations, c.AWSRegion, c.QueueURL, c.CheckASGTagBeforeDraining, diff --git a/pkg/observability/k8s-events.go b/pkg/observability/k8s-events.go new file mode 100644 index 00000000..e94b9a08 --- /dev/null +++ b/pkg/observability/k8s-events.go @@ -0,0 +1,166 @@ +// Copyright 2016-2017 Amazon.com, Inc. or its affiliates. All Rights Reserved. +// +// Licensed under the Apache License, Version 2.0 (the "License"). You may +// not use this file except in compliance with the License. A copy of the +// License is located at +// +// http://aws.amazon.com/apache2.0/ +// +// or in the "license" file accompanying this file. This file is distributed +// on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either +// express or implied. See the License for the specific language governing +// permissions and limitations under the License. + +package observability + +import ( + "fmt" + "strings" + + "github.com/aws/aws-node-termination-handler/pkg/monitor/rebalancerecommendation" + "github.com/aws/aws-node-termination-handler/pkg/monitor/scheduledevent" + "github.com/aws/aws-node-termination-handler/pkg/monitor/spotitn" + "github.com/aws/aws-node-termination-handler/pkg/monitor/sqsevent" + corev1 "k8s.io/api/core/v1" + metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" + "k8s.io/client-go/kubernetes" + "k8s.io/client-go/kubernetes/scheme" + typedcorev1 "k8s.io/client-go/kubernetes/typed/core/v1" + "k8s.io/client-go/rest" + "k8s.io/client-go/tools/record" +) + +// Kubernetes event types, reasons and messages +const ( + Normal = corev1.EventTypeNormal + Warning = corev1.EventTypeWarning + MonitorErrReason = "MonitorError" + MonitorErrMsgFmt = "There was a problem monitoring for events in monitor %q" + UncordonErrReason = "UncordonError" + UncordonErrMsgFmt = "There was a problem while trying to uncordon the node: %s" + UncordonReason = "Uncordon" + UncordonMsg = "Node successfully uncordoned" + PreDrainErrReason = "PreDrainError" + PreDrainErrMsgFmt = "There was a problem executing the pre-drain task: %s" + PreDrainReason = "PreDrain" + PreDrainMsg = "Pre-drain task successfully executed" + CordonErrReason = "CordonError" + CordonErrMsgFmt = "There was a problem while trying to cordon the node: %s" + CordonReason = "Cordon" + CordonMsg = "Node successfully cordoned" + CordonAndDrainErrReason = "CordonAndDrainError" + CordonAndDrainErrMsgFmt = "There was a problem while trying to cordon and drain the node: %s" + CordonAndDrainReason = "CordonAndDrain" + CordonAndDrainMsg = "Node successfully cordoned and drained" + PostDrainErrReason = "PostDrainError" + PostDrainErrMsgFmt = "There was a problem executing the post-drain task: %s" + PostDrainReason = "PostDrain" + PostDrainMsg = "Post-drain task successfully executed" +) + +// Interruption event reasons +const ( + scheduledEventReason = "ScheduledEvent" + spotITNReason = "SpotInterruption" + sqsTerminateReason = "SQSTermination" + rebalanceRecommendationReason = "RebalanceRecommendation" + unknownReason = "UnknownInterruptionEvent" +) + +// K8sEventRecorder wraps a 
Kubernetes event recorder with some extra information +type K8sEventRecorder struct { + annotations map[string]string + enabled bool + node *corev1.Node + record.EventRecorder +} + +// InitK8sEventRecorder creates a Kubernetes event recorder +func InitK8sEventRecorder(enabled bool, annotationsStr, nodeName string) (K8sEventRecorder, error) { + if !enabled { + return K8sEventRecorder{}, nil + } + + // Parse annotations + var annotations map[string]string + var err error + if annotationsStr != "" { + annotations, err = parseAnnotations(annotationsStr) + if err != nil { + return K8sEventRecorder{}, err + } + } + + // Get in-cluster config + config, err := rest.InClusterConfig() + if err != nil { + return K8sEventRecorder{}, err + } + + // Create clientSet + clientSet, err := kubernetes.NewForConfig(config) + if err != nil { + return K8sEventRecorder{}, err + } + + // Get node + node, err := clientSet.CoreV1().Nodes().Get(nodeName, metav1.GetOptions{}) + if err != nil { + return K8sEventRecorder{}, err + } + + // Create broadcaster + broadcaster := record.NewBroadcaster() + broadcaster.StartRecordingToSink(&typedcorev1.EventSinkImpl{Interface: clientSet.CoreV1().Events("default")}) + + // Create event recorder + return K8sEventRecorder{ + annotations: annotations, + enabled: true, + node: node, + EventRecorder: broadcaster.NewRecorder( + scheme.Scheme, + corev1.EventSource{ + Component: "aws-node-termination-handler", + Host: nodeName, + }, + ), + }, nil +} + +// Emit a Kubernetes event for the current node and with the given type, reason and message +func (r K8sEventRecorder) Emit(eventType, eventReason, eventMsgFmt string, eventMsgArgs ...interface{}) { + if r.enabled { + r.AnnotatedEventf(r.node, r.annotations, eventType, eventReason, eventMsgFmt, eventMsgArgs...) 
+ } +} + +// GetReasonForKind returns a Kubernetes event reason for the given interruption event kind +func GetReasonForKind(kind string) string { + switch kind { + case scheduledevent.ScheduledEventKind: + return scheduledEventReason + case spotitn.SpotITNKind: + return spotITNReason + case sqsevent.SQSTerminateKind: + return sqsTerminateReason + case rebalancerecommendation.RebalanceRecommendationKind: + return rebalanceRecommendationReason + default: + return unknownReason + } +} + +// Convert the given annotations string into a map +func parseAnnotations(annotationsStr string) (map[string]string, error) { + annotations := make(map[string]string) + parts := strings.Split(annotationsStr, ",") + for _, part := range parts { + keyValue := strings.Split(part, "=") + if len(keyValue) != 2 { + return nil, fmt.Errorf("error parsing annotations") + } + annotations[keyValue[0]] = keyValue[1] + } + return annotations, nil +} diff --git a/test/README.md b/test/README.md index 2295ea1d..26f68c1e 100644 --- a/test/README.md +++ b/test/README.md @@ -15,9 +15,9 @@ This doc details how the end-to-end (e2e) tests work for aws-node-termination-ha The e2e tests can be run on a local cluster or an eks cluster using one of the following `make` targets: * `make e2e-test` * creates a [local kind cluster](https://github.com/aws/aws-node-termination-handler/blob/main/test/k8s-local-cluster-test/kind-three-node-cluster.yaml) - + * `make eks-cluster-test` - * creates an [eks cluster](https://github.com/aws/aws-node-termination-handler/blob/main/test/eks-cluster-test/cluster-spec.yaml) + * creates an [eks cluster](https://github.com/aws/aws-node-termination-handler/blob/main/test/eks-cluster-test/cluster-spec.yaml) * *Note if testing Windows, `eks-cluster-test` must be used* **Using Test Drivers** @@ -26,7 +26,7 @@ Users can also kick off the tests by invoking the test driver scripts: * **local cluster:** `./test/k8s-local-cluster-test/run-test` * **eks cluster:** `./test/eks-cluster-test/run-test` -By invoking the test drivers directly, users will be able to pass in supported parameters to tailor their test run accordingly. For example, +By invoking the test drivers directly, users will be able to pass in supported parameters to tailor their test run accordingly. For example, use **-p when starting a local cluster test to PRESERVE the created cluster:** `./test/k8s-local-cluster-test/run-test -b e2e-test -d -p` Whether the tests succeed or fail, the cluster will be preserved for further exploration. By default, the cluster will be deleted regardless of test status. @@ -39,7 +39,7 @@ As noted in [eks-cluster-test/run-test](https://github.com/aws/aws-node-terminat #### Example -Using [maintenance-event-cancellation-test](https://github.com/aws/aws-node-termination-handler/blob/main/test/e2e/maintenance-event-cancellation-test) as an example. +Using [maintenance-event-cancellation-test](https://github.com/aws/aws-node-termination-handler/blob/main/test/e2e/maintenance-event-cancellation-test) as an example. 
Keep in mind what NTH is expected to do: **...cordon the node to ensure no new work is scheduled there, then drain it, removing any existing work** - [NTH ReadMe](https://github.com/aws/aws-node-termination-handler) diff --git a/test/e2e/rebalance-recommendation-drain-test b/test/e2e/rebalance-recommendation-drain-test index d2e6a28a..9ff49350 100755 --- a/test/e2e/rebalance-recommendation-drain-test +++ b/test/e2e/rebalance-recommendation-drain-test @@ -128,7 +128,6 @@ for i in $(seq 1 $TAINT_CHECK_CYCLES); do if [[ $cordoned -eq 1 && $(kubectl get deployments regular-pod-test -o=jsonpath='{.status.unavailableReplicas}') -eq 1 ]]; then echo "✅ Verified the regular-pod-test pod was evicted!" - echo "✅ Rebalance Recommendation Drain test passed!" evicted=1 break fi @@ -160,6 +159,4 @@ for i in $(seq 1 $TAINT_CHECK_CYCLES); do done echo "❌ regular-pod-test was NOT evicted" - -echo "❌ Rebalance Recommendation Drain Test Failed $CLUSTER_NAME ❌" fail_and_exit 1 diff --git a/test/e2e/spot-interruption-test-events-on b/test/e2e/spot-interruption-test-events-on new file mode 100755 index 00000000..f5f61aa7 --- /dev/null +++ b/test/e2e/spot-interruption-test-events-on @@ -0,0 +1,164 @@ +#!/bin/bash +set -euo pipefail + +# Available env vars: +# $TMP_DIR +# $CLUSTER_NAME +# $KUBECONFIG +# $NODE_TERMINATION_HANDLER_DOCKER_REPO +# $NODE_TERMINATION_HANDLER_DOCKER_TAG +# $WEBHOOK_DOCKER_REPO +# $WEBHOOK_DOCKER_TAG +# $AEMM_URL +# $AEMM_VERSION + +function fail_and_exit { + echo "❌ Spot Interruption With Events test failed $CLUSTER_NAME ❌" + exit ${1:-1} +} + +echo "Starting Spot Interruption With Events Test for Node Termination Handler" + +SCRIPTPATH="$( cd "$(dirname "$0")" ; pwd -P )" + +common_helm_args=() +[[ "${TEST_WINDOWS-}" == "true" ]] && common_helm_args+=(--set targetNodeOs="windows") +[[ -n "${NTH_WORKER_LABEL-}" ]] && common_helm_args+=(--set nodeSelector."$NTH_WORKER_LABEL") + +anth_helm_args=( + upgrade + --install + "$CLUSTER_NAME-anth" + "$SCRIPTPATH/../../config/helm/aws-node-termination-handler/" + --wait + --force + --namespace kube-system + --set instanceMetadataURL="${INSTANCE_METADATA_URL:-"http://$AEMM_URL:$IMDS_PORT"}" + --set image.repository="$NODE_TERMINATION_HANDLER_DOCKER_REPO" + --set image.tag="$NODE_TERMINATION_HANDLER_DOCKER_TAG" + --set enableScheduledEventDraining="false" + --set enableSpotInterruptionDraining="true" + --set taintNode="true" + --set tolerations="" + --set emitKubernetesEvents="true" + --set kubernetesEventsAnnotations="spot.itn.events/test=annotation" +) +[[ -n "${NODE_TERMINATION_HANDLER_DOCKER_PULL_POLICY-}" ]] && + anth_helm_args+=(--set image.pullPolicy="$NODE_TERMINATION_HANDLER_DOCKER_PULL_POLICY") +[[ ${#common_helm_args[@]} -gt 0 ]] && + anth_helm_args+=("${common_helm_args[@]}") + +set -x +helm "${anth_helm_args[@]}" +set +x + +emtp_helm_args=( + upgrade + --install + "$CLUSTER_NAME-emtp" + "$SCRIPTPATH/../../config/helm/webhook-test-proxy/" + --wait + --force + --namespace default + --set webhookTestProxy.image.repository="$WEBHOOK_DOCKER_REPO" + --set webhookTestProxy.image.tag="$WEBHOOK_DOCKER_TAG" +) +[[ -n "${WEBHOOK_DOCKER_PULL_POLICY-}" ]] && + emtp_helm_args+=(--set webhookTestProxy.image.pullPolicy="$WEBHOOK_DOCKER_PULL_POLICY") +[[ ${#common_helm_args[@]} -gt 0 ]] && + emtp_helm_args+=("${common_helm_args[@]}") + +set -x +helm "${emtp_helm_args[@]}" +set +x + +aemm_helm_args=( + upgrade + --install + "$CLUSTER_NAME-aemm" + "$AEMM_DL_URL" + --wait + --namespace default + --set servicePort="$IMDS_PORT" + --set 
'tolerations[0].effect=NoSchedule'
+  --set 'tolerations[0].operator=Exists'
+  --set arguments='{spot}'
+)
+[[ ${#common_helm_args[@]} -gt 0 ]] &&
+  aemm_helm_args+=("${common_helm_args[@]}")
+
+set -x
+retry 5 helm "${aemm_helm_args[@]}"
+set +x
+
+TAINT_CHECK_CYCLES=15
+TAINT_CHECK_SLEEP=15
+
+deployed=0
+for i in `seq 1 $TAINT_CHECK_CYCLES`; do
+  if [[ $(kubectl get deployments regular-pod-test -o jsonpath='{.status.unavailableReplicas}') -eq 0 ]]; then
+    echo "✅ Verified regular-pod-test pod was scheduled and started!"
+    deployed=1
+    break
+  fi
+  echo "Setup Loop $i/$TAINT_CHECK_CYCLES, sleeping for $TAINT_CHECK_SLEEP seconds"
+  sleep $TAINT_CHECK_SLEEP
+done
+
+if [[ $deployed -eq 0 ]]; then
+  echo "❌ regular-pod-test pod deployment failed"
+  fail_and_exit 2
+fi
+
+cordoned=0
+tainted=0
+evicted=0
+test_node=${TEST_NODE:-$CLUSTER_NAME-worker}
+for i in `seq 1 $TAINT_CHECK_CYCLES`; do
+  if [[ $cordoned -eq 0 ]] && kubectl get nodes $test_node | grep SchedulingDisabled >/dev/null; then
+    echo "✅ Verified the worker node was cordoned!"
+    cordoned=1
+  fi
+
+  if [[ $cordoned -eq 1 && $tainted -eq 0 ]] && kubectl get nodes $test_node -o json | grep -q "aws-node-termination-handler/spot-itn" >/dev/null; then
+    echo "✅ Verified the worker node was tainted!"
+    tainted=1
+  fi
+
+  if [[ $tainted -eq 1 && $(kubectl get deployments regular-pod-test -o=jsonpath='{.status.unavailableReplicas}') -eq 1 ]]; then
+    echo "✅ Verified the regular-pod-test pod was evicted!"
+    evicted=1
+    break
+  fi
+  echo "Assertion Loop $i/$TAINT_CHECK_CYCLES, sleeping for $TAINT_CHECK_SLEEP seconds"
+  sleep $TAINT_CHECK_SLEEP
+done
+
+if [[ $cordoned -eq 0 ]]; then
+  echo "❌ Worker node was not cordoned"
+  fail_and_exit 3
+elif [[ $tainted -eq 0 ]]; then
+  echo "❌ Worker node was not tainted"
+  fail_and_exit 3
+elif [[ $evicted -eq 0 ]]; then
+  echo "❌ regular-pod-test pod was not evicted"
+  fail_and_exit 3
+fi
+
+echo "🥑 Getting Kubernetes events..."
+events=$(kubectl get events --field-selector source=aws-node-termination-handler -o json)
+for reason in SpotInterruption PreDrain CordonAndDrain; do
+  set +e
+  event=$(echo $events | jq -e --arg REASON "$reason" '.items[] | select(.reason==$REASON)' | jq -es '.[0]')
+  if [[ $? -ne 0 ]]; then
+    echo "❌ Events with reason $reason were not emitted"
+    fail_and_exit 1
+  fi
+  set -e
+  if [[ "$(echo $event | jq -r '.metadata.annotations["spot.itn.events/test"]')" != "annotation" ]]; then
+    echo "❌ Annotation was not found on event with reason $reason"
+    fail_and_exit 1
+  fi
+done
+echo "✅ Spot Interruption With Events Test Passed $CLUSTER_NAME! 
✅" +exit 0 From 9c8f4996fde8564e676de1218ed12e2acec7aeb9 Mon Sep 17 00:00:00 2001 From: Roger Torrentsgeneros Date: Fri, 16 Apr 2021 12:13:55 +0200 Subject: [PATCH 02/17] test: add retry loop, cosmetics --- pkg/observability/k8s-events.go | 4 +-- test/e2e/spot-interruption-test-events-on | 43 +++++++++++++++-------- 2 files changed, 31 insertions(+), 16 deletions(-) diff --git a/pkg/observability/k8s-events.go b/pkg/observability/k8s-events.go index e94b9a08..a9f2eda1 100644 --- a/pkg/observability/k8s-events.go +++ b/pkg/observability/k8s-events.go @@ -35,7 +35,7 @@ const ( Normal = corev1.EventTypeNormal Warning = corev1.EventTypeWarning MonitorErrReason = "MonitorError" - MonitorErrMsgFmt = "There was a problem monitoring for events in monitor %q" + MonitorErrMsgFmt = "There was a problem monitoring for events in monitor '%s'" UncordonErrReason = "UncordonError" UncordonErrMsgFmt = "There was a problem while trying to uncordon the node: %s" UncordonReason = "Uncordon" @@ -128,7 +128,7 @@ func InitK8sEventRecorder(enabled bool, annotationsStr, nodeName string) (K8sEve }, nil } -// Emit a Kubernetes event for the current node and with the given type, reason and message +// Emit a Kubernetes event for the current node and with the given event type, reason and message func (r K8sEventRecorder) Emit(eventType, eventReason, eventMsgFmt string, eventMsgArgs ...interface{}) { if r.enabled { r.AnnotatedEventf(r.node, r.annotations, eventType, eventReason, eventMsgFmt, eventMsgArgs...) diff --git a/test/e2e/spot-interruption-test-events-on b/test/e2e/spot-interruption-test-events-on index f5f61aa7..f295ef8c 100755 --- a/test/e2e/spot-interruption-test-events-on +++ b/test/e2e/spot-interruption-test-events-on @@ -146,19 +146,34 @@ elif [[ $evicted -eq 0 ]]; then fi echo "🥑 Getting Kubernetes events..." -events=$(kubectl get events --field-selector source=aws-node-termination-handler -o json) -for reason in SpotInterruption PreDrain CordonAndDrain; do - set +e - event=$(echo $events | jq -e --arg REASON "$reason" '.items[] | select(.reason==$REASON)' | jq -es '.[0]') - if [[ $? -ne 0 ]]; then - echo "❌ Events with reason $reason were not emitted" - fail_and_exit 1 - fi - set -e - if [[ "$(echo $event | jq -r '.metadata.annotations["spot.itn.events/test"]')" != "annotation" ]]; then - echo "❌ Annotation was not found on event with reason $reason" - fail_and_exit 1 +for i in `seq 1 $TAINT_CHECK_CYCLES`; do + eventnotfound="" + annotationnotfound="" + events=$(kubectl get events --field-selector source=aws-node-termination-handler -o json) + for reason in SpotInterruption PreDrain CordonAndDrain; do + set +e + event=$(echo $events | jq -e --arg REASON "$reason" '[.items[] | select(.reason==$REASON)][0]') + if [[ $? -ne 0 ]]; then + eventnotfound=$reason + break + fi + set -e + if [[ "$(echo $event | jq -r '.metadata.annotations["spot.itn.events/test"]')" != "annotation" ]]; then + annotationnotfound=$reason + break + fi + done + if [ -z $eventnotfound ] && [ -z $annotationnotfound ]; then + echo "✅ Spot Interruption With Events Test Passed $CLUSTER_NAME! ✅" + exit 0 fi + echo "Events Loop $i/$TAINT_CHECK_CYCLES, sleeping for $TAINT_CHECK_SLEEP seconds" + sleep $TAINT_CHECK_SLEEP done -echo "✅ Spot Interruption With Events Test Passed $CLUSTER_NAME! ✅" -exit 0 + +if [ ! 
-z $eventnotfound ]; then
+  echo "❌ Event with reason $eventnotfound was not emitted"
+  fail_and_exit 1
+fi
+echo "❌ Annotation was not found on event with reason $annotationnotfound"
+fail_and_exit 1

From e9a58f79d23832216480be291060c2545feed9b7 Mon Sep 17 00:00:00 2001
From: Roger Torrentsgeneros
Date: Fri, 16 Apr 2021 13:03:51 +0200
Subject: [PATCH 03/17] test: fix prometheus metrics retry loop

---
 test/e2e/prometheus-metrics-test | 56 ++++++++++----------
 1 file changed, 17 insertions(+), 39 deletions(-)

diff --git a/test/e2e/prometheus-metrics-test b/test/e2e/prometheus-metrics-test
index 5070e628..5e10e885 100755
--- a/test/e2e/prometheus-metrics-test
+++ b/test/e2e/prometheus-metrics-test
@@ -145,47 +145,25 @@ echo "✅ Port-forwarded pod $POD_NAME"

 sleep 10

-for i in $(seq 1 10); do
+for i in `seq 1 $TAINT_CHECK_CYCLES`; do
   METRICS_RESPONSE=$(curl -L localhost:7000/metrics)
   echo "✅ Fetched /metrics."
-
-  if [[ $METRICS_RESPONSE == *"cordon-and-drain"* ]]; then
-    echo "✅ Metric cordon-and-drain!"
-  else
-    echo "❌ Failed checking metric for cordon-and-drain"
-    EXIT_STATUS=3
-  fi
-
-  if [[ $METRICS_RESPONSE == *"pre-drain"* ]]; then
-    echo "✅ Metric pre-drain!"
-  else
-    echo "❌ Failed checking metric for pre-drain"
-    EXIT_STATUS=3
-  fi
-
-  if [[ $METRICS_RESPONSE == *"runtime_go_gc"* ]]; then
-    echo "✅ Metric runtime_go_gc!"
-  else
-    echo "❌ Failed checking runtime_go_gc metric"
-    EXIT_STATUS=3
-  fi
-
-  if [[ $METRICS_RESPONSE == *"runtime_go_goroutines"* ]]; then
-    echo "✅ Metric runtime_go_goroutines!"
-  else
-    echo "❌ Failed checking runtime_go_goroutines metric"
-    EXIT_STATUS=3
-  fi
-
-  if [[ $METRICS_RESPONSE == *"runtime_go_mem"* ]]; then
-    echo "✅ Metric runtime_go_mem!"
-  else
-    echo "❌ Failed checking runtime_go_mem metric"
-    EXIT_STATUS=3
+  failed=""
+  for METRIC in cordon-and-drain pre-drain runtime_go_gc runtime_go_goroutines runtime_go_mem; do
+    if [[ $METRICS_RESPONSE == *"$METRIC"* ]]; then
+      echo "✅ Metric $METRIC!" 
+    else
+      echo "⚠️ Metric $METRIC not found"
+      failed=$METRIC
+      break
+    fi
+  done
+  if [ -z $failed ]; then
+    exit 0
   fi
-
-  sleep 10
-  [[ $EXIT_STATUS -ne 0 ]] || break
+  echo "Metrics Loop $i/$TAINT_CHECK_CYCLES, sleeping for $TAINT_CHECK_SLEEP seconds"
+  sleep $TAINT_CHECK_SLEEP
 done

-exit $EXIT_STATUS
+echo "❌ Failed checking metric for $failed"
+exit 3

From f6c239f9170b425c343f4c5f7f46557b167f4b2e Mon Sep 17 00:00:00 2001
From: Roger Torrentsgeneros
Date: Tue, 20 Apr 2021 15:41:02 +0200
Subject: [PATCH 04/17] chore: rename to UnknownInterruption

---
 pkg/observability/k8s-events.go | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pkg/observability/k8s-events.go b/pkg/observability/k8s-events.go
index a9f2eda1..9042ce42 100644
--- a/pkg/observability/k8s-events.go
+++ b/pkg/observability/k8s-events.go
@@ -64,7 +64,7 @@ const (
   spotITNReason                 = "SpotInterruption"
   sqsTerminateReason            = "SQSTermination"
   rebalanceRecommendationReason = "RebalanceRecommendation"
-  unknownReason                 = "UnknownInterruptionEvent"
+  unknownReason                 = "UnknownInterruption"
 )

 // K8sEventRecorder wraps a Kubernetes event recorder with some extra information

From 28ceed26dbd97bf7869ca5eda837bcdd3a6c2957 Mon Sep 17 00:00:00 2001
From: Roger Torrentsgeneros
Date: Wed, 21 Apr 2021 15:12:12 +0200
Subject: [PATCH 05/17] chore: add default set of annotations from IMDS

---
 cmd/node-termination-handler.go               |  13 +-
 .../aws-node-termination-handler/README.md    |   7 +-
 .../templates/daemonset.linux.yaml            |   4 +-
 .../templates/daemonset.windows.yaml          |   4 +-
 .../templates/deployment.yaml                 |   4 +-
 .../aws-node-termination-handler/values.yaml  |   6 +-
 docs/aemm_interruption_testing.md             |   8 +-
 docs/kubernetes_events.md                     |  78 ++++++++++
 pkg/config/config.go                          | 116 +++++++++---------
 pkg/observability/k8s-events.go               |  30 +++--
 test/e2e/spot-interruption-test-events-on     |  22 +++-
 11 files changed, 199 insertions(+), 93 deletions(-)
 create mode 100644 docs/kubernetes_events.md

diff --git a/cmd/node-termination-handler.go b/cmd/node-termination-handler.go
index 7368b28b..d10af9e8 100644
--- a/cmd/node-termination-handler.go
+++ b/cmd/node-termination-handler.go
@@ -99,12 +99,6 @@ func main() {
 		log.Fatal().Err(err).Msg("Unable to instantiate probes service,")
 	}

-	recorder, err := observability.InitK8sEventRecorder(nthConfig.EmitKubernetesEvents, nthConfig.KubernetesEventsAnnotations, nthConfig.NodeName)
-	if err != nil {
-		nthConfig.Print()
-		log.Fatal().Err(err).Msg("Unable to create Kubernetes event recorder,")
-	}
-
 	imds := ec2metadata.New(nthConfig.MetadataURL, nthConfig.MetadataTries)

 	interruptionEventStore := interruptioneventstore.New(nthConfig)
@@ -119,6 +113,13 @@
 		nthConfig.Print()
 		log.Fatal().Msgf("Unable to find the AWS region to process queue events.")
 	}
+
+	recorder, err := observability.InitK8sEventRecorder(nthConfig.EmitKubernetesEvents, nthConfig.NodeName, nodeMetadata, nthConfig.KubernetesEventsExtraAnnotations)
+	if err != nil {
+		nthConfig.Print()
+		log.Fatal().Err(err).Msg("Unable to create Kubernetes event recorder,")
+	}
+
 	nthConfig.Print()

 	if nthConfig.EnableScheduledEventDraining {
diff --git a/config/helm/aws-node-termination-handler/README.md b/config/helm/aws-node-termination-handler/README.md
index fd4fc6bd..b8e9abf3 100644
--- a/config/helm/aws-node-termination-handler/README.md
+++ b/config/helm/aws-node-termination-handler/README.md
@@ -9,10 +9,13 @@ AWS Node Termination Handler Helm chart for Kubernetes. 
For more information on
 ## Installing the Chart

 Add the EKS repository to Helm:
+
 ```sh
 helm repo add eks https://aws.github.io/eks-charts
 ```
+
 Install AWS Node Termination Handler:
+
 To install the chart with the release name aws-node-termination-handler and default configuration:

 ```sh
@@ -82,8 +85,8 @@ Parameter | Description | Default
 `podMonitor.sampleLimit` | Number of scraped samples accepted | `5000`
 `podMonitor.labels` | Additional PodMonitor metadata labels | `{}`
 `podMonitor.namespace` | override podMonitor Helm release namespace | `{{ .Release.Namespace }}`
-`emitKubernetesEvents` | If true, Kubernetes events will be emitted when interruption events are received and when actions are taken on Kubernetes nodes | `false`
-`kubernetesEventsAnnotations` | A comma-separated list of key=value annotations to attach to all emitted Kubernetes events. Example: `first=annotation,sample.annotation/number=two"` | None
+`emitKubernetesEvents` | If true, Kubernetes events will be emitted when interruption events are received and when actions are taken on Kubernetes nodes. A default set of annotations with all the node metadata gathered from IMDS will be attached to each event. More information [here](https://github.com/aws/aws-node-termination-handler/blob/main/docs/kubernetes_events.md) | `false`
+`kubernetesEventsExtraAnnotations` | A comma-separated list of key=value extra annotations to attach to all emitted Kubernetes events. Example: `first=annotation,sample.annotation/number=two` | None

 ### AWS Node Termination Handler - Queue-Processor Mode Configuration

diff --git a/config/helm/aws-node-termination-handler/templates/daemonset.linux.yaml b/config/helm/aws-node-termination-handler/templates/daemonset.linux.yaml
index 27a536dd..41dda316 100644
--- a/config/helm/aws-node-termination-handler/templates/daemonset.linux.yaml
+++ b/config/helm/aws-node-termination-handler/templates/daemonset.linux.yaml
@@ -176,8 +176,8 @@ spec:
             value: {{ .Values.probesServerEndpoint | quote }}
           - name: EMIT_KUBERNETES_EVENTS
             value: {{ .Values.emitKubernetesEvents | quote }}
-          - name: KUBERNETES_EVENTS_ANNOTATIONS
-            value: {{ .Values.kubernetesEventsAnnotations | quote }}
+          - name: KUBERNETES_EVENTS_EXTRA_ANNOTATIONS
+            value: {{ .Values.kubernetesEventsExtraAnnotations | quote }}
           resources:
             {{- toYaml .Values.resources | nindent 12 }}
           {{- if .Values.enablePrometheusServer }}
diff --git a/config/helm/aws-node-termination-handler/templates/daemonset.windows.yaml b/config/helm/aws-node-termination-handler/templates/daemonset.windows.yaml
index 98df588e..7c7babcf 100644
--- a/config/helm/aws-node-termination-handler/templates/daemonset.windows.yaml
+++ b/config/helm/aws-node-termination-handler/templates/daemonset.windows.yaml
@@ -150,8 +150,8 @@ spec:
             value: {{ .Values.probesServerEndpoint | quote }}
           - name: EMIT_KUBERNETES_EVENTS
             value: {{ .Values.emitKubernetesEvents | quote }}
-          - name: KUBERNETES_EVENTS_ANNOTATIONS
-            value: {{ .Values.kubernetesEventsAnnotations | quote }}
+          - name: KUBERNETES_EVENTS_EXTRA_ANNOTATIONS
+            value: {{ .Values.kubernetesEventsExtraAnnotations | quote }}
           resources:
             {{- toYaml .Values.resources | nindent 12 }}
           {{- if .Values.enablePrometheusServer }}
diff --git a/config/helm/aws-node-termination-handler/templates/deployment.yaml b/config/helm/aws-node-termination-handler/templates/deployment.yaml
index bc681872..aa617579 100644
--- a/config/helm/aws-node-termination-handler/templates/deployment.yaml
+++ b/config/helm/aws-node-termination-handler/templates/deployment.yaml
@@ -152,8 
+152,8 @@ spec: value: {{ .Values.workers | quote }} - name: EMIT_KUBERNETES_EVENTS value: {{ .Values.emitKubernetesEvents | quote }} - - name: KUBERNETES_EVENTS_ANNOTATIONS - value: {{ .Values.kubernetesEventsAnnotations | quote }} + - name: KUBERNETES_EVENTS_EXTRA_ANNOTATIONS + value: {{ .Values.kubernetesEventsExtraAnnotations | quote }} resources: {{- toYaml .Values.resources | nindent 12 }} {{- if .Values.enablePrometheusServer }} diff --git a/config/helm/aws-node-termination-handler/values.yaml b/config/helm/aws-node-termination-handler/values.yaml index 1a655d9d..d2f47694 100644 --- a/config/helm/aws-node-termination-handler/values.yaml +++ b/config/helm/aws-node-termination-handler/values.yaml @@ -159,12 +159,12 @@ enableProbesServer: false probesServerPort: 8080 probesServerEndpoint: "/healthz" -# emitKubernetesEvents If true, Kubernetes events will be emitted when interruption events are received and when actions are taken on Kubernetes nodes +# emitKubernetesEvents If true, Kubernetes events will be emitted when interruption events are received and when actions are taken on Kubernetes nodes. A default set of annotations with all the node metadata gathered from IMDS will be attached to each event emitKubernetesEvents: false -# kubernetesEventsAnnotations A comma-separated list of key=value annotations to attach to all emitted Kubernetes events +# kubernetesEventsExtraAnnotations A comma-separated list of key=value extra annotations to attach to all emitted Kubernetes events # Example: "first=annotation,sample.annotation/number=two" -kubernetesEventsAnnotations: "" +kubernetesEventsExtraAnnotations: "" tolerations: - operator: "Exists" diff --git a/docs/aemm_interruption_testing.md b/docs/aemm_interruption_testing.md index c7f1e7fd..07fcb9c4 100644 --- a/docs/aemm_interruption_testing.md +++ b/docs/aemm_interruption_testing.md @@ -54,17 +54,17 @@ WARNING: ignoring DaemonSet-managed Pods: default/amazon-ec2-metadata-mock-pszj2 This isn't a mistake, by default AEMM will respond to any request for metadata with a spot interruption occurring 2 minutes later than the request time.\* AWS Node Termination Handler polls for events every 2 seconds by default, so the effect is -that new interruption events are found and processed every 2 seconds. +that new interruption events are found and processed every 2 seconds. In reality there will only be a single interruption event, and you can mock this by setting the `spot.time` parameter of -AEMM when installing it. +AEMM when installing it. ``` helm install amazon-ec2-metadata-mock amazon-ec2-metadata-mock-1.6.0.tgz \ --set aemm.spot.time="2020-09-09T22:40:47Z" \ --namespace default ``` -Now when you check the logs you should only see a single event get processed. +Now when you check the logs you should only see a single event get processed. For more ways of configuring AEMM check out the [Helm configuration page](https://github.com/aws/amazon-ec2-metadata-mock/tree/main/helm/amazon-ec2-metadata-mock). @@ -82,7 +82,7 @@ for the local tests that use a kind cluster, and [here](https://github.com/aws/a for the eks-cluster e2e tests. Check out the [ReadMe](https://github.com/aws/aws-node-termination-handler/tree/main/test) in our test folder for more -info on the e2e tests. +info on the e2e tests. 
---
diff --git a/docs/kubernetes_events.md b/docs/kubernetes_events.md
new file mode 100644
index 00000000..dc64da00
--- /dev/null
+++ b/docs/kubernetes_events.md
@@ -0,0 +1,78 @@
+# AWS Node Termination Handler Kubernetes events
+
+AWS Node Termination Handler can emit a Kubernetes event every time it receives an interruption signal from AWS and every time it takes an action on a node. More information on how to get events can be found [here](https://kubernetes.io/docs/tasks/debug-application-cluster/debug-application-introspection/).
+
+## Configuration
+
+There are two relevant parameters:
+
+* `emit-kubernetes-events`
+
+  If true, Kubernetes events will be emitted when interruption events are received and when actions are taken on Kubernetes nodes. Defaults to `false`.
+
+* `kubernetes-events-extra-annotations`
+
+  A comma-separated list of `key=value` extra annotations to attach to all emitted Kubernetes events. Example:
+
+  `"first=annotation,sample.annotation/number=two"`
+
+## Event reasons
+
+There are a number of events that can be emitted, each one with a reason that can be used to quickly identify the nature of the event and to filter on it. Each event also carries a message with extended information. Here is a summary of the reasons:
+
+AWS interruption event reasons:
+
+* `RebalanceRecommendation`
+* `ScheduledEvent`
+* `SQSTermination`
+* `SpotInterruption`
+
+Node action reasons:
+
+* `Cordon`
+* `CordonError`
+* `CordonAndDrain`
+* `CordonAndDrainError`
+* `PreDrain`
+* `PreDrainError`
+* `PostDrain`
+* `PostDrainError`
+* `Uncordon`
+* `UncordonError`
+* `MonitorError`
+
+## Default annotations
+
+If events emission is enabled, AWS Node Termination Handler will automatically inject a set of annotations to each event it emits. Such annotations are gathered from the underlying host's IMDS endpoint and enrich each event with information about the host that emitted it.
+
+The default annotations are:
+
+Name | Example value
+--- | ---
+`account-id` | `123456789012`
+`availability-zone` | `us-west-2a`
+`instance-id` | `i-abcdef12345678901`
+`instance-type` | `m5.8xlarge`
+`local-hostname` | `ip-10-1-2-3.us-west-2.compute.internal`
+`local-ipv4` | `10.1.2.3`
+`public-hostname` | `my-example.host.net`
+`public-ipv4` | `42.42.42.42`
+`region` | `us-west-2`
+
+If extra annotations are specified they will be appended to the above. In case of collision, the user-defined annotation wins.
+
+## How to get events
+
+All events are attached to Kubernetes `Node` objects, so they live in the `default` namespace. The event source is `aws-node-termination-handler`. From the command line, use `kubectl` to get the events as follows:
+
+```sh
+kubectl get events --field-selector "source=aws-node-termination-handler"
+```
+
+To narrow down the search you can combine multiple field selectors, for example:
+
+```sh
+kubectl get events --field-selector "reason=SpotInterruption,involvedObject.name=ip-10-1-2-3.us-west-2.compute.internal"
+```
+
+Results can also be printed out in JSON or YAML format and piped to processors like `jq` or `yq`. Then, the above annotations can also be used for discovery and filtering. 
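As a minimal usage sketch of the filtering described above (the availability zone value is hypothetical; `availability-zone` and `instance-id` are the default annotations from the table), the JSON output can be combined with `jq`:

```sh
# Print reason, node name and instance-id for every NTH event whose
# default "availability-zone" annotation is us-west-2a (hypothetical zone).
kubectl get events --field-selector "source=aws-node-termination-handler" -o json \
  | jq -r '.items[]
      | select(.metadata.annotations["availability-zone"] == "us-west-2a")
      | [.reason, .involvedObject.name, .metadata.annotations["instance-id"]]
      | @tsv'
```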
diff --git a/pkg/config/config.go b/pkg/config/config.go index df2b2fbe..7df5e2d0 100644 --- a/pkg/config/config.go +++ b/pkg/config/config.go @@ -82,64 +82,64 @@ const ( prometheusPortDefault = 9092 prometheusPortConfigKey = "PROMETHEUS_SERVER_PORT" // probes - enableProbesDefault = false - enableProbesConfigKey = "ENABLE_PROBES_SERVER" - probesPortDefault = 8080 - probesPortConfigKey = "PROBES_SERVER_PORT" - probesEndpointDefault = "/healthz" - probesEndpointConfigKey = "PROBES_SERVER_ENDPOINT" - emitKubernetesEventsConfigKey = "EMIT_KUBERNETES_EVENTS" - emitKubernetesEventsDefault = false - kubernetesEventsAnnotationsConfigKey = "KUBERNETES_EVENTS_ANNOTATIONS" - kubernetesEventsAnnotationsDefault = "" - region = "" - awsRegionConfigKey = "AWS_REGION" - awsEndpointConfigKey = "AWS_ENDPOINT" - queueURL = "" - queueURLConfigKey = "QUEUE_URL" + enableProbesDefault = false + enableProbesConfigKey = "ENABLE_PROBES_SERVER" + probesPortDefault = 8080 + probesPortConfigKey = "PROBES_SERVER_PORT" + probesEndpointDefault = "/healthz" + probesEndpointConfigKey = "PROBES_SERVER_ENDPOINT" + emitKubernetesEventsConfigKey = "EMIT_KUBERNETES_EVENTS" + emitKubernetesEventsDefault = false + kubernetesEventsExtraAnnotationsConfigKey = "KUBERNETES_EVENTS_EXTRA_ANNOTATIONS" + kubernetesEventsExtraAnnotationsDefault = "" + region = "" + awsRegionConfigKey = "AWS_REGION" + awsEndpointConfigKey = "AWS_ENDPOINT" + queueURL = "" + queueURLConfigKey = "QUEUE_URL" ) //Config arguments set via CLI, environment variables, or defaults type Config struct { - DryRun bool - NodeName string - MetadataURL string - IgnoreDaemonSets bool - DeleteLocalData bool - KubernetesServiceHost string - KubernetesServicePort string - PodTerminationGracePeriod int - NodeTerminationGracePeriod int - WebhookURL string - WebhookHeaders string - WebhookTemplate string - WebhookTemplateFile string - WebhookProxy string - EnableScheduledEventDraining bool - EnableSpotInterruptionDraining bool - EnableSQSTerminationDraining bool - EnableRebalanceMonitoring bool - EnableRebalanceDraining bool - CheckASGTagBeforeDraining bool - ManagedAsgTag string - MetadataTries int - CordonOnly bool - TaintNode bool - JsonLogging bool - LogLevel string - UptimeFromFile string - EnablePrometheus bool - PrometheusPort int - EnableProbes bool - ProbesPort int - ProbesEndpoint string - EmitKubernetesEvents bool - KubernetesEventsAnnotations string - AWSRegion string - AWSEndpoint string - QueueURL string - Workers int - AWSSession *session.Session + DryRun bool + NodeName string + MetadataURL string + IgnoreDaemonSets bool + DeleteLocalData bool + KubernetesServiceHost string + KubernetesServicePort string + PodTerminationGracePeriod int + NodeTerminationGracePeriod int + WebhookURL string + WebhookHeaders string + WebhookTemplate string + WebhookTemplateFile string + WebhookProxy string + EnableScheduledEventDraining bool + EnableSpotInterruptionDraining bool + EnableSQSTerminationDraining bool + EnableRebalanceMonitoring bool + EnableRebalanceDraining bool + CheckASGTagBeforeDraining bool + ManagedAsgTag string + MetadataTries int + CordonOnly bool + TaintNode bool + JsonLogging bool + LogLevel string + UptimeFromFile string + EnablePrometheus bool + PrometheusPort int + EnableProbes bool + ProbesPort int + ProbesEndpoint string + EmitKubernetesEvents bool + KubernetesEventsExtraAnnotations string + AWSRegion string + AWSEndpoint string + QueueURL string + Workers int + AWSSession *session.Session } //ParseCliArgs parses cli arguments and uses environment 
variables as fallback values @@ -187,7 +187,7 @@ func ParseCliArgs() (config Config, err error) { flag.IntVar(&config.ProbesPort, "probes-server-port", getIntEnv(probesPortConfigKey, probesPortDefault), "The port for running the probes http server.") flag.StringVar(&config.ProbesEndpoint, "probes-server-endpoint", getEnv(probesEndpointConfigKey, probesEndpointDefault), "If specified, use this endpoint to make liveness probe") flag.BoolVar(&config.EmitKubernetesEvents, "emit-kubernetes-events", getBoolEnv(emitKubernetesEventsConfigKey, emitKubernetesEventsDefault), "If true, Kubernetes events will be emitted when interruption events are received and when actions are taken on Kubernetes nodes") - flag.StringVar(&config.KubernetesEventsAnnotations, "kubernetes-events-annotations", getEnv(kubernetesEventsAnnotationsConfigKey, ""), "A comma-separated list of key=value annotations to attach to all emitted Kubernetes events. Example: --kubernetes-events-annotations first=annotation,sample.annotation/number=two") + flag.StringVar(&config.KubernetesEventsExtraAnnotations, "kubernetes-events-extra-annotations", getEnv(kubernetesEventsExtraAnnotationsConfigKey, ""), "A comma-separated list of key=value extra annotations to attach to all emitted Kubernetes events. Example: --kubernetes-events-extra-annotations first=annotation,sample.annotation/number=two") flag.StringVar(&config.AWSRegion, "aws-region", getEnv(awsRegionConfigKey, ""), "If specified, use the AWS region for AWS API calls") flag.StringVar(&config.AWSEndpoint, "aws-endpoint", getEnv(awsEndpointConfigKey, ""), "[testing] If specified, use the AWS endpoint to make API calls") flag.StringVar(&config.QueueURL, "queue-url", getEnv(queueURLConfigKey, ""), "Listens for messages on the specified SQS queue URL") @@ -278,7 +278,7 @@ func (c Config) PrintJsonConfigArgs() { Bool("enable_prometheus_server", c.EnablePrometheus). Int("prometheus_server_port", c.PrometheusPort). Bool("emit_kubernetes_events", c.EmitKubernetesEvents). - Str("kubernetes_events_annotations", c.KubernetesEventsAnnotations). + Str("kubernetes_events_extra_annotations", c.KubernetesEventsExtraAnnotations). Str("aws_region", c.AWSRegion). Str("aws_endpoint", c.AWSEndpoint). Str("queue_url", c.QueueURL). 
@@ -323,7 +323,7 @@ func (c Config) PrintHumanConfigArgs() { "\tenable-prometheus-server: %t,\n"+ "\tprometheus-server-port: %d,\n"+ "\temit-kubernetes-events: %t,\n"+ - "\tkubernetes-events-annotations: %s,\n"+ + "\tkubernetes-events-extra-annotations: %s,\n"+ "\taws-region: %s,\n"+ "\tqueue-url: %s,\n"+ "\tcheck-asg-tag-before-draining: %t,\n"+ @@ -356,7 +356,7 @@ func (c Config) PrintHumanConfigArgs() { c.EnablePrometheus, c.PrometheusPort, c.EmitKubernetesEvents, - c.KubernetesEventsAnnotations, + c.KubernetesEventsExtraAnnotations, c.AWSRegion, c.QueueURL, c.CheckASGTagBeforeDraining, diff --git a/pkg/observability/k8s-events.go b/pkg/observability/k8s-events.go index 9042ce42..9b2a51e6 100644 --- a/pkg/observability/k8s-events.go +++ b/pkg/observability/k8s-events.go @@ -17,6 +17,7 @@ import ( "fmt" "strings" + "github.com/aws/aws-node-termination-handler/pkg/ec2metadata" "github.com/aws/aws-node-termination-handler/pkg/monitor/rebalancerecommendation" "github.com/aws/aws-node-termination-handler/pkg/monitor/scheduledevent" "github.com/aws/aws-node-termination-handler/pkg/monitor/spotitn" @@ -76,16 +77,28 @@ type K8sEventRecorder struct { } // InitK8sEventRecorder creates a Kubernetes event recorder -func InitK8sEventRecorder(enabled bool, annotationsStr, nodeName string) (K8sEventRecorder, error) { +func InitK8sEventRecorder(enabled bool, nodeName string, nodeMetadata ec2metadata.NodeMetadata, extraAnnotationsStr string) (K8sEventRecorder, error) { if !enabled { return K8sEventRecorder{}, nil } - // Parse annotations - var annotations map[string]string + // Create default annotations + // Worth iterating over nodeMetadata fields using reflect? (trutx) + annotations := make(map[string]string) + annotations["account-id"] = nodeMetadata.AccountId + annotations["availability-zone"] = nodeMetadata.AvailabilityZone + annotations["instance-id"] = nodeMetadata.InstanceID + annotations["instance-type"] = nodeMetadata.InstanceType + annotations["local-hostname"] = nodeMetadata.LocalHostname + annotations["local-ipv4"] = nodeMetadata.LocalIP + annotations["public-hostname"] = nodeMetadata.PublicHostname + annotations["public-ipv4"] = nodeMetadata.PublicIP + annotations["region"] = nodeMetadata.Region + + // Parse extra annotations var err error - if annotationsStr != "" { - annotations, err = parseAnnotations(annotationsStr) + if extraAnnotationsStr != "" { + annotations, err = parseExtraAnnotations(annotations, extraAnnotationsStr) if err != nil { return K8sEventRecorder{}, err } @@ -151,10 +164,9 @@ func GetReasonForKind(kind string) string { } } -// Convert the given annotations string into a map -func parseAnnotations(annotationsStr string) (map[string]string, error) { - annotations := make(map[string]string) - parts := strings.Split(annotationsStr, ",") +// Parse the given extra annotations string into a map +func parseExtraAnnotations(annotations map[string]string, extraAnnotationsStr string) (map[string]string, error) { + parts := strings.Split(extraAnnotationsStr, ",") for _, part := range parts { keyValue := strings.Split(part, "=") if len(keyValue) != 2 { diff --git a/test/e2e/spot-interruption-test-events-on b/test/e2e/spot-interruption-test-events-on index f295ef8c..42dca1c4 100755 --- a/test/e2e/spot-interruption-test-events-on +++ b/test/e2e/spot-interruption-test-events-on @@ -41,7 +41,7 @@ anth_helm_args=( --set taintNode="true" --set tolerations="" --set emitKubernetesEvents="true" - --set kubernetesEventsAnnotations="spot.itn.events/test=annotation" + --set 
kubernetesEventsExtraAnnotations="spot.itn.events/test=extra-annotation"
 )
 [[ -n "${NODE_TERMINATION_HANDLER_DOCKER_PULL_POLICY-}" ]] &&
     anth_helm_args+=(--set image.pullPolicy="$NODE_TERMINATION_HANDLER_DOCKER_PULL_POLICY")
@@ -149,6 +149,7 @@ echo "🥑 Getting Kubernetes events..."
 for i in `seq 1 $TAINT_CHECK_CYCLES`; do
   eventnotfound=""
   annotationnotfound=""
+  extraannotationnotfound=""
   events=$(kubectl get events --field-selector source=aws-node-termination-handler -o json)
   for reason in SpotInterruption PreDrain CordonAndDrain; do
     set +e
@@ -158,12 +159,19 @@ for i in `seq 1 $TAINT_CHECK_CYCLES`; do
       break
     fi
     set -e
-    if [[ "$(echo $event | jq -r '.metadata.annotations["spot.itn.events/test"]')" != "annotation" ]]; then
-      annotationnotfound=$reason
+    for ant in account-id availability-zone instance-id instance-type local-hostname local-ipv4 public-hostname public-ipv4 region; do
+      if [[ "$(echo $event | jq -r --arg ANT "$ant" '.metadata.annotations[$ANT]')" == "null" ]]; then
+        eventnotfound=$reason
+        annotationnotfound=$ant
+        break 2
+      fi
+    done
+    if [[ "$(echo $event | jq -r '.metadata.annotations["spot.itn.events/test"]')" != "extra-annotation" ]]; then
+      extraannotationnotfound=$reason
       break
     fi
   done
-  if [ -z $eventnotfound ] && [ -z $annotationnotfound ]; then
+  if [ -z $eventnotfound ] && [ -z $annotationnotfound ] && [ -z $extraannotationnotfound ]; then
     echo "✅ Spot Interruption With Events Test Passed $CLUSTER_NAME! ✅"
     exit 0
   fi
@@ -172,8 +180,12 @@ for i in `seq 1 $TAINT_CHECK_CYCLES`; do
 done

 if [ ! -z $eventnotfound ]; then
+  if [ ! -z $annotationnotfound ]; then
+    echo "❌ Annotation $annotationnotfound was not found on event with reason $eventnotfound"
+    fail_and_exit 1
+  fi
   echo "❌ Event with reason $eventnotfound was not emitted"
   fail_and_exit 1
 fi
-echo "❌ Annotation was not found on event with reason $annotationnotfound"
+echo "❌ Extra annotation was not found on event with reason $extraannotationnotfound"
 fail_and_exit 1

From d7cc644ceee96ee0e3b49a24d5fc61337724e9d0 Mon Sep 17 00:00:00 2001
From: Roger Torrentsgeneros
Date: Fri, 23 Apr 2021 12:35:51 +0200
Subject: [PATCH 06/17] chore: changes per @haugenj and @brycahta requests

---
 cmd/node-termination-handler.go               | 25 ++++++++--------
 .../templates/clusterrole.yaml                |  5 ----
 docs/kubernetes_events.md                     | 12 ++++++--
 pkg/observability/k8s-events.go               | 29 ++++++-------------
 test/e2e/spot-interruption-test-events-on     |  6 ++--
 5 files changed, 34 insertions(+), 43 deletions(-)

diff --git a/cmd/node-termination-handler.go b/cmd/node-termination-handler.go
index d10af9e8..91c93caa 100644
--- a/cmd/node-termination-handler.go
+++ b/cmd/node-termination-handler.go
@@ -186,7 +186,7 @@ func main() {
 			if err != nil {
 				log.Warn().Str("event_type", monitor.Kind()).Err(err).Msg("There was a problem monitoring for events")
 				metrics.ErrorEventsInc(monitor.Kind())
-				recorder.Emit(observability.Warning, observability.MonitorErrReason, observability.MonitorErrMsgFmt, monitor.Kind())
+				recorder.Emit(nthConfig.NodeName, observability.Warning, observability.MonitorErrReason, observability.MonitorErrMsgFmt, monitor.Kind())
 				if previousErr != nil && err.Error() == previousErr.Error() {
 					duplicateErrCount++
 				} else {
@@ -222,7 +222,7 @@ func main() {
 			case interruptionEventStore.Workers <- 1:
 				event.InProgress = true
 				wg.Add(1)
-				recorder.Emit(observability.Normal, observability.GetReasonForKind(event.Kind), event.Description)
+				recorder.Emit(event.NodeName, observability.Normal, observability.GetReasonForKind(event.Kind), event.Description)
 				go 
drainOrCordonIfNecessary(interruptionEventStore, event, *node, nthConfig, nodeMetadata, metrics, recorder, &wg) default: log.Warn().Msg("all workers busy, waiting") @@ -273,10 +273,11 @@ func watchForCancellationEvents(cancelChan <-chan monitor.InterruptionEvent, int err := node.Uncordon(nodeName) if err != nil { log.Err(err).Msg("Uncordoning the node failed") - recorder.Emit(observability.Warning, observability.UncordonErrReason, observability.UncordonErrMsgFmt, err.Error()) + recorder.Emit(nodeName, observability.Warning, observability.UncordonErrReason, observability.UncordonErrMsgFmt, err.Error()) + } else { + recorder.Emit(nodeName, observability.Normal, observability.UncordonReason, observability.UncordonMsg) } metrics.NodeActionsInc("uncordon", nodeName, err) - recorder.Emit(observability.Normal, observability.UncordonReason, observability.UncordonMsg) node.RemoveNTHLabels(nodeName) node.RemoveNTHTaints(nodeName) @@ -298,9 +299,9 @@ func drainOrCordonIfNecessary(interruptionEventStore *interruptioneventstore.Sto err := drainEvent.PreDrainTask(*drainEvent, node) if err != nil { log.Err(err).Msg("There was a problem executing the pre-drain task") - recorder.Emit(observability.Warning, observability.PreDrainErrReason, observability.PreDrainErrMsgFmt, err.Error()) + recorder.Emit(nodeName, observability.Warning, observability.PreDrainErrReason, observability.PreDrainErrMsgFmt, err.Error()) } else { - recorder.Emit(observability.Normal, observability.PreDrainReason, observability.PreDrainMsg) + recorder.Emit(nodeName, observability.Normal, observability.PreDrainReason, observability.PreDrainMsg) } metrics.NodeActionsInc("pre-drain", nodeName, err) } @@ -312,7 +313,7 @@ func drainOrCordonIfNecessary(interruptionEventStore *interruptioneventstore.Sto log.Err(err).Msgf("node '%s' not found in the cluster", nodeName) } else { log.Err(err).Msg("There was a problem while trying to cordon the node") - recorder.Emit(observability.Warning, observability.CordonErrReason, observability.CordonErrMsgFmt, err.Error()) + recorder.Emit(nodeName, observability.Warning, observability.CordonErrReason, observability.CordonErrMsgFmt, err.Error()) os.Exit(1) } } else { @@ -327,7 +328,7 @@ func drainOrCordonIfNecessary(interruptionEventStore *interruptioneventstore.Sto log.Err(err).Msg("There was a problem while trying to log all pod names on the node") } metrics.NodeActionsInc("cordon", nodeName, err) - recorder.Emit(observability.Normal, observability.CordonReason, observability.CordonMsg) + recorder.Emit(nodeName, observability.Normal, observability.CordonReason, observability.CordonMsg) } } else { err := node.CordonAndDrain(nodeName) @@ -337,13 +338,13 @@ func drainOrCordonIfNecessary(interruptionEventStore *interruptioneventstore.Sto } else { log.Err(err).Msg("There was a problem while trying to cordon and drain the node") metrics.NodeActionsInc("cordon-and-drain", nodeName, err) - recorder.Emit(observability.Warning, observability.CordonAndDrainErrReason, observability.CordonAndDrainErrMsgFmt, err.Error()) + recorder.Emit(nodeName, observability.Warning, observability.CordonAndDrainErrReason, observability.CordonAndDrainErrMsgFmt, err.Error()) os.Exit(1) } } else { log.Info().Str("node_name", nodeName).Msg("Node successfully cordoned and drained") metrics.NodeActionsInc("cordon-and-drain", nodeName, err) - recorder.Emit(observability.Normal, observability.CordonAndDrainReason, observability.CordonAndDrainMsg) + recorder.Emit(nodeName, observability.Normal, observability.CordonAndDrainReason, 
observability.CordonAndDrainMsg) } } @@ -355,9 +356,9 @@ func drainOrCordonIfNecessary(interruptionEventStore *interruptioneventstore.Sto err := drainEvent.PostDrainTask(*drainEvent, node) if err != nil { log.Err(err).Msg("There was a problem executing the post-drain task") - recorder.Emit(observability.Warning, observability.PostDrainErrReason, observability.PostDrainErrMsgFmt, err.Error()) + recorder.Emit(nodeName, observability.Warning, observability.PostDrainErrReason, observability.PostDrainErrMsgFmt, err.Error()) } else { - recorder.Emit(observability.Normal, observability.PostDrainReason, observability.PostDrainMsg) + recorder.Emit(nodeName, observability.Normal, observability.PostDrainReason, observability.PostDrainMsg) } metrics.NodeActionsInc("post-drain", nodeName, err) } diff --git a/config/helm/aws-node-termination-handler/templates/clusterrole.yaml b/config/helm/aws-node-termination-handler/templates/clusterrole.yaml index 42f8c6cc..32a385db 100644 --- a/config/helm/aws-node-termination-handler/templates/clusterrole.yaml +++ b/config/helm/aws-node-termination-handler/templates/clusterrole.yaml @@ -44,9 +44,4 @@ rules: - events verbs: - create - - get - - list - - patch - - update - - watch {{- end }} diff --git a/docs/kubernetes_events.md b/docs/kubernetes_events.md index dc64da00..504a7d21 100644 --- a/docs/kubernetes_events.md +++ b/docs/kubernetes_events.md @@ -43,7 +43,7 @@ Node action reasons: ## Default annotations -If events emission is enabled, AWS Node Termination Handler will automatically inject a set of annotations to each event it emits. Such annotations are gathered from the underlying host's IMDS endpoint and enrich each event with information about the host that emitted it. +If `emit-kubernetes-events` is enabled, AWS Node Termination Handler will automatically inject a set of annotations to each event it emits. Such annotations are gathered from the underlying host's IMDS endpoint and enrich each event with information about the host that emitted it. The default annotations are: @@ -59,7 +59,7 @@ Name | Example value `public-ipv4` | `42.42.42.42` `region` | `us-west-2` -If extra annotations are specified they will be appended to the above. In case of collision, the user-defined annotation wins. +If `kubernetes-events-extra-annotations` are specified they will be appended to the above. In case of collision, the user-defined annotation wins. ## How to get events @@ -76,3 +76,11 @@ kubectl get events --field-selector "reason=SpotInterruption,involvedObject.name ``` Results can also be printed out in JSON or YAML format and piped to processors like `jq` or `yq`. Then, the above annotations can also be used for discovery and filtering. + +## Caveats + +### Default annotations in Queue Processor Mode + +Default annotations values are gathered from the IMDS endpoint local to the Node on which AWS Node Termination Handler runs. This is fine when running on IMDS Processor Mode since an AWS Node Termination Handler Pod will be deployed to all Nodes via a `DaemonSet` and each Node will emit all events related to itself with its own default annotations. + +However, when running in Queue Processor Mode AWS Node Termination Handler is deployed to a number of Nodes (1 replica by default) since it's done via a `Deployment`. 
In that case the default annotations values will be gathered from the Node(s) running AWS Node Termination Handler, and so the values in the default annotations stamped to all events will match those of the Node from which the event was emitted, not those of the Node of which the event is about. diff --git a/pkg/observability/k8s-events.go b/pkg/observability/k8s-events.go index 9b2a51e6..372cd415 100644 --- a/pkg/observability/k8s-events.go +++ b/pkg/observability/k8s-events.go @@ -1,4 +1,4 @@ -// Copyright 2016-2017 Amazon.com, Inc. or its affiliates. All Rights Reserved. +// Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. // // Licensed under the Apache License, Version 2.0 (the "License"). You may // not use this file except in compliance with the License. A copy of the @@ -23,7 +23,6 @@ import ( "github.com/aws/aws-node-termination-handler/pkg/monitor/spotitn" "github.com/aws/aws-node-termination-handler/pkg/monitor/sqsevent" corev1 "k8s.io/api/core/v1" - metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" "k8s.io/client-go/kubernetes" "k8s.io/client-go/kubernetes/scheme" typedcorev1 "k8s.io/client-go/kubernetes/typed/core/v1" @@ -72,7 +71,6 @@ const ( type K8sEventRecorder struct { annotations map[string]string enabled bool - node *corev1.Node record.EventRecorder } @@ -82,8 +80,6 @@ func InitK8sEventRecorder(enabled bool, nodeName string, nodeMetadata ec2metadat return K8sEventRecorder{}, nil } - // Create default annotations - // Worth iterating over nodeMetadata fields using reflect? (trutx) annotations := make(map[string]string) annotations["account-id"] = nodeMetadata.AccountId annotations["availability-zone"] = nodeMetadata.AvailabilityZone @@ -95,7 +91,6 @@ func InitK8sEventRecorder(enabled bool, nodeName string, nodeMetadata ec2metadat annotations["public-ipv4"] = nodeMetadata.PublicIP annotations["region"] = nodeMetadata.Region - // Parse extra annotations var err error if extraAnnotationsStr != "" { annotations, err = parseExtraAnnotations(annotations, extraAnnotationsStr) @@ -104,33 +99,22 @@ func InitK8sEventRecorder(enabled bool, nodeName string, nodeMetadata ec2metadat } } - // Get in-cluster config config, err := rest.InClusterConfig() if err != nil { return K8sEventRecorder{}, err } - // Create clientSet clientSet, err := kubernetes.NewForConfig(config) if err != nil { return K8sEventRecorder{}, err } - // Get node - node, err := clientSet.CoreV1().Nodes().Get(nodeName, metav1.GetOptions{}) - if err != nil { - return K8sEventRecorder{}, err - } - - // Create broadcaster broadcaster := record.NewBroadcaster() broadcaster.StartRecordingToSink(&typedcorev1.EventSinkImpl{Interface: clientSet.CoreV1().Events("default")}) - // Create event recorder return K8sEventRecorder{ annotations: annotations, enabled: true, - node: node, EventRecorder: broadcaster.NewRecorder( scheme.Scheme, corev1.EventSource{ @@ -141,10 +125,15 @@ func InitK8sEventRecorder(enabled bool, nodeName string, nodeMetadata ec2metadat }, nil } -// Emit a Kubernetes event for the current node and with the given event type, reason and message -func (r K8sEventRecorder) Emit(eventType, eventReason, eventMsgFmt string, eventMsgArgs ...interface{}) { +// Emit a Kubernetes event for the given node and with the given event type, reason and message +func (r K8sEventRecorder) Emit(nodeName string, eventType, eventReason, eventMsgFmt string, eventMsgArgs ...interface{}) { if r.enabled { - r.AnnotatedEventf(r.node, r.annotations, eventType, eventReason, eventMsgFmt, eventMsgArgs...) 
+        node := &corev1.ObjectReference{
+            Kind:      "Node",
+            Name:      nodeName,
+            Namespace: "default",
+        }
+        r.AnnotatedEventf(node, r.annotations, eventType, eventReason, eventMsgFmt, eventMsgArgs...)
     }
 }
diff --git a/test/e2e/spot-interruption-test-events-on b/test/e2e/spot-interruption-test-events-on
index 42dca1c4..2611a11f 100755
--- a/test/e2e/spot-interruption-test-events-on
+++ b/test/e2e/spot-interruption-test-events-on
@@ -152,13 +152,11 @@ for i in `seq 1 $TAINT_CHECK_CYCLES`; do
     extraannotationnotfound=""
     events=$(kubectl get events --field-selector source=aws-node-termination-handler -o json)
     for reason in SpotInterruption PreDrain CordonAndDrain; do
-        set +e
-        event=$(echo $events | jq -e --arg REASON "$reason" '[.items[] | select(.reason==$REASON)][0]')
-        if [[ $? -ne 0 ]]; then
+        event=$(echo $events | jq --arg REASON "$reason" '[.items[] | select(.reason==$REASON)][0]')
+        if [[ $event == "null" ]]; then
             eventnotfound=$reason
             break
         fi
-        set -e
         for ant in account-id availability-zone instance-id instance-type local-hostname local-ipv4 public-hostname public-ipv4 region; do
             if [[ "$(echo $event | jq -r --arg ANT "$ant" '.metadata.annotations[$ANT]')" == "null" ]]; then
                 eventnotfound=$reason

From e4796ab7a68e6b30d4badfee9ebad722a2107d99 Mon Sep 17 00:00:00 2001
From: Roger Torrentsgeneros
Date: Fri, 23 Apr 2021 22:30:14 +0200
Subject: [PATCH 07/17] docs: elevate note about queue processor mode

---
 docs/kubernetes_events.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/docs/kubernetes_events.md b/docs/kubernetes_events.md
index 504a7d21..11326e3f 100644
--- a/docs/kubernetes_events.md
+++ b/docs/kubernetes_events.md
@@ -45,6 +45,8 @@
 If `emit-kubernetes-events` is enabled, AWS Node Termination Handler will automatically inject a set of annotations to each event it emits. Such annotations are gathered from the underlying host's IMDS endpoint and enrich each event with information about the host that emitted it.

+_**NOTE**: In Queue Processor mode, these annotations will reflect the node running NTH not the node receiving the events. See [Caveats](https://github.com/aws/aws-node-termination-handler/blob/main/docs/kubernetes_events.md#caveats) for more information._
+
 The default annotations are:

 Name | Example value
@@ -63,7 +65,7 @@ If `kubernetes-events-extra-annotations` are specified they will be appended to

 ## How to get events

-All events belong to Kubernetes `Node` objects so they belong in the `default` namespace. The event source is `aws-node-termination-handler`. From command line, use `kubectl` to get the events as follows:
+All events are about Kubernetes `Node` objects so they belong in the `default` namespace. The event source is `aws-node-termination-handler`. From command line, use `kubectl` to get the events as follows:

 ```sh
 kubectl get events --field-selector "source=aws-node-termination-handler"
@@ -83,4 +85,4 @@
 Results can also be printed out in JSON or YAML format and piped to processors like `jq` or `yq`. Then, the above annotations can also be used for discovery and filtering.

 Default annotations values are gathered from the IMDS endpoint local to the Node on which AWS Node Termination Handler runs. This is fine when running on IMDS Processor Mode since an AWS Node Termination Handler Pod will be deployed to all Nodes via a `DaemonSet` and each Node will emit all events related to itself with its own default annotations.

-However, when running in Queue Processor Mode AWS Node Termination Handler is deployed to a number of Nodes (1 replica by default) since it's done via a `Deployment`. In that case the default annotations values will be gathered from the Node(s) running AWS Node Termination Handler, and so the values in the default annotations stamped to all events will match those of the Node from which the event was emitted, not those of the Node of which the event is about.
+However, when running in Queue Processor Mode AWS Node Termination Handler is deployed to a number of Nodes (1 replica by default) via a `Deployment`. In that case the default annotations values will be gathered from the Node(s) running AWS Node Termination Handler, and so the values in the default annotations stamped to all events will match those of the Node from which the event was emitted, not those of the Node of which the event is about.

From a1b235820fa7c9f8543dac89b93a217ff6a8243d Mon Sep 17 00:00:00 2001
From: Roger Torrentsgeneros
Date: Fri, 23 Apr 2021 23:08:32 +0200
Subject: [PATCH 08/17] chore: add instance-life-cycle

---
 README.md                                 | 21 +++++++++++++++++----
 docs/kubernetes_events.md                 |  1 +
 pkg/ec2metadata/ec2metadata.go            | 22 +++++++++++++---------
 pkg/observability/k8s-events.go           |  1 +
 test/e2e/spot-interruption-test-events-on |  2 +-
 5 files changed, 33 insertions(+), 14 deletions(-)

diff --git a/README.md b/README.md
index 4cc2b374..72fec58b 100644
--- a/README.md
+++ b/README.md
@@ -102,6 +102,7 @@ The termination handler DaemonSet installs into your cluster a [ServiceAccount](
 #### Kubectl Apply

 You can use kubectl to directly add all of the above resources with the default configuration into your cluster.
+
 ```
 kubectl apply -f https://github.com/aws/aws-node-termination-handler/releases/download/v1.13.0/all-resources.yaml
 ```
@@ -121,6 +122,7 @@ helm repo add eks https://aws.github.io/eks-charts
 Once that is complete you can install the termination handler. We've provided some sample setup options below.

 Zero Config:
+
 ```sh
 helm upgrade --install aws-node-termination-handler \
   --namespace kube-system \
@@ -128,6 +130,7 @@ helm upgrade --install aws-node-termination-handler \
 ```

 Enabling Features:
+
 ```
 helm upgrade --install aws-node-termination-handler \
   --namespace kube-system \
@@ -140,6 +143,7 @@ helm upgrade --install aws-node-termination-handler \
 The `enable*` configuration flags above enable or disable IMDS monitoring paths.

 Running Only On Specific Nodes:
+
 ```
 helm upgrade --install aws-node-termination-handler \
   --namespace kube-system \
@@ -148,6 +152,7 @@ helm upgrade --install aws-node-termination-handler \
 ```

 Webhook Configuration:
+
 ```
 helm upgrade --install aws-node-termination-handler \
   --namespace kube-system \
@@ -156,6 +161,7 @@ helm upgrade --install aws-node-termination-handler \
 ```

 Alternatively, pass Webhook URL as a Secret:
+
 ```
 WEBHOOKURL_LITERAL="webhookurl=https://hooks.slack.com/services/YOUR/SLACK/URL"
@@ -217,11 +223,9 @@ However, if your account is dedicated to ASGs for your kubernetes cluster, then

 You can also control what resources NTH manages by adding the resource ARNs to your Amazon EventBridge rules.

-Take a look at the docs on how to create rules that only manage certain ASGs here: https://docs.aws.amazon.com/autoscaling/ec2/userguide/cloud-watch-events.html
-
-See all the different events docs here: https://docs.aws.amazon.com/eventbridge/latest/userguide/event-types.html#auto-scaling-event-types
-
+Take a look at the docs on how to create rules that only manage certain ASGs [here](https://docs.aws.amazon.com/autoscaling/ec2/userguide/cloud-watch-events.html).

+See all the different events docs [here](https://docs.aws.amazon.com/eventbridge/latest/userguide/event-types.html#auto-scaling-event-types).

 #### 3. Create an SQS Queue:
@@ -298,6 +302,7 @@ There are many different ways to allow the aws-node-termination-handler pods to
 4. [kube2iam](https://github.com/jtblin/kube2iam)

 IAM Policy for aws-node-termination-handler Deployment:
+
 ```
 {
     "Version": "2012-10-17",
@@ -333,6 +338,7 @@ helm repo add eks https://aws.github.io/eks-charts
 Once that is complete you can install the termination handler. We've provided some sample setup options below.

 Minimal Config:
+
 ```sh
 helm upgrade --install aws-node-termination-handler \
   --namespace kube-system \
@@ -342,6 +348,7 @@ helm upgrade --install aws-node-termination-handler \
 ```

 Webhook Configuration:
+
 ```
 helm upgrade --install aws-node-termination-handler \
   --namespace kube-system \
@@ -352,6 +359,7 @@ helm upgrade --install aws-node-termination-handler \
 ```

 Alternatively, pass Webhook URL as a Secret:
+
 ```
 WEBHOOKURL_LITERAL="webhookurl=https://hooks.slack.com/services/YOUR/SLACK/URL"
@@ -397,22 +405,26 @@ To use the termination handler alongside [Kiam](https://github.com/uswitch/kiam)
 By default Kiam will block all access to the metadata address, so you need to make sure it passes through the requests the termination handler relies on.

 To add a whitelist configuration, use the following fields in the Kiam Helm chart values:
+
 ```
 agent.whiteListRouteRegexp: '^\/latest\/meta-data\/(spot\/instance-action|events\/maintenance\/scheduled|instance-(id|type)|public-(hostname|ipv4)|local-(hostname|ipv4)|placement\/availability-zone)|\/latest\/dynamic\/instance-identity\/document$'
 ```

 Or just pass it as an argument to the kiam agents:
+
 ```
 kiam agent --whitelist-route-regexp='^\/latest\/meta-data\/(spot\/instance-action|events\/maintenance\/scheduled|instance-(id|type)|public-(hostname|ipv4)|local-(hostname|ipv4)|placement\/availability-zone)|\/latest\/dynamic\/instance-identity\/document$'
 ```

 ## Metadata endpoints

 The termination handler relies on the following metadata endpoints to function properly:
+
 ```
 /latest/dynamic/instance-identity/document
 /latest/meta-data/spot/instance-action
 /latest/meta-data/events/recommendations/rebalance
 /latest/meta-data/events/maintenance/scheduled
 /latest/meta-data/instance-id
+/latest/meta-data/instance-life-cycle
 /latest/meta-data/instance-type
 /latest/meta-data/public-hostname
 /latest/meta-data/public-ipv4
@@ -420,6 +432,7 @@ The termination handler relies on the following metadata endpoints to function p
 /latest/meta-data/local-ipv4
 /latest/meta-data/placement/availability-zone
 ```
+

 ## Building
diff --git a/docs/kubernetes_events.md b/docs/kubernetes_events.md
index 11326e3f..1865d85e 100644
--- a/docs/kubernetes_events.md
+++ b/docs/kubernetes_events.md
@@ -54,6 +54,7 @@ Name | Example value
 `account-id` | `123456789012`
 `availability-zone` | `us-west-2a`
 `instance-id` | `i-abcdef12345678901`
+`instance-life-cycle` | `spot`
 `instance-type` | `m5.8xlarge`
 `local-hostname` | `ip-10-1-2-3.us-west-2.compute.internal`
 `local-ipv4` | `10.1.2.3`
diff --git a/pkg/ec2metadata/ec2metadata.go b/pkg/ec2metadata/ec2metadata.go
index c83ec757..f41b777d 100644
--- a/pkg/ec2metadata/ec2metadata.go
+++ b/pkg/ec2metadata/ec2metadata.go
@@ -36,6 +36,8 @@ const (
     RebalanceRecommendationPath = "/latest/meta-data/events/recommendations/rebalance"
     // InstanceIDPath path to instance id
     InstanceIDPath = "/latest/meta-data/instance-id"
+    // InstanceLifeCycle path to instance life cycle
+    InstanceLifeCycle = "/latest/meta-data/instance-life-cycle"
     // InstanceTypePath path to instance type
     InstanceTypePath = "/latest/meta-data/instance-type"
     // PublicHostnamePath path to public hostname
@@ -104,15 +106,16 @@ type RebalanceRecommendation struct {

 // NodeMetadata contains information that applies to every drain event
 type NodeMetadata struct {
-    AccountId        string `json:"accountId"`
-    InstanceID       string `json:"instanceId"`
-    InstanceType     string `json:"instanceType"`
-    PublicHostname   string `json:"publicHostname"`
-    PublicIP         string `json:"publicIp"`
-    LocalHostname    string `json:"localHostname"`
-    LocalIP          string `json:"privateIp"`
-    AvailabilityZone string `json:"availabilityZone"`
-    Region           string `json:"region"`
+    AccountId         string `json:"accountId"`
+    InstanceID        string `json:"instanceId"`
+    InstanceLifeCycle string `json:"instanceLifeCycle"`
+    InstanceType      string `json:"instanceType"`
+    PublicHostname    string `json:"publicHostname"`
+    PublicIP          string `json:"publicIp"`
+    LocalHostname     string `json:"localHostname"`
+    LocalIP           string `json:"privateIp"`
+    AvailabilityZone  string `json:"availabilityZone"`
+    Region            string `json:"region"`
 }

 // New constructs an instance of the Service client
@@ -328,6 +331,7 @@ func (e *Service) GetNodeMetadata() NodeMetadata {
     if err != nil {
         log.Warn().Msg("Unable to fetch instance identity document from ec2 metadata")
         metadata.InstanceID, _ = e.GetMetadataInfo(InstanceIDPath)
+        metadata.InstanceLifeCycle, _ = e.GetMetadataInfo(InstanceLifeCycle)
         metadata.InstanceType, _ = e.GetMetadataInfo(InstanceTypePath)
         metadata.LocalIP, _ = e.GetMetadataInfo(LocalIPPath)
         metadata.AvailabilityZone, _ = e.GetMetadataInfo(AZPlacementPath)
diff --git a/pkg/observability/k8s-events.go b/pkg/observability/k8s-events.go
index 372cd415..36b7fe9d 100644
--- a/pkg/observability/k8s-events.go
+++ b/pkg/observability/k8s-events.go
@@ -84,6 +84,7 @@ func InitK8sEventRecorder(enabled bool, nodeName string, nodeMetadata ec2metadat
     annotations["account-id"] = nodeMetadata.AccountId
     annotations["availability-zone"] = nodeMetadata.AvailabilityZone
     annotations["instance-id"] = nodeMetadata.InstanceID
+    annotations["instance-life-cycle"] = nodeMetadata.InstanceLifeCycle
     annotations["instance-type"] = nodeMetadata.InstanceType
     annotations["local-hostname"] = nodeMetadata.LocalHostname
     annotations["local-ipv4"] = nodeMetadata.LocalIP
diff --git a/test/e2e/spot-interruption-test-events-on b/test/e2e/spot-interruption-test-events-on
index 2611a11f..ac1b0334 100755
--- a/test/e2e/spot-interruption-test-events-on
+++ b/test/e2e/spot-interruption-test-events-on
@@ -157,7 +157,7 @@ for i in `seq 1 $TAINT_CHECK_CYCLES`; do
             eventnotfound=$reason
             break
         fi
-        for ant in account-id availability-zone instance-id instance-type local-hostname local-ipv4 public-hostname public-ipv4 region; do
+        for ant in account-id availability-zone instance-id instance-life-cycle instance-type local-hostname local-ipv4 public-hostname public-ipv4 region; do
             if [[ "$(echo $event | jq -r --arg ANT "$ant" '.metadata.annotations[$ANT]')" == "null" ]]; then
                 eventnotfound=$reason
                 annotationnotfound=$ant

From 33680e09a4e740087c3b7d0e95d41d3fe05e438b Mon Sep 17 00:00:00 2001
From: Roger Torrentsgeneros
Date: Mon, 26 Apr 2021 00:00:52 +0200
Subject: [PATCH 09/17] chore: instance lifecycle is not in identity doc

---
 pkg/ec2metadata/ec2metadata.go | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/pkg/ec2metadata/ec2metadata.go b/pkg/ec2metadata/ec2metadata.go
index f41b777d..12ddd330 100644
--- a/pkg/ec2metadata/ec2metadata.go
+++ b/pkg/ec2metadata/ec2metadata.go
@@ -331,7 +331,6 @@ func (e *Service) GetNodeMetadata() NodeMetadata {
     if err != nil {
         log.Warn().Msg("Unable to fetch instance identity document from ec2 metadata")
         metadata.InstanceID, _ = e.GetMetadataInfo(InstanceIDPath)
-        metadata.InstanceLifeCycle, _ = e.GetMetadataInfo(InstanceLifeCycle)
         metadata.InstanceType, _ = e.GetMetadataInfo(InstanceTypePath)
         metadata.LocalIP, _ = e.GetMetadataInfo(LocalIPPath)
         metadata.AvailabilityZone, _ = e.GetMetadataInfo(AZPlacementPath)
@@ -339,9 +338,10 @@ func (e *Service) GetNodeMetadata() NodeMetadata {
             metadata.Region = metadata.AvailabilityZone[0 : len(metadata.AvailabilityZone)-1]
         }
     }
+    metadata.InstanceLifeCycle, _ = e.GetMetadataInfo(InstanceLifeCycle)
+    metadata.LocalHostname, _ = e.GetMetadataInfo(LocalHostnamePath)
     metadata.PublicHostname, _ = e.GetMetadataInfo(PublicHostnamePath)
     metadata.PublicIP, _ = e.GetMetadataInfo(PublicIPPath)
-    metadata.LocalHostname, _ = e.GetMetadataInfo(LocalHostnamePath)

     log.Info().Interface("metadata", metadata).Msg("Startup Metadata Retrieved")

From 4fdaf5d3c6d05ea01b649910da581f71bd7cf317 Mon Sep 17 00:00:00 2001
From: Roger Torrentsgeneros
Date: Mon, 26 Apr 2021 11:57:28 +0200
Subject: [PATCH 10/17] chore: appease goreportcard

---
 pkg/ec2metadata/ec2metadata_internal_test.go           | 1 +
 pkg/interruptioneventstore/interruption-event-store.go | 3 +--
 pkg/monitor/sqsevent/ec2-state-change-event.go         | 4 ++++
 pkg/monitor/sqsevent/rebalance-recommendation-event.go | 3 +++
 pkg/monitor/sqsevent/spot-itn-event.go                 | 4 ++++
 pkg/monitor/types.go                                   | 3 ++-
 pkg/monitor/types_test.go                              | 2 +-
 7 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/pkg/ec2metadata/ec2metadata_internal_test.go b/pkg/ec2metadata/ec2metadata_internal_test.go
index 3a862a3a..1a38faf5 100644
--- a/pkg/ec2metadata/ec2metadata_internal_test.go
+++ b/pkg/ec2metadata/ec2metadata_internal_test.go
@@ -41,6 +41,7 @@ func TestRetry(t *testing.T) {
     }

     resp, err := retry(numRetries, time.Microsecond, request)
+    h.Assert(t, err != nil, "Should have gotten a \"Request failed\" error")
     defer resp.Body.Close()

     h.Equals(t, errorMsg, err.Error())
diff --git a/pkg/interruptioneventstore/interruption-event-store.go b/pkg/interruptioneventstore/interruption-event-store.go
index fe9d73a1..b1ea6c36 100644
--- a/pkg/interruptioneventstore/interruption-event-store.go
+++ b/pkg/interruptioneventstore/interruption-event-store.go
@@ -66,7 +66,6 @@ func (s *Store) AddInterruptionEvent(interruptionEvent *monitor.InterruptionEven
     if _, ignored := s.ignoredEvents[interruptionEvent.EventID]; !ignored {
         s.atLeastOneEvent = true
     }
-    return
 }

 // GetActiveEvent returns true if there are interruption events in the internal store
@@ -105,7 +104,7 @@ func (s *Store) shouldEventDrain(interruptionEvent *monitor.InterruptionEvent) b
 func (s *Store) TimeUntilDrain(interruptionEvent *monitor.InterruptionEvent) time.Duration {
     nodeTerminationGracePeriod := time.Duration(s.NthConfig.NodeTerminationGracePeriod) * time.Second
     drainTime := interruptionEvent.StartTime.Add(-1 * nodeTerminationGracePeriod)
-    return drainTime.Sub(time.Now())
+    return time.Until(drainTime)
 }

 // MarkAllAsDrained should be called after the node has been drained to prevent further unnecessary drain calls to the k8s api
diff --git a/pkg/monitor/sqsevent/ec2-state-change-event.go b/pkg/monitor/sqsevent/ec2-state-change-event.go
index 9fa37c8d..0aab24e6 100644
--- a/pkg/monitor/sqsevent/ec2-state-change-event.go
+++ b/pkg/monitor/sqsevent/ec2-state-change-event.go
@@ -75,6 +75,10 @@ func (m SQSMonitor) ec2StateChangeToInterruptionEvent(event EventBridgeEvent, me
         InstanceID:  ec2StateChangeDetail.InstanceID,
         Description: fmt.Sprintf("EC2 State Change event received. Instance went into %s at %s \n", ec2StateChangeDetail.State, event.getTime()),
     }
+    if err != nil {
+        return monitor.InterruptionEvent{}, err
+    }
+
     interruptionEvent.PostDrainTask = func(interruptionEvent monitor.InterruptionEvent, n node.Node) error {
         errs := m.deleteMessages([]*sqs.Message{message})
         if errs != nil {
diff --git a/pkg/monitor/sqsevent/rebalance-recommendation-event.go b/pkg/monitor/sqsevent/rebalance-recommendation-event.go
index ad06f58c..cbf417da 100644
--- a/pkg/monitor/sqsevent/rebalance-recommendation-event.go
+++ b/pkg/monitor/sqsevent/rebalance-recommendation-event.go
@@ -58,6 +58,9 @@ func (m SQSMonitor) rebalanceRecommendationToInterruptionEvent(event EventBridge
         return monitor.InterruptionEvent{}, err
     }
     asgName, err := m.retrieveAutoScalingGroupName(rebalanceRecDetail.InstanceID)
+    if err != nil {
+        return monitor.InterruptionEvent{}, err
+    }

     interruptionEvent := monitor.InterruptionEvent{
         EventID: fmt.Sprintf("rebalance-recommendation-event-%x", event.ID),
diff --git a/pkg/monitor/sqsevent/spot-itn-event.go b/pkg/monitor/sqsevent/spot-itn-event.go
index d5b25fec..a6a173b5 100644
--- a/pkg/monitor/sqsevent/spot-itn-event.go
+++ b/pkg/monitor/sqsevent/spot-itn-event.go
@@ -60,6 +60,10 @@ func (m SQSMonitor) spotITNTerminationToInterruptionEvent(event EventBridgeEvent
         return monitor.InterruptionEvent{}, err
     }
     asgName, err := m.retrieveAutoScalingGroupName(spotInterruptionDetail.InstanceID)
+    if err != nil {
+        return monitor.InterruptionEvent{}, err
+    }
+
     interruptionEvent := monitor.InterruptionEvent{
         EventID: fmt.Sprintf("spot-itn-event-%x", event.ID),
         Kind:    SQSTerminateKind,
diff --git a/pkg/monitor/types.go b/pkg/monitor/types.go
index 5e52ff95..81f2611a 100644
--- a/pkg/monitor/types.go
+++ b/pkg/monitor/types.go
@@ -20,6 +20,7 @@ import (
     "github.com/aws/aws-node-termination-handler/pkg/node"
 )

+// DrainTask defines a task to be run when draining a node
 type DrainTask func(InterruptionEvent, node.Node) error

 // InterruptionEvent gives more context of the interruption event
@@ -43,7 +44,7 @@ type InterruptionEvent struct {

 // TimeUntilEvent returns the duration until the event start time
 func (e *InterruptionEvent) TimeUntilEvent() time.Duration {
-    return e.StartTime.Sub(time.Now())
+    return time.Until(e.StartTime)
 }

 // IsRebalanceRecommendation returns true if the interruption event is a rebalance recommendation
diff --git a/pkg/monitor/types_test.go b/pkg/monitor/types_test.go
index b2692850..5e437367 100644
--- a/pkg/monitor/types_test.go
+++ b/pkg/monitor/types_test.go
@@ -23,7 +23,7 @@ import (

 func TestTimeUntilEvent(t *testing.T) {
     startTime := time.Now().Add(time.Second * 10)
-    expected := startTime.Sub(time.Now()).Round(time.Second)
+    expected := time.Until(startTime).Round(time.Second)

     event := &monitor.InterruptionEvent{
         StartTime: startTime,

From b8c17d88d07dde3f5e66500f8ac6ae297444bf45 Mon Sep 17 00:00:00 2001
From: Roger Torrentsgeneros
Date: Mon, 26 Apr 2021 17:01:11 +0200
Subject: [PATCH 11/17] chore: revert and swallow error again

---
 pkg/monitor/sqsevent/ec2-state-change-event.go         | 5 +----
 pkg/monitor/sqsevent/rebalance-recommendation-event.go | 6 +-----
 pkg/monitor/sqsevent/spot-itn-event.go                 | 6 +-----
 3 files changed, 3 insertions(+), 14 deletions(-)

diff --git a/pkg/monitor/sqsevent/ec2-state-change-event.go b/pkg/monitor/sqsevent/ec2-state-change-event.go
index 0aab24e6..51da3df8 100644
--- a/pkg/monitor/sqsevent/ec2-state-change-event.go
+++ b/pkg/monitor/sqsevent/ec2-state-change-event.go
@@ -65,7 +65,7 @@ func (m SQSMonitor) ec2StateChangeToInterruptionEvent(event EventBridgeEvent, me
     if err != nil {
         return monitor.InterruptionEvent{}, err
     }
-    asgName, err := m.retrieveAutoScalingGroupName(ec2StateChangeDetail.InstanceID)
+    asgName, _ := m.retrieveAutoScalingGroupName(ec2StateChangeDetail.InstanceID)
     interruptionEvent := monitor.InterruptionEvent{
         EventID: fmt.Sprintf("ec2-state-change-event-%x", event.ID),
         Kind:    SQSTerminateKind,
@@ -75,9 +75,6 @@ func (m SQSMonitor) ec2StateChangeToInterruptionEvent(event EventBridgeEvent, me
         InstanceID:  ec2StateChangeDetail.InstanceID,
         Description: fmt.Sprintf("EC2 State Change event received. Instance went into %s at %s \n", ec2StateChangeDetail.State, event.getTime()),
     }
-    if err != nil {
-        return monitor.InterruptionEvent{}, err
-    }

     interruptionEvent.PostDrainTask = func(interruptionEvent monitor.InterruptionEvent, n node.Node) error {
         errs := m.deleteMessages([]*sqs.Message{message})
diff --git a/pkg/monitor/sqsevent/rebalance-recommendation-event.go b/pkg/monitor/sqsevent/rebalance-recommendation-event.go
index cbf417da..4493c29d 100644
--- a/pkg/monitor/sqsevent/rebalance-recommendation-event.go
+++ b/pkg/monitor/sqsevent/rebalance-recommendation-event.go
@@ -57,11 +57,7 @@ func (m SQSMonitor) rebalanceRecommendationToInterruptionEvent(event EventBridge
     if err != nil {
         return monitor.InterruptionEvent{}, err
     }
-    asgName, err := m.retrieveAutoScalingGroupName(rebalanceRecDetail.InstanceID)
-    if err != nil {
-        return monitor.InterruptionEvent{}, err
-    }
-
+    asgName, _ := m.retrieveAutoScalingGroupName(rebalanceRecDetail.InstanceID)
     interruptionEvent := monitor.InterruptionEvent{
         EventID: fmt.Sprintf("rebalance-recommendation-event-%x", event.ID),
         Kind:    SQSTerminateKind,
diff --git a/pkg/monitor/sqsevent/spot-itn-event.go b/pkg/monitor/sqsevent/spot-itn-event.go
index a6a173b5..dabc900e 100644
--- a/pkg/monitor/sqsevent/spot-itn-event.go
+++ b/pkg/monitor/sqsevent/spot-itn-event.go
@@ -59,11 +59,7 @@ func (m SQSMonitor) spotITNTerminationToInterruptionEvent(event EventBridgeEvent
     if err != nil {
         return monitor.InterruptionEvent{}, err
     }
-    asgName, err := m.retrieveAutoScalingGroupName(spotInterruptionDetail.InstanceID)
-    if err != nil {
-        return monitor.InterruptionEvent{}, err
-    }
-
+    asgName, _ := m.retrieveAutoScalingGroupName(spotInterruptionDetail.InstanceID)
     interruptionEvent := monitor.InterruptionEvent{
         EventID: fmt.Sprintf("spot-itn-event-%x", event.ID),
         Kind:    SQSTerminateKind,

From b23a97725ce406698ce81f6c7df0dbf60d6894b6 Mon Sep 17 00:00:00 2001
From: Roger Torrentsgeneros
Date: Mon, 26 Apr 2021 17:10:41 +0200
Subject: [PATCH 12/17] refactor: reduce cyclomatic complexity

---
 cmd/node-termination-handler.go | 119 ++++++++++++++++++--------------
 pkg/config/config.go            |  19 +++++----
 2 files changed, 80 insertions(+), 58 deletions(-)

diff --git a/cmd/node-termination-handler.go b/cmd/node-termination-handler.go
index 91c93caa..be984bc0 100644
--- a/cmd/node-termination-handler.go
+++ b/cmd/node-termination-handler.go
@@ -296,56 +296,13 @@ func drainOrCordonIfNecessary(interruptionEventStore *interruptioneventstore.Sto
     }
     drainEvent.NodeLabels = nodeLabels
     if drainEvent.PreDrainTask != nil {
-        err := drainEvent.PreDrainTask(*drainEvent, node)
-        if err != nil {
-            log.Err(err).Msg("There was a problem executing the pre-drain task")
-            recorder.Emit(nodeName, observability.Warning, observability.PreDrainErrReason, observability.PreDrainErrMsgFmt, err.Error())
-        } else {
-            recorder.Emit(nodeName, observability.Normal, observability.PreDrainReason, observability.PreDrainMsg)
-        }
-        metrics.NodeActionsInc("pre-drain", nodeName, err)
+        runPreDrainTask(node, nodeName, drainEvent, metrics, recorder)
     }

     if nthConfig.CordonOnly || (drainEvent.IsRebalanceRecommendation() && !nthConfig.EnableRebalanceDraining) {
-        err := node.Cordon(nodeName)
-        if err != nil {
-            if errors.IsNotFound(err) {
-                log.Err(err).Msgf("node '%s' not found in the cluster", nodeName)
-            } else {
-                log.Err(err).Msg("There was a problem while trying to cordon the node")
-                recorder.Emit(nodeName, observability.Warning, observability.CordonErrReason, observability.CordonErrMsgFmt, err.Error())
-                os.Exit(1)
-            }
-        } else {
-            log.Info().Str("node_name", nodeName).Msg("Node successfully cordoned")
-            podNameList, err := node.FetchPodNameList(nodeName)
-            if err != nil {
-                log.Err(err).Msgf("Unable to fetch running pods for node '%s' ", nodeName)
-            }
-            drainEvent.Pods = podNameList
-            err = node.LogPods(podNameList, nodeName)
-            if err != nil {
-                log.Err(err).Msg("There was a problem while trying to log all pod names on the node")
-            }
-            metrics.NodeActionsInc("cordon", nodeName, err)
-            recorder.Emit(nodeName, observability.Normal, observability.CordonReason, observability.CordonMsg)
-        }
+        cordonNode(node, nodeName, drainEvent, metrics, recorder)
     } else {
-        err := node.CordonAndDrain(nodeName)
-        if err != nil {
-            if errors.IsNotFound(err) {
-                log.Err(err).Msgf("node '%s' not found in the cluster", nodeName)
-            } else {
-                log.Err(err).Msg("There was a problem while trying to cordon and drain the node")
-                metrics.NodeActionsInc("cordon-and-drain", nodeName, err)
-                recorder.Emit(nodeName, observability.Warning, observability.CordonAndDrainErrReason, observability.CordonAndDrainErrMsgFmt, err.Error())
-                os.Exit(1)
-            }
-        } else {
-            log.Info().Str("node_name", nodeName).Msg("Node successfully cordoned and drained")
-            metrics.NodeActionsInc("cordon-and-drain", nodeName, err)
-            recorder.Emit(nodeName, observability.Normal, observability.CordonAndDrainReason, observability.CordonAndDrainMsg)
-        }
+        cordonAndDrainNode(node, nodeName, metrics, recorder)
     }

     interruptionEventStore.MarkAllAsDrained(nodeName)
@@ -353,15 +310,73 @@ func drainOrCordonIfNecessary(interruptionEventStore *interruptioneventstore.Sto
         webhook.Post(nodeMetadata, drainEvent, nthConfig)
     }
     if drainEvent.PostDrainTask != nil {
-        err := drainEvent.PostDrainTask(*drainEvent, node)
+        runPostDrainTask(node, nodeName, drainEvent, metrics, recorder)
+    }
+    <-interruptionEventStore.Workers
+
+}
+
+func runPreDrainTask(node node.Node, nodeName string, drainEvent *monitor.InterruptionEvent, metrics observability.Metrics, recorder observability.K8sEventRecorder) {
+    err := drainEvent.PreDrainTask(*drainEvent, node)
+    if err != nil {
+        log.Err(err).Msg("There was a problem executing the pre-drain task")
+        recorder.Emit(nodeName, observability.Warning, observability.PreDrainErrReason, observability.PreDrainErrMsgFmt, err.Error())
+    } else {
+        recorder.Emit(nodeName, observability.Normal, observability.PreDrainReason, observability.PreDrainMsg)
+    }
+    metrics.NodeActionsInc("pre-drain", nodeName, err)
+}
+
+func cordonNode(node node.Node, nodeName string, drainEvent *monitor.InterruptionEvent, metrics observability.Metrics, recorder observability.K8sEventRecorder) {
+    err := node.Cordon(nodeName)
+    if err != nil {
+        if errors.IsNotFound(err) {
+            log.Err(err).Msgf("node '%s' not found in the cluster", nodeName)
+        } else {
+            log.Err(err).Msg("There was a problem while trying to cordon the node")
+            recorder.Emit(nodeName, observability.Warning, observability.CordonErrReason, observability.CordonErrMsgFmt, err.Error())
+            os.Exit(1)
+        }
+    } else {
+        log.Info().Str("node_name", nodeName).Msg("Node successfully cordoned")
+        podNameList, err := node.FetchPodNameList(nodeName)
         if err != nil {
-            log.Err(err).Msg("There was a problem executing the post-drain task")
-            recorder.Emit(nodeName, observability.Warning, observability.PostDrainErrReason, observability.PostDrainErrMsgFmt, err.Error())
+            log.Err(err).Msgf("Unable to fetch running pods for node '%s' ", nodeName)
+        }
+        drainEvent.Pods = podNameList
+        err = node.LogPods(podNameList, nodeName)
+        if err != nil {
+            log.Err(err).Msg("There was a problem while trying to log all pod names on the node")
+        }
+        metrics.NodeActionsInc("cordon", nodeName, err)
+        recorder.Emit(nodeName, observability.Normal, observability.CordonReason, observability.CordonMsg)
+    }
+}
+
+func cordonAndDrainNode(node node.Node, nodeName string, metrics observability.Metrics, recorder observability.K8sEventRecorder) {
+    err := node.CordonAndDrain(nodeName)
+    if err != nil {
+        if errors.IsNotFound(err) {
+            log.Err(err).Msgf("node '%s' not found in the cluster", nodeName)
         } else {
-            recorder.Emit(nodeName, observability.Normal, observability.PostDrainReason, observability.PostDrainMsg)
+            log.Err(err).Msg("There was a problem while trying to cordon and drain the node")
+            metrics.NodeActionsInc("cordon-and-drain", nodeName, err)
+            recorder.Emit(nodeName, observability.Warning, observability.CordonAndDrainErrReason, observability.CordonAndDrainErrMsgFmt, err.Error())
+            os.Exit(1)
         }
-        metrics.NodeActionsInc("post-drain", nodeName, err)
+    } else {
+        log.Info().Str("node_name", nodeName).Msg("Node successfully cordoned and drained")
+        metrics.NodeActionsInc("cordon-and-drain", nodeName, err)
+        recorder.Emit(nodeName, observability.Normal, observability.CordonAndDrainReason, observability.CordonAndDrainMsg)
     }
-    <-interruptionEventStore.Workers
+}

+func runPostDrainTask(node node.Node, nodeName string, drainEvent *monitor.InterruptionEvent, metrics observability.Metrics, recorder observability.K8sEventRecorder) {
+    err := drainEvent.PostDrainTask(*drainEvent, node)
+    if err != nil {
+        log.Err(err).Msg("There was a problem executing the post-drain task")
+        recorder.Emit(nodeName, observability.Warning, observability.PostDrainErrReason, observability.PostDrainErrMsgFmt, err.Error())
+    } else {
+        recorder.Emit(nodeName, observability.Normal, observability.PostDrainReason, observability.PostDrainMsg)
+    }
+    metrics.NodeActionsInc("post-drain", nodeName, err)
 }
diff --git a/pkg/config/config.go b/pkg/config/config.go
index 7df5e2d0..e61a4c7b 100644
--- a/pkg/config/config.go
+++ b/pkg/config/config.go
@@ -221,12 +221,8 @@ func ParseCliArgs() (config Config, err error) {
         config.PodTerminationGracePeriod = gracePeriod
     }

-    switch strings.ToLower(config.LogLevel) {
-    case "info":
-    case "debug":
-    case "error":
-    default:
-        return config, fmt.Errorf("Invalid log-level passed: %s Should be one of: info, debug, error", config.LogLevel)
+    if err := validateLogLevel(strings.ToLower(config.LogLevel)); err != nil {
+        return config, err
     }

     if config.NodeName == "" {
@@ -424,3 +420,14 @@ func getRegionFromQueueURL(queueURL string) string {
     }
     return ""
 }
+
+func validateLogLevel(logLevel string) error {
+    switch logLevel {
+    case "info":
+    case "debug":
+    case "error":
+    default:
+        return fmt.Errorf("Invalid log-level passed: %s Should be one of: info, debug, error", logLevel)
+    }
+    return nil
+}

From 2124b70c8c43597fe19fc2e86e9f61db83c0d7ff Mon Sep 17 00:00:00 2001
From: Roger Torrentsgeneros
Date: Mon, 26 Apr 2021 18:39:08 +0200
Subject: [PATCH 13/17] chore: add instance ID to SQS events descriptions

---
 pkg/monitor/sqsevent/ec2-state-change-event.go         | 2 +-
 pkg/monitor/sqsevent/rebalance-recommendation-event.go | 2 +-
 pkg/monitor/sqsevent/spot-itn-event.go                 | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/pkg/monitor/sqsevent/ec2-state-change-event.go b/pkg/monitor/sqsevent/ec2-state-change-event.go
index 51da3df8..943180f8 100644
--- a/pkg/monitor/sqsevent/ec2-state-change-event.go
+++ b/pkg/monitor/sqsevent/ec2-state-change-event.go
@@ -73,7 +73,7 @@ func (m SQSMonitor) ec2StateChangeToInterruptionEvent(event EventBridgeEvent, me
         NodeName:    nodeName,
         AutoScalingGroupName: asgName,
         InstanceID:  ec2StateChangeDetail.InstanceID,
-        Description: fmt.Sprintf("EC2 State Change event received. Instance went into %s at %s \n", ec2StateChangeDetail.State, event.getTime()),
+        Description: fmt.Sprintf("EC2 State Change event received. Instance %s went into %s at %s \n", ec2StateChangeDetail.InstanceID, ec2StateChangeDetail.State, event.getTime()),
     }

     interruptionEvent.PostDrainTask = func(interruptionEvent monitor.InterruptionEvent, n node.Node) error {
diff --git a/pkg/monitor/sqsevent/rebalance-recommendation-event.go b/pkg/monitor/sqsevent/rebalance-recommendation-event.go
index 4493c29d..031b6561 100644
--- a/pkg/monitor/sqsevent/rebalance-recommendation-event.go
+++ b/pkg/monitor/sqsevent/rebalance-recommendation-event.go
@@ -65,7 +65,7 @@ func (m SQSMonitor) rebalanceRecommendationToInterruptionEvent(event EventBridge
         StartTime:  event.getTime(),
         NodeName:   nodeName,
         InstanceID: rebalanceRecDetail.InstanceID,
-        Description: fmt.Sprintf("Rebalance recommendation event received. Instance will be cordoned at %s \n", event.getTime()),
+        Description: fmt.Sprintf("Rebalance recommendation event received. Instance %s will be cordoned at %s \n", rebalanceRecDetail.InstanceID, event.getTime()),
     }
     interruptionEvent.PostDrainTask = func(interruptionEvent monitor.InterruptionEvent, n node.Node) error {
         errs := m.deleteMessages([]*sqs.Message{message})
diff --git a/pkg/monitor/sqsevent/spot-itn-event.go b/pkg/monitor/sqsevent/spot-itn-event.go
index dabc900e..527ca7d1 100644
--- a/pkg/monitor/sqsevent/spot-itn-event.go
+++ b/pkg/monitor/sqsevent/spot-itn-event.go
@@ -67,7 +67,7 @@ func (m SQSMonitor) spotITNTerminationToInterruptionEvent(event EventBridgeEvent
         StartTime:  event.getTime(),
         NodeName:   nodeName,
         InstanceID: spotInterruptionDetail.InstanceID,
-        Description: fmt.Sprintf("Spot Interruption event received. Instance will be interrupted at %s \n", event.getTime()),
+        Description: fmt.Sprintf("Spot Interruption event received. Instance %s will be interrupted at %s \n", spotInterruptionDetail.InstanceID, event.getTime()),
     }
     interruptionEvent.PostDrainTask = func(interruptionEvent monitor.InterruptionEvent, n node.Node) error {
         errs := m.deleteMessages([]*sqs.Message{message})

From 8c0a2c2d68d94d6515efdc4b419e369a06498f73 Mon Sep 17 00:00:00 2001
From: Roger Torrentsgeneros
Date: Tue, 27 Apr 2021 22:41:12 +0200
Subject: [PATCH 14/17] revert: validateLogLevel()

---
 pkg/config/config.go | 19 ++++++-------------
 1 file changed, 6 insertions(+), 13 deletions(-)

diff --git a/pkg/config/config.go b/pkg/config/config.go
index e61a4c7b..7df5e2d0 100644
--- a/pkg/config/config.go
+++ b/pkg/config/config.go
@@ -221,8 +221,12 @@ func ParseCliArgs() (config Config, err error) {
         config.PodTerminationGracePeriod = gracePeriod
     }

-    if err := validateLogLevel(strings.ToLower(config.LogLevel)); err != nil {
-        return config, err
+    switch strings.ToLower(config.LogLevel) {
+    case "info":
+    case "debug":
+    case "error":
+    default:
+        return config, fmt.Errorf("Invalid log-level passed: %s Should be one of: info, debug, error", config.LogLevel)
     }

     if config.NodeName == "" {
@@ -420,14 +424,3 @@ func getRegionFromQueueURL(queueURL string) string {
     }
     return ""
 }
-
-func validateLogLevel(logLevel string) error {
-    switch logLevel {
-    case "info":
-    case "debug":
-    case "error":
-    default:
-        return fmt.Errorf("Invalid log-level passed: %s Should be one of: info, debug, error", logLevel)
-    }
-    return nil
-}

From a08ac0b9defd32690a44307bbe60459e519075ba Mon Sep 17 00:00:00 2001
From: Roger Torrentsgeneros
Date: Tue, 27 Apr 2021 23:53:29 +0200
Subject: [PATCH 15/17] chore: update licenses

---
 THIRD_PARTY_LICENSES | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/THIRD_PARTY_LICENSES b/THIRD_PARTY_LICENSES
index 03f95e57..be131428 100644
--- a/THIRD_PARTY_LICENSES
+++ b/THIRD_PARTY_LICENSES
@@ -41,6 +41,8 @@ https://github.com/kubernetes/utils
 ** github.com/jmespath/go-jmespath; version v0.3.0 -- https://github.com/jmespath/go-jmespath
 ** github.com/aws/aws-sdk-go; version v1.33.1 -- https://github.com/aws/aws-sdk-go
 ** go.opentelemetry.io/otel/exporters/metric/prometheus; version v0.6.0 -- https://github.com/open-telemetry/opentelemetry-go
+** github.com/golang/groupcache; version v0.0.0-20210331224755-41bb18bfe9da --
+https://github.com/golang/groupcache

 Apache License

From b381503759aed071381e7e68286e7b6877859ab1 Mon Sep 17 00:00:00 2001
From: Roger Torrentsgeneros
Date: Tue, 27 Apr 2021 23:54:00 +0200
Subject: [PATCH 16/17] chore: default annotations in IMDS mode only

---
 cmd/node-termination-handler.go |  2 +-
 docs/kubernetes_events.md       | 22 +++++++---------------
 pkg/observability/k8s-events.go | 24 +++++++++++++-----------
 3 files changed, 21 insertions(+), 27 deletions(-)

diff --git a/cmd/node-termination-handler.go b/cmd/node-termination-handler.go
index be984bc0..79c9ce04 100644
--- a/cmd/node-termination-handler.go
+++ b/cmd/node-termination-handler.go
@@ -114,7 +114,7 @@ func main() {
         log.Fatal().Msgf("Unable to find the AWS region to process queue events.")
     }

-    recorder, err := observability.InitK8sEventRecorder(nthConfig.EmitKubernetesEvents, nthConfig.NodeName, nodeMetadata, nthConfig.KubernetesEventsExtraAnnotations)
+    recorder, err := observability.InitK8sEventRecorder(nthConfig.EmitKubernetesEvents, nthConfig.NodeName, nthConfig.EnableSQSTerminationDraining, nodeMetadata, nthConfig.KubernetesEventsExtraAnnotations)
     if err != nil {
         nthConfig.Print()
         log.Fatal().Err(err).Msg("Unable to create Kubernetes event recorder,")
diff --git a/docs/kubernetes_events.md b/docs/kubernetes_events.md
index 1865d85e..344ad0d1 100644
--- a/docs/kubernetes_events.md
+++ b/docs/kubernetes_events.md
@@ -6,11 +6,11 @@ AWS Node Termination Handler has the ability to emit a Kubernetes event every ti

 There are two relevant parameters:

-* `emit-kubernetes-events`
+* `emit-kubernetes-events`:

-  If true, Kubernetes events will be emitted when interruption events are received and when actions are taken on Kubernetes nodes. Defaults to `false`
+  If true, Kubernetes events will be emitted when interruption events are received and when actions are taken on Kubernetes nodes. Defaults to `false`.

-* `kubernetes-events-extra-annotations`
+* `kubernetes-events-extra-annotations`:

   A comma-separated list of `key=value` extra annotations to attach to all emitted Kubernetes events. Example:
@@ -41,13 +41,13 @@ Node action reasons:
 * `UncordonError`
 * `MonitorError`

-## Default annotations
+## Default IMDS mode annotations

-If `emit-kubernetes-events` is enabled, AWS Node Termination Handler will automatically inject a set of annotations to each event it emits. Such annotations are gathered from the underlying host's IMDS endpoint and enrich each event with information about the host that emitted it.
+If `emit-kubernetes-events` is enabled and `enable-sqs-termination-draining` is disabled (meaning we're operating in IMDS mode), AWS Node Termination Handler will automatically inject a set of annotations to each event it emits. Such annotations are gathered from the underlying host's IMDS endpoint and enrich each event with information about the host that emitted it.

-_**NOTE**: In Queue Processor mode, these annotations will reflect the node running NTH not the node receiving the events. See [Caveats](https://github.com/aws/aws-node-termination-handler/blob/main/docs/kubernetes_events.md#caveats) for more information._
+_**NOTE**: In Queue Processor mode, the default IMDS mode annotations will be disabled but you can still define a set of extra annotations._

-The default annotations are:
+The default IMDS mode annotations are:

 Name | Example value
@@ -79,11 +79,3 @@ kubectl get events --field-selector "reason=SpotInterruption,involvedObject.name
 ```

 Results can also be printed out in JSON or YAML format and piped to processors like `jq` or `yq`. Then, the above annotations can also be used for discovery and filtering.
-
-## Caveats
-
-### Default annotations in Queue Processor Mode
-
-Default annotations values are gathered from the IMDS endpoint local to the Node on which AWS Node Termination Handler runs. This is fine when running on IMDS Processor Mode since an AWS Node Termination Handler Pod will be deployed to all Nodes via a `DaemonSet` and each Node will emit all events related to itself with its own default annotations.
-
-However, when running in Queue Processor Mode AWS Node Termination Handler is deployed to a number of Nodes (1 replica by default) via a `Deployment`. In that case the default annotations values will be gathered from the Node(s) running AWS Node Termination Handler, and so the values in the default annotations stamped to all events will match those of the Node from which the event was emitted, not those of the Node of which the event is about.
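The `jq`/`yq` discovery-and-filtering workflow that the documentation above points at can be exercised roughly as follows. This is a quick sketch: the instance ID is the illustrative value from the annotations table, not anything defined by this patch series.

```sh
# List every event emitted by NTH, showing each event's reason and its attached annotations.
kubectl get events --field-selector "source=aws-node-termination-handler" -o json \
  | jq '.items[] | {reason: .reason, annotations: .metadata.annotations}'

# In IMDS mode, narrow the results to events stamped for one instance
# (Queue Processor mode omits the default IMDS annotations).
kubectl get events --field-selector "source=aws-node-termination-handler" -o json \
  | jq '.items[] | select(.metadata.annotations["instance-id"] == "i-abcdef12345678901")'
```
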
diff --git a/pkg/observability/k8s-events.go b/pkg/observability/k8s-events.go
index 36b7fe9d..60f8cba9 100644
--- a/pkg/observability/k8s-events.go
+++ b/pkg/observability/k8s-events.go
@@ -75,22 +75,24 @@ type K8sEventRecorder struct {
 }

 // InitK8sEventRecorder creates a Kubernetes event recorder
-func InitK8sEventRecorder(enabled bool, nodeName string, nodeMetadata ec2metadata.NodeMetadata, extraAnnotationsStr string) (K8sEventRecorder, error) {
+func InitK8sEventRecorder(enabled bool, nodeName string, sqsMode bool, nodeMetadata ec2metadata.NodeMetadata, extraAnnotationsStr string) (K8sEventRecorder, error) {
     if !enabled {
         return K8sEventRecorder{}, nil
     }

     annotations := make(map[string]string)
-    annotations["account-id"] = nodeMetadata.AccountId
-    annotations["availability-zone"] = nodeMetadata.AvailabilityZone
-    annotations["instance-id"] = nodeMetadata.InstanceID
-    annotations["instance-life-cycle"] = nodeMetadata.InstanceLifeCycle
-    annotations["instance-type"] = nodeMetadata.InstanceType
-    annotations["local-hostname"] = nodeMetadata.LocalHostname
-    annotations["local-ipv4"] = nodeMetadata.LocalIP
-    annotations["public-hostname"] = nodeMetadata.PublicHostname
-    annotations["public-ipv4"] = nodeMetadata.PublicIP
-    annotations["region"] = nodeMetadata.Region
+    if !sqsMode {
+        annotations["account-id"] = nodeMetadata.AccountId
+        annotations["availability-zone"] = nodeMetadata.AvailabilityZone
+        annotations["instance-id"] = nodeMetadata.InstanceID
+        annotations["instance-life-cycle"] = nodeMetadata.InstanceLifeCycle
+        annotations["instance-type"] = nodeMetadata.InstanceType
+        annotations["local-hostname"] = nodeMetadata.LocalHostname
+        annotations["local-ipv4"] = nodeMetadata.LocalIP
+        annotations["public-hostname"] = nodeMetadata.PublicHostname
+        annotations["public-ipv4"] = nodeMetadata.PublicIP
+        annotations["region"] = nodeMetadata.Region
+    }

     var err error
     if extraAnnotationsStr != "" {

From 3fcce5fa7e72ffdd47b3bcd3c58b3c0b80a26e0d Mon Sep 17 00:00:00 2001
From: Roger Torrentsgeneros
Date: Wed, 28 Apr 2021 09:57:39 +0200
Subject: [PATCH 17/17] chore: update docs about IMDS annotations

---
 config/helm/aws-node-termination-handler/README.md   | 8 ++++----
 config/helm/aws-node-termination-handler/values.yaml | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/config/helm/aws-node-termination-handler/README.md b/config/helm/aws-node-termination-handler/README.md
index b8e9abf3..d07ca3c3 100644
--- a/config/helm/aws-node-termination-handler/README.md
+++ b/config/helm/aws-node-termination-handler/README.md
@@ -80,13 +80,13 @@ Parameter | Description | Default
 `enableProbesServer` | If true, start an http server exposing `/healthz` endpoint for probes. | `false`
 `probesServerPort` | Replaces the default HTTP port for exposing probes endpoint. | `8080`
 `probesServerEndpoint` | Replaces the default endpoint for exposing probes endpoint. | `/healthz`
-`podMonitor.create` | if `true`, create a PodMonitor | `false`
+`podMonitor.create` | If `true`, create a PodMonitor | `false`
 `podMonitor.interval` | Prometheus scrape interval | `30s`
 `podMonitor.sampleLimit` | Number of scraped samples accepted | `5000`
 `podMonitor.labels` | Additional PodMonitor metadata labels | `{}`
-`podMonitor.namespace` | override podMonitor Helm release namespace | `{{ .Release.Namespace }}`
-`emitKubernetesEvents` | If true, Kubernetes events will be emitted when interruption events are received and when actions are taken on Kubernetes nodes. A default set of annotations with all the node metadata gathered from IMDS will be attached to each event. More information [here](https://github.com/aws/aws-node-termination-handler/blob/main/docs/kubernetes_events.md) | `false`
-`kubernetesExtraEventsAnnotations` | A comma-separated list of key=value extra annotations to attach to all emitted Kubernetes events. Example: `first=annotation,sample.annotation/number=two"` | None
+`podMonitor.namespace` | Override podMonitor Helm release namespace | `{{ .Release.Namespace }}`
+`emitKubernetesEvents` | If `true`, Kubernetes events will be emitted when interruption events are received and when actions are taken on Kubernetes nodes. In IMDS Processor mode a default set of annotations with all the node metadata gathered from IMDS will be attached to each event. More information [here](https://github.com/aws/aws-node-termination-handler/blob/main/docs/kubernetes_events.md) | `false`
+`kubernetesExtraEventsAnnotations` | A comma-separated list of `key=value` extra annotations to attach to all emitted Kubernetes events. Example: `first=annotation,sample.annotation/number=two"` | None

 ### AWS Node Termination Handler - Queue-Processor Mode Configuration

diff --git a/config/helm/aws-node-termination-handler/values.yaml b/config/helm/aws-node-termination-handler/values.yaml
index d2f47694..6063e3cb 100644
--- a/config/helm/aws-node-termination-handler/values.yaml
+++ b/config/helm/aws-node-termination-handler/values.yaml
@@ -159,7 +159,7 @@ enableProbesServer: false
 probesServerPort: 8080
 probesServerEndpoint: "/healthz"

-# emitKubernetesEvents If true, Kubernetes events will be emitted when interruption events are received and when actions are taken on Kubernetes nodes. A default set of annotations with all the node metadata gathered from IMDS will be attached to each event
+# emitKubernetesEvents If true, Kubernetes events will be emitted when interruption events are received and when actions are taken on Kubernetes nodes. In IMDS Processor mode a default set of annotations with all the node metadata gathered from IMDS will be attached to each event
 emitKubernetesEvents: false

 # kubernetesEventsExtraAnnotations A comma-separated list of key=value extra annotations to attach to all emitted Kubernetes events
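As a closing usage sketch for the two chart settings documented above, they can be combined in a single install. The annotation key and value below are illustrative, and the `kubernetesEventsExtraAnnotations` key from values.yaml is the one consumed by `--set`:

```sh
# Enable event emission and attach one extra annotation to every emitted event.
helm upgrade --install aws-node-termination-handler \
  --namespace kube-system \
  --set emitKubernetesEvents=true \
  --set kubernetesEventsExtraAnnotations="sample.annotation/managed-by=nth" \
  eks/aws-node-termination-handler
```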