
Commit bd090e0

yash97, sushrk, orsenthil, GnatorX, and dependabot[bot] authored
Updating Release 1.6 branch. (#494)
* remove global exclusion for G108,G114 and add nosec in code (#404)

* Update controller_auth_proxy_patch.yaml (#405)

  Update the reference from gcr.io to registry.k8s.io

  > kube-rbac-proxy is moving to registry.k8s.io/kubebuilder/kube-rbac-proxy (from gcr.io/kubebuilder/kube-rbac-proxy) because GCR is being sunset. We need to update these references.

* Fix log which causes panic (#407)

  * Fix log which causes panic
  * Consistent key name
  * consistent naming

* updating ginkgo and gomega

* Bump github.com/prometheus/common from 0.51.1 to 0.53.0

  Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.51.1 to 0.53.0.
  - [Release notes](https://github.com/prometheus/common/releases)
  - [Commits](prometheus/common@v0.51.1...v0.53.0)

  ---
  updated-dependencies:
  - dependency-name: github.com/prometheus/common
    dependency-type: direct:production
    update-type: version-update:semver-minor
  ...

  Signed-off-by: dependabot[bot] <support@github.com>

* Bump github.com/prometheus/client_model from 0.6.0 to 0.6.1 (#432)

  Bumps [github.com/prometheus/client_model](https://github.com/prometheus/client_model) from 0.6.0 to 0.6.1.
  - [Release notes](https://github.com/prometheus/client_model/releases)
  - [Commits](prometheus/client_model@v0.6.0...v0.6.1)

  ---
  updated-dependencies:
  - dependency-name: github.com/prometheus/client_model
    dependency-type: direct:production
    update-type: version-update:semver-patch
  ...

  Signed-off-by: dependabot[bot] <support@github.com>
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump github.com/onsi/ginkgo/v2 from 2.17.2 to 2.19.0 (#431)

  Bumps [github.com/onsi/ginkgo/v2](https://github.com/onsi/ginkgo) from 2.17.2 to 2.19.0.
  - [Release notes](https://github.com/onsi/ginkgo/releases)
  - [Changelog](https://github.com/onsi/ginkgo/blob/master/CHANGELOG.md)
  - [Commits](onsi/ginkgo@v2.17.2...v2.19.0)

  ---
  updated-dependencies:
  - dependency-name: github.com/onsi/ginkgo/v2
    dependency-type: direct:production
    update-type: version-update:semver-minor
  ...

  Signed-off-by: dependabot[bot] <support@github.com>
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* QPS and burst adjustment (#436)

* readme update for events (#453)

* Set controller user-agent to vpc-resource-controller/git-version (#455)

  * update user-agent string.
  * Use AppName instead of ControllerName.

* Add security group pods scale test in ginkgo (#457)

  * Add security group pods scale test in ginkgo
  * Add instructions to run scale tests manually
  * fix typo in README

* Passing page limit to cache config instead of override. (#452)

  * passing page limit to cache config
  * adding error log to optimized list watcher
  * importing vpc pkg

* pods will requeue for reconcile if nodes are not managed and requested eni (#463)

  * pod will requeue for reconcile if nodes are not managed and requested eni
  * log statement change
  * looping through all container for eni requests
  * adding ut for utils function

* add CNINode integration tests (#479)

  * add CNINode integration tests
  * address PR comments
  * updating log statements
  * add retry in VerifyCNINode

* Bump go.uber.org/zap from 1.26.0 to 1.27.0 (#480)

  Bumps [go.uber.org/zap](https://github.com/uber-go/zap) from 1.26.0 to 1.27.0.
  - [Release notes](https://github.com/uber-go/zap/releases)
  - [Changelog](https://github.com/uber-go/zap/blob/master/CHANGELOG.md)
  - [Commits](uber-go/zap@v1.26.0...v1.27.0)

  ---
  updated-dependencies:
  - dependency-name: go.uber.org/zap
    dependency-type: direct:production
    update-type: version-update:semver-minor
  ...

  Signed-off-by: dependabot[bot] <support@github.com>
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* increasing timeout for few integration test (#486)

* Skipping health check on nodes if EC2 returns throttling errors (#485)

* updating limits.go for supported ec2 instance type #491

* Bump github.com/samber/lo from 1.39.0 to 1.47.0 (#481)

  Bumps [github.com/samber/lo](https://github.com/samber/lo) from 1.39.0 to 1.47.0.
  - [Release notes](https://github.com/samber/lo/releases)
  - [Commits](samber/lo@v1.39.0...v1.47.0)

  ---
  updated-dependencies:
  - dependency-name: github.com/samber/lo
    dependency-type: direct:production
    update-type: version-update:semver-minor
  ...

  Signed-off-by: dependabot[bot] <support@github.com>
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Sushmitha Ravikumar <58063229+sushrk@users.noreply.github.com>
Co-authored-by: Senthil Kumaran <senthilx@amazon.com>
Co-authored-by: Garvin Pang <garvinpang@protonmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Hao Zhou <haouc@users.noreply.github.com>
1 parent 88956b9 commit bd090e0


41 files changed, +1420 -172 lines changed

.github/workflows/presubmit.yaml

+1 -1
@@ -67,5 +67,5 @@ jobs:
       - name: Install `gosec`
         run: go install github.com/securego/gosec/v2/cmd/gosec@latest
       - name: Run Gosec Security Scanner
-        run: ~/go/bin/gosec -exclude-dir test -exclude-generated -severity medium -exclude=G108,G114 ./...
+        run: ~/go/bin/gosec -exclude-dir test -exclude-generated -severity medium ./...

README.md

+10 -1
@@ -8,7 +8,16 @@

 Controller running on EKS Control Plane for managing Branch & Trunk Network Interface for [Kubernetes Pod](https://kubernetes.io/docs/concepts/workloads/pods/) using the [Security Group for Pod](https://docs.aws.amazon.com/eks/latest/userguide/security-groups-for-pods.html) feature and IPv4 Address Management(IPAM) of [Windows Nodes](https://docs.aws.amazon.com/eks/latest/userguide/windows-support.html).

-The controller broadcasts its version to nodes. Describing any node will provide the version information in node `Events`. The mapping between the controller's version and the cluster's platform version is also available in release notes.
+The controller broadcasts its version to nodes. Describing any node will provide the version information in node `Events`. The mapping between the controller's version and the cluster's platform version is also available in release notes. Please be aware that kubernetes events last for one hour in general and you may have to check the version information events in newly created nodes.
+
+Version events example:
+```
+Events:
+  Type    Reason                   Age    From                     Message
+  ----    ------                   ----   ----                     -------
+  Normal  ControllerVersionNotice  2m58s  vpc-resource-controller  The node is managed by VPC resource controller version v1.4.9
+  Normal  NodeTrunkInitiated       2m55s  vpc-resource-controller  The node has trunk interface initialized successfully
+```

 ## Security Group for Pods
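The `ControllerVersionNotice` message in the README example carries the controller version as its last word. As a rough illustration of how a script might pull the version out of such a message, here is a minimal sketch. It assumes the message prefix is exactly the one shown in the example above; the constant and function names are hypothetical and not part of the controller.

```go
package main

import (
	"fmt"
	"strings"
)

// versionNoticePrefix is copied from the README example event message.
// Treat it as an assumption: the controller may word the message differently.
const versionNoticePrefix = "The node is managed by VPC resource controller version "

// controllerVersion returns the version carried by a ControllerVersionNotice
// event message, or false when the message does not match the expected prefix.
func controllerVersion(message string) (string, bool) {
	if !strings.HasPrefix(message, versionNoticePrefix) {
		return "", false
	}
	version := strings.TrimSpace(strings.TrimPrefix(message, versionNoticePrefix))
	if version == "" {
		return "", false
	}
	return version, true
}

func main() {
	msg := "The node is managed by VPC resource controller version v1.4.9"
	if v, ok := controllerVersion(msg); ok {
		fmt.Println("controller version:", v)
	}
}
```

Because events expire, a lookup like this is most likely to succeed on a recently created node, as the README text notes.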

config/default/controller_auth_proxy_patch.yaml

+1 -1
@@ -10,7 +10,7 @@ spec:
     spec:
       containers:
       - name: kube-rbac-proxy
-        image: gcr.io/kubebuilder/kube-rbac-proxy:v0.5.0
+        image: registry.k8s.io/kubebuilder/kube-rbac-proxy:v0.5.0
         args:
         - "--secure-listen-address=0.0.0.0:8443"
         - "--upstream=http://127.0.0.1:8080/"

controllers/core/node_controller.go

+6
@@ -168,6 +168,12 @@ func (r *NodeReconciler) Check() healthz.Checker {
 			return nil
 		}

+		if r.Manager.SkipHealthCheck() {
+			// node manager observes EC2 error on processing node, pausing reconciler check to avoid stressing the system
+			r.Log.Info("due to EC2 error, node controller skips node reconciler health check for now")
+			return nil
+		}
+
 		err := rcHealthz.PingWithTimeout(func(c chan<- error) {
 			// when the reconciler is ready, testing the reconciler with a fake node request
 			pingRequest := &ctrl.Request{

controllers/core/pod_controller.go

+6 -1
@@ -27,6 +27,7 @@ import (
 	"github.com/aws/amazon-vpc-resource-controller-k8s/pkg/node"
 	"github.com/aws/amazon-vpc-resource-controller-k8s/pkg/node/manager"
 	"github.com/aws/amazon-vpc-resource-controller-k8s/pkg/resource"
+	"github.com/aws/amazon-vpc-resource-controller-k8s/pkg/utils"
 	"github.com/google/uuid"

 	"github.com/go-logr/logr"
@@ -56,7 +57,7 @@ type PodReconciler struct {

 var (
 	PodRequeueRequest          = ctrl.Result{Requeue: true, RequeueAfter: time.Second}
-	MaxPodConcurrentReconciles = 10
+	MaxPodConcurrentReconciles = 20
 )

 // Reconcile handles create/update/delete event by delegating the request to the handler
@@ -112,6 +113,10 @@ func (r *PodReconciler) Reconcile(request custom.Request) (ctrl.Result, error) {
 		logger.V(1).Info("pod's node is not yet initialized by the manager, will retry", "Requested", request.NamespacedName.String(), "Cached pod name", pod.ObjectMeta.Name, "Cached pod namespace", pod.ObjectMeta.Namespace)
 		return PodRequeueRequest, nil
 	} else if !node.IsManaged() {
+		if utils.PodHasENIRequest(pod) {
+			r.Log.Info("pod's node is not managed, but has eni request, will retry", "Requested", request.NamespacedName.String(), "Cached pod name", pod.ObjectMeta.Name, "Cached pod namespace", pod.ObjectMeta.Namespace)
+			return PodRequeueRequest, nil
+		}
 		logger.V(1).Info("pod's node is not managed, skipping pod event", "Requested", request.NamespacedName.String(), "Cached pod name", pod.ObjectMeta.Name, "Cached pod namespace", pod.ObjectMeta.Namespace)
 		return ctrl.Result{}, nil
 	} else if !node.IsReady() {
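The body of `utils.PodHasENIRequest` is not visible in this part of the diff; the commit message only says it loops through all containers looking for ENI requests. The sketch below shows one way such a check and the resulting requeue decision could look. It uses simplified local types instead of the Kubernetes API types, and it assumes the extended resource is named `vpc.amazonaws.com/pod-eni` and that a positive quantity counts as a request. All identifiers are hypothetical.

```go
package main

import "fmt"

// podENIResource is assumed to be the extended resource name used for
// branch ENIs; verify it against the controller's config package.
const podENIResource = "vpc.amazonaws.com/pod-eni"

// container and pod are simplified stand-ins for the corev1 types.
type container struct {
	name     string
	requests map[string]int64
}

type pod struct {
	containers []container
}

// podHasENIRequest reports whether any container asks for a pod ENI.
func podHasENIRequest(p *pod) bool {
	if p == nil {
		return false
	}
	for _, c := range p.containers {
		if n, ok := c.requests[podENIResource]; ok && n > 0 {
			return true
		}
	}
	return false
}

// decide mirrors the branch added to Reconcile: a pod on an unmanaged node
// is retried only when it actually needs an ENI, otherwise it is skipped.
func decide(nodeManaged bool, p *pod) string {
	if !nodeManaged {
		if podHasENIRequest(p) {
			return "requeue"
		}
		return "skip"
	}
	return "process"
}

func main() {
	p := &pod{containers: []container{
		{name: "sidecar"},
		{name: "app", requests: map[string]int64{podENIResource: 1}},
	}}
	fmt.Println(decide(false, p))
}
```

Checking every container matters because the request may sit on a container other than the first one, as in the usage above.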

controllers/core/pod_controller_test.go

+2 -1
@@ -16,6 +16,7 @@ package controllers
 import (
 	"errors"
 	"testing"
+	"time"

 	"github.com/aws/amazon-vpc-resource-controller-k8s/controllers/custom"
 	mock_condition "github.com/aws/amazon-vpc-resource-controller-k8s/mocks/amazon-vcp-resource-controller-k8s/pkg/condition"
@@ -188,7 +189,7 @@ func TestPodReconciler_Reconcile_NonManaged(t *testing.T) {

 	result, err := mock.PodReconciler.Reconcile(mockReq)
 	assert.NoError(t, err)
-	assert.Equal(t, result, controllerruntime.Result{})
+	assert.Equal(t, controllerruntime.Result{Requeue: true, RequeueAfter: time.Second}, result)
 }

 // TestPodReconciler_Reconcile_NoNodeAssigned tests that the request for a Pod with no Node assigned

controllers/custom/builder.go

+6 -5
@@ -113,15 +113,16 @@ func (b *Builder) Complete(reconciler Reconciler) (healthz.Checker, error) {
 		workqueue.DefaultControllerRateLimiter(), b.options.Name)

 	optimizedListWatch := newOptimizedListWatcher(b.ctx, b.clientSet.CoreV1().RESTClient(),
-		b.converter.Resource(), b.options.Namespace, b.options.PageLimit, b.converter)
+		b.converter.Resource(), b.options.Namespace, b.converter, b.log.WithName("listWatcher"))

 	// Create the config for low level controller with the custom converter
 	// list and watch
 	config := &cache.Config{
-		Queue:            cache.NewDeltaFIFO(b.converter.Indexer, b.dataStore),
-		ListerWatcher:    optimizedListWatch,
-		ObjectType:       b.converter.ResourceType(),
-		FullResyncPeriod: b.options.ResyncPeriod,
+		Queue:             cache.NewDeltaFIFO(b.converter.Indexer, b.dataStore),
+		ListerWatcher:     optimizedListWatch,
+		WatchListPageSize: int64(b.options.PageLimit),
+		ObjectType:        b.converter.ResourceType(),
+		FullResyncPeriod:  b.options.ResyncPeriod,
 		Process: func(obj interface{}, _ bool) error {
 			// from oldest to newest
 			for _, d := range obj.(cache.Deltas) {
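Moving the page limit into `WatchListPageSize` means the cache layer supplies the limit in `options.Limit` and drives pagination itself, rather than the list function hard-coding it. The general limit-plus-continue-token loop can be sketched without any Kubernetes dependency. Everything below (`page`, `listFunc`, `listAll`, `fakeList`) is hypothetical and only models the idea; it is not how client-go implements its pager.

```go
package main

import (
	"fmt"
	"strconv"
)

// page is one chunk of results plus an opaque token for the next chunk.
type page struct {
	items []string
	next  string
}

// listFunc models a paginated list call taking a limit and a continue token.
type listFunc func(limit int64, cont string) (page, error)

// listAll keeps requesting pages of pageSize items until the server
// stops returning a continue token.
func listAll(list listFunc, pageSize int64) ([]string, error) {
	var all []string
	cont := ""
	for {
		p, err := list(pageSize, cont)
		if err != nil {
			return nil, err
		}
		all = append(all, p.items...)
		if p.next == "" {
			return all, nil
		}
		cont = p.next
	}
}

// fakeList serves items from memory, using the start index as the token,
// and counts how many requests were made.
func fakeList(items []string, calls *int) listFunc {
	return func(limit int64, cont string) (page, error) {
		*calls++
		start := 0
		if cont != "" {
			n, err := strconv.Atoi(cont)
			if err != nil {
				return page{}, err
			}
			start = n
		}
		end := start + int(limit)
		if limit <= 0 || end > len(items) {
			end = len(items)
		}
		p := page{items: items[start:end]}
		if end < len(items) {
			p.next = strconv.Itoa(end)
		}
		return p, nil
	}
}

func main() {
	calls := 0
	pods := []string{"a", "b", "c", "d", "e"}
	all, err := listAll(fakeList(pods, &calls), 2)
	fmt.Println(all, err, calls)
}
```

Smaller pages bound the memory used per response at the cost of more round trips, which is the knob `PageLimit` exposes.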

controllers/custom/custom_controller.go

+20 -7
@@ -21,6 +21,7 @@ import (

 	"github.com/aws/amazon-vpc-resource-controller-k8s/pkg/condition"
 	"github.com/go-logr/logr"
+	apierrors "k8s.io/apimachinery/pkg/api/errors"
 	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
 	"k8s.io/apimachinery/pkg/runtime"
 	"k8s.io/apimachinery/pkg/types"
@@ -178,23 +179,26 @@ func (c *CustomController) WaitForCacheSync(controller cache.Controller) {

 // newOptimizedListWatcher returns a list watcher with a custom list function that converts the
 // response for each page using the converter function and returns a general watcher
-func newOptimizedListWatcher(ctx context.Context, restClient cache.Getter, resource string, namespace string, limit int,
-	converter Converter) *cache.ListWatch {
+func newOptimizedListWatcher(ctx context.Context, restClient cache.Getter, resource string, namespace string,
+	converter Converter, log logr.Logger) *cache.ListWatch {

 	listFunc := func(options metav1.ListOptions) (runtime.Object, error) {
 		list, err := restClient.Get().
 			Namespace(namespace).
 			Resource(resource).
-			// This needs to be done because just setting the limit using option's
-			// Limit is being overridden and the response is returned without pagination.
 			VersionedParams(&metav1.ListOptions{
-				Limit:    int64(limit),
+				Limit:    options.Limit,
 				Continue: options.Continue,
 			}, metav1.ParameterCodec).
 			Do(ctx).
 			Get()
 		if err != nil {
-			return list, err
+			if statusErr, ok := err.(*apierrors.StatusError); ok {
+				log.Error(err, "List operation error", "code", statusErr.Status().Code)
+			} else {
+				log.Error(err, "List operation error")
+			}
+			return nil, err
 		}
 		// Strip down the the list before passing the paginated response back to
 		// the pager function
@@ -206,11 +210,20 @@ func newOptimizedListWatcher(ctx context.Context, restClient cache.Getter, resou
 	// before storing the object in the data store.
 	watchFunc := func(options metav1.ListOptions) (watch.Interface, error) {
 		options.Watch = true
-		return restClient.Get().
+		watch, err := restClient.Get().
 			Namespace(namespace).
 			Resource(resource).
 			VersionedParams(&options, metav1.ParameterCodec).
 			Watch(ctx)
+		if err != nil {
+			if statusErr, ok := err.(*apierrors.StatusError); ok {
+				log.Error(err, "Watch operation error", "code", statusErr.Status().Code)
+			} else {
+				log.Error(err, "Watch operation error")
+			}
+			return nil, err
+		}
+		return watch, err
 	}
 	return &cache.ListWatch{ListFunc: listFunc, WatchFunc: watchFunc}
 }

go.mod

+13 -13
@@ -9,14 +9,14 @@ require (
 	github.com/go-logr/zapr v1.3.0
 	github.com/golang/mock v1.6.0
 	github.com/google/uuid v1.6.0
-	github.com/onsi/ginkgo/v2 v2.17.1
-	github.com/onsi/gomega v1.31.1
+	github.com/onsi/ginkgo/v2 v2.19.0
+	github.com/onsi/gomega v1.33.1
 	github.com/pkg/errors v0.9.1
 	github.com/prometheus/client_golang v1.19.0
-	github.com/prometheus/client_model v0.6.0
-	github.com/prometheus/common v0.52.2
+	github.com/prometheus/client_model v0.6.1
+	github.com/prometheus/common v0.53.0
 	github.com/stretchr/testify v1.9.0
-	go.uber.org/zap v1.26.0
+	go.uber.org/zap v1.27.0
 	golang.org/x/time v0.5.0
 	gomodules.xyz/jsonpatch/v2 v2.4.0
 	k8s.io/api v0.29.3
@@ -26,6 +26,7 @@ require (
 )

 require (
+	github.com/go-task/slim-sprig/v3 v3.0.0 // indirect
 	github.com/google/gnostic-models v0.6.9-0.20230804172637-c7be7c783f49 // indirect
 	github.com/gorilla/websocket v1.5.0 // indirect
 	github.com/mxk/go-flowrate v0.0.0-20140419014527-cca7078d478f // indirect
@@ -42,13 +43,12 @@ require (
 	github.com/go-openapi/jsonpointer v0.19.6 // indirect
 	github.com/go-openapi/jsonreference v0.20.2 // indirect
 	github.com/go-openapi/swag v0.22.3 // indirect
-	github.com/go-task/slim-sprig v0.0.0-20230315185526-52ccab3ef572 // indirect
 	github.com/gogo/protobuf v1.3.2 // indirect
 	github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da // indirect
 	github.com/golang/protobuf v1.5.4 // indirect
 	github.com/google/go-cmp v0.6.0 // indirect
 	github.com/google/gofuzz v1.2.0 // indirect
-	github.com/google/pprof v0.0.0-20230323073829-e72429f035bd // indirect
+	github.com/google/pprof v0.0.0-20240424215950-a892ee059fd6 // indirect
 	github.com/imdario/mergo v0.3.13 // indirect
 	github.com/jmespath/go-jmespath v0.4.0 // indirect
 	github.com/josharian/intern v1.0.0 // indirect
@@ -60,16 +60,16 @@ require (
 	github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect
 	github.com/pmezard/go-difflib v1.0.0 // indirect
 	github.com/prometheus/procfs v0.12.0 // indirect
-	github.com/samber/lo v1.39.0
+	github.com/samber/lo v1.47.0
 	github.com/spf13/pflag v1.0.5 // indirect
 	go.uber.org/multierr v1.11.0 // indirect
 	golang.org/x/exp v0.0.0-20231006140011-7918f672742d
-	golang.org/x/net v0.23.0 // indirect
+	golang.org/x/net v0.25.0 // indirect
 	golang.org/x/oauth2 v0.18.0 // indirect
-	golang.org/x/sys v0.18.0 // indirect
-	golang.org/x/term v0.18.0 // indirect
-	golang.org/x/text v0.14.0 // indirect
-	golang.org/x/tools v0.17.0 // indirect
+	golang.org/x/sys v0.20.0 // indirect
+	golang.org/x/term v0.20.0 // indirect
+	golang.org/x/text v0.16.0 // indirect
+	golang.org/x/tools v0.21.1-0.20240508182429-e35e4ccd0d2d // indirect
 	google.golang.org/appengine v1.6.8 // indirect
 	google.golang.org/protobuf v1.33.0 // indirect
 	gopkg.in/inf.v0 v0.9.1 // indirect
