multiple k8s-registries lead to NSE flapping #6779

Closed
zolug opened this issue Jul 18, 2022 · 7 comments · Fixed by networkservicemesh/sdk-k8s#385
Labels: bug · question · stability

Comments

zolug commented Jul 18, 2022

When deploying NSM with the k8s-registry and running multiple registry instances, the re-registration request for a particular NSE might end up on another registry over time. This makes the old registry falsely assume the NSE has disappeared (no re-registration before the expiration time), so it removes the associated Custom Resource. This leads to xconnect disturbances.

It appears that the underlying TCP connections backing gRPC keep changing as well.
Apparently dialNSEClient re-dials the same clientURL even if the cached dialer of the NSE has a connection (closing the old connection). Is that intentional?
https://github.com/networkservicemesh/sdk/blob/b8209fc0b7b6a0be3eb709bb7323ea6f1bb56477/pkg/registry/common/dial/nse_client.go#L75

Changing the registry service to provide session affinity based on the client IPs seemingly fixes the problem:

Name:                     nsm-registry-svc
Namespace:                nsm
Labels:                   app.kubernetes.io/managed-by=Helm
Annotations:              meta.helm.sh/release-name: nsm-k8s-1658158431
                          meta.helm.sh/release-namespace: nsm
Selector:                 app=registry
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.96.64.183
IPs:                      10.96.64.183
Port:                     nsm-registry-svc  5002/TCP
TargetPort:               5002/TCP
NodePort:                 nsm-registry-svc  31103/TCP
Endpoints:                10.244.1.2:5002,10.244.2.8:5002
Session Affinity:         ClientIP
External Traffic Policy:  Cluster
Events:                   <none>

But that might not be a lasting solution, as session affinity seemingly has a timeout value. (It may also depend on the mode kube-proxy was started in.)
https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#clientipconfig-v1-core
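For reference, a minimal sketch of pinning that affinity timeout explicitly via the typed client-go API (the object names and the 10800s value are illustrative assumptions, not what our chart currently renders):

package example

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// registrySvcWithAffinity is an illustrative sketch only: it pins ClientIP
// session affinity with an explicit timeout (Kubernetes defaults to 10800s,
// i.e. 3 hours, when sessionAffinityConfig is omitted).
func registrySvcWithAffinity() *corev1.Service {
	timeout := int32(10800) // assumption: make the default 3h window explicit
	return &corev1.Service{
		ObjectMeta: metav1.ObjectMeta{Name: "nsm-registry-svc", Namespace: "nsm"},
		Spec: corev1.ServiceSpec{
			Selector:        map[string]string{"app": "registry"},
			Type:            corev1.ServiceTypeLoadBalancer,
			Ports:           []corev1.ServicePort{{Name: "nsm-registry-svc", Port: 5002}},
			SessionAffinity: corev1.ServiceAffinityClientIP,
			SessionAffinityConfig: &corev1.SessionAffinityConfig{
				ClientIP: &corev1.ClientIPConfig{TimeoutSeconds: &timeout},
			},
		},
	}
}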

I wonder what the recommendations are for providing resiliency and availability through k8s registries?

Reproduction:
- Deploy NSM with multiple (e.g. 2) k8s registries
- Start up multiple NSEs. For example, simply deploying the kernel NSE and scaling it up to 8-10 replicas should do it.
- Monitor the availability of the NSE Custom Resources and their age (see the sketch below)
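
For the last step, a small sketch that polls the NSE Custom Resources and prints their age (assumptions: the CRs are served as networkserviceendpoints in the networkservicemesh.io/v1 group/version and live in the nsm namespace; adjust to your deployment):

package main

import (
	"context"
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Assumption: GVR of the NSE Custom Resource created by the k8s registry.
	gvr := schema.GroupVersionResource{Group: "networkservicemesh.io", Version: "v1", Resource: "networkserviceendpoints"}

	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	dyn, err := dynamic.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	for {
		// Assumption: the CRs live in the "nsm" namespace.
		list, err := dyn.Resource(gvr).Namespace("nsm").List(context.Background(), metav1.ListOptions{})
		if err != nil {
			panic(err)
		}
		// A flapping NSE shows up as a CR that disappears and reappears with a low age.
		for _, item := range list.Items {
			age := time.Since(item.GetCreationTimestamp().Time).Round(time.Second)
			fmt.Printf("%s\tage=%s\n", item.GetName(), age)
		}
		fmt.Println("---")
		time.Sleep(5 * time.Second)
	}
}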

Note: the remote VLAN NSE (which bypasses the NSMgr and contacts a registry directly) is also affected.

NSM version: 1.4.0
Kind (2 workers + controller): kindest/node:v1.22.9@sha256:ad5b8404c4052781365a4e70bb7d17c5331e4177bd4a7cd214339316cd6193b6

zolug added the bug, question and stability labels Jul 18, 2022
zolug commented Jul 19, 2022

Made a quick test where I modified the Register function of dialNSEClient so that it only calls Dial if there's no ClientConn or its state is not Ready. The register reshuffles seem to have gone away with it.

func (c *dialNSEClient) Register(ctx context.Context, in *registry.NetworkServiceEndpoint, opts ...grpc.CallOption) (*registry.NetworkServiceEndpoint, error) {
	closeContextFunc := postpone.ContextWithValues(ctx)
	// If no clientURL, we have no work to do
	// call the next in the chain
	clientURL := clienturlctx.ClientURL(ctx)
	if clientURL == nil {
		return next.NetworkServiceEndpointRegistryClient(ctx).Register(ctx, in, opts...)
	}

	cc, _ := clientconn.LoadOrStore(ctx, newDialer(c.chainCtx, c.dialTimeout, c.dialOptions...))

	// If there's an existing grpc.ClientConnInterface and it's not ours, call the next in the chain
	di, ok := cc.(*dialer)
	if !ok {
		return next.NetworkServiceEndpointRegistryClient(ctx).Register(ctx, in, opts...)
	}

	// If our existing dialer has a different URL close down the chain
	if di.clientURL != nil && di.clientURL.String() != clientURL.String() {
		closeCtx, closeCancel := closeContextFunc()
		defer closeCancel()
		err := di.Dial(closeCtx, di.clientURL)
		if err != nil {
			log.FromContext(ctx).Errorf("can not redial to %v, err %v. Deleting clientconn...", grpcutils.URLToTarget(di.clientURL), err)
			clientconn.Delete(ctx)
			return nil, err
		}
		_, _ = next.NetworkServiceEndpointRegistryClient(ctx).Unregister(clienturlctx.WithClientURL(closeCtx, di.clientURL), in, opts...)
	}

	if di.ClientConn == nil || di.ClientConn.GetState() != connectivity.Ready { // XXX: maybe also check grpc.WithBlock() option along with ready state
		err := di.Dial(ctx, clientURL)
		if err != nil {
			log.FromContext(ctx).Errorf("can not dial to %v, err %v. Deleting clientconn...", grpcutils.URLToTarget(clientURL), err)
			clientconn.Delete(ctx)
			return nil, err
		}
	}

	conn, err := next.NetworkServiceEndpointRegistryClient(ctx).Register(ctx, in, opts...)
	if err != nil {
		_ = di.Close()
		return nil, err
	}
	return conn, nil
}

NikitaSkrynnik commented Jul 21, 2022

@zolug Hello! An endpoint can go to different registries. We don't cache connections because their TLS certs can expire and gRPC doesn't update them automatically.

If a registry doesn't get a re-registration for a particular NSE and the NSE expires, the registry deletes the NSE from memory and also deletes all associated Custom Resources. At the moment we don't consider scenarios with multiple registries, so there might be a lot of instabilities.
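
To make the failure mode concrete, here is a minimal, self-contained sketch of that expiration behaviour (not the actual sdk code): each registry instance arms its own per-NSE timer, and only a re-registration arriving at the same instance resets it.

package example

import (
	"sync"
	"time"
)

// expireTracker mimics a single registry instance's in-memory expire logic:
// every Register resets the NSE's timer; if no re-registration arrives before
// the deadline, onExpire fires (in the k8s registry this is where the
// associated Custom Resource would be deleted).
type expireTracker struct {
	mu       sync.Mutex
	timers   map[string]*time.Timer
	lifetime time.Duration
	onExpire func(name string)
}

func newExpireTracker(lifetime time.Duration, onExpire func(string)) *expireTracker {
	return &expireTracker{timers: map[string]*time.Timer{}, lifetime: lifetime, onExpire: onExpire}
}

// Register (re)arms the expiration timer for the NSE with the given name.
func (t *expireTracker) Register(name string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if timer, ok := t.timers[name]; ok {
		timer.Stop()
	}
	t.timers[name] = time.AfterFunc(t.lifetime, func() {
		t.mu.Lock()
		delete(t.timers, name)
		t.mu.Unlock()
		t.onExpire(name)
	})
}

With two such instances behind one Service, a re-registration that lands on instance B does nothing to stop instance A's timer, which matches the Custom Resource deletions observed in this issue.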

denis-tingaikin commented Jul 21, 2022

@zolug , @NikitaSkrynnik

We've discussed the problem a bit with Ed.

Notes from the discussion:

  1. Multiple registries are a good thing. They could improve NSM resiliency and availability.
  2. We plan to consider the issue in v1.6.0 (because it requires more investigation).
  3. We'd prefer a solution that changes the expire element in the registry.

@zolug At this moment I wouldn't recommend using a multi-registry setup.

denis-tingaikin modified the milestones: v1.5.0, v1.6.0 Jul 21, 2022
zolug commented Jul 22, 2022

Hi! I see, thanks for the information. We will change our setup accordingly.

LionelJouin moved this to 📋 To Do in Meridio Jul 25, 2022
denis-tingaikin added a commit to denis-tingaikin/sdk-k8s that referenced this issue Sep 4, 2022
Signed-off-by: Denis Tingaikin <denis.tingajkin@xored.com>
denis-tingaikin commented:

@zolug The problem is solved. Multiple k8s registries can now be used with an NSM setup. Note: the memory registry cannot be scaled because its storage type is 'in memory'.

zolug commented Sep 13, 2022

@denis-tingaikin Hi! Thanks a lot, I will take a look at the changes and will try to play around with it as well.
