Feature Request: EKS service discovery #125

omerlh · 2019-10-28T10:56:44Z

Tell us about your request
Automatically create virtual nodes for each service deployed on EKS or import existing services from EKS.
As part of this, it would be good to support Kubernetes CRDs (maybe using SMI?) to configure routes to the services.

Which integration(s) is this request for?
EKS/Kubernetes

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
Make it easier to integrate EKS with App Mesh. Today there is a lot of manual configuration that could instead be generated automatically or declared in Kubernetes CRD files.

Are you currently working around this issue?
Manual configuration in the UI, which is cumbersome and slow and makes App Mesh harder to adopt.

stefanprodan · 2019-10-28T19:56:23Z

I think this is feasible after addressing other issues.

A Kubernetes service exposing multiple ports can't be mirrored in App Mesh because virtual nodes and routers don't support multiple listeners (#120). With virtual routers you can't represent the Kubernetes targetPort since there is no support for port-to-port mapping.

For one app to talk to another, the destination virtual service must be added as a backend on the caller's virtual node. To automate this, the discovery component has to create virtual services and fill the backends on every single virtual node with all the virtual services in the mesh.

Multiple EKS clusters can be part of the same mesh, so all virtual objects must be unique. Doing so would break Kubernetes name resolution, since apps can't be addressed with <name>.<namespace>.svc.cluster.local inside the cluster. If you assume that a mesh is dedicated to a single cluster, then the discovery component should generate multiple virtual services, one for each Kubernetes resolution, e.g. <name>.<namespace>, <name>.<namespace>.svc, <name>.<namespace>.svc.cluster.local.
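
As a rough illustration of the single-cluster case described above, the sketch below lists the names a discovery component would have to register as separate virtual services for one Kubernetes service. The function name, the example service and the default cluster domain are assumptions made for illustration; none of this is App Mesh or controller API.

```python
from __future__ import annotations


def kubernetes_dns_aliases(
    name: str, namespace: str, cluster_domain: str = "cluster.local"
) -> list[str]:
    """Return the in-cluster DNS forms mentioned above, shortest first.

    Hypothetical helper: each returned name would need its own
    virtual service if the mesh is dedicated to a single cluster.
    """
    base = f"{name}.{namespace}"
    return [base, f"{base}.svc", f"{base}.svc.{cluster_domain}"]


if __name__ == "__main__":
    # "checkout" in namespace "shop" is a made-up example service.
    for alias in kubernetes_dns_aliases("checkout", "shop"):
        print(alias)
```

Note that the bare service name (without the namespace) is deliberately left out, matching the point made below about name collisions.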

Linkerd/SMI doesn't have these issues because in Linkerd the virtual node equivalent is the Kubernetes service. In SMI, two apps in the same namespace can address each other with just the ClusterIP name, while in App Mesh you need at least the namespace prefix to avoid name collisions.

kiranmeduri · 2019-12-20T16:59:32Z

> I think this is feasible after addressing other issues.

> A Kubernetes service exposing multiple ports can't be mirrored in App Mesh because virtual nodes and routers don't support multiple listeners (#120). With virtual routers you can't represent the Kubernetes targetPort since there is no support for port-to-port mapping.

Is it a reasonable UX to support single-port services in this experience first and add multi-port support once App Mesh is ready? How common is the pattern of exposing multiple ports? Knowing this would help build some urgency around multi-port support.
@stefanprodan can you expand on the targetPort issue? IIUC we can set port to match the listener on the virtual router, while targetPort is the port on the virtual node. The virtual router and the target virtual node are allowed to have different ports.
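
To make the proposed mapping concrete, here is a minimal sketch of how a single-port Kubernetes service entry could be translated, assuming the reading above is right (port goes to the virtual router listener, targetPort to the virtual node listener). `PortMapping` and `map_service_ports` are hypothetical names, and the input is a plain list of dicts shaped like the `ports` field of a Service spec rather than a real client object.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PortMapping:
    router_listener_port: int  # taken from the Service's `port`
    node_listener_port: int    # taken from the Service's `targetPort`


def map_service_ports(ports: list[dict]) -> PortMapping:
    """Translate one Service port entry; reject anything this sketch can't mirror."""
    if len(ports) != 1:
        # Multiple listeners are not supported per the discussion (#120).
        raise ValueError("only single-port services are handled in this sketch")
    entry = ports[0]
    port = entry["port"]
    # Kubernetes defaults targetPort to the value of port when it is omitted.
    target = entry.get("targetPort", port)
    if not isinstance(target, int):
        # targetPort may also be a named container port, which would have
        # to be resolved against the pod spec before it can be used here.
        raise ValueError("named targetPort must be resolved to a number first")
    return PortMapping(router_listener_port=port, node_listener_port=target)


if __name__ == "__main__":
    print(map_service_ports([{"port": 80, "targetPort": 8080}]))
```

Rejecting multi-port and named-port services up front keeps the single-port experience explicit about what it cannot yet represent.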

> For one app to talk to another, the destination virtual service must be added as a backend on the caller's virtual node. To automate this, the discovery component has to create virtual services and fill the backends on every single virtual node with all the virtual services in the mesh.

This is true and is a blocker unless the controller automatically adds every service in the cluster as a backend. That is not possible today because there is a limit on the number of backends (25 as of 12/20). This is tracked under #113.
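
The interaction between "add every service as a backend" and the backend limit can be sketched as follows. This is an illustration only: `plan_backends` is a hypothetical helper, it assumes one virtual node per virtual service, and the default limit of 25 is simply the figure quoted in this comment.

```python
from __future__ import annotations

BACKEND_LIMIT = 25  # figure quoted in this thread; treat it as an assumption


def plan_backends(
    virtual_services: list[str], limit: int = BACKEND_LIMIT
) -> dict[str, list[str]]:
    """Give every node all other services as backends, or fail if over the limit.

    Keys are the owning service names (one virtual node per service is assumed).
    """
    plan: dict[str, list[str]] = {}
    for own in virtual_services:
        backends = sorted(s for s in virtual_services if s != own)
        if len(backends) > limit:
            raise ValueError(
                f"{own} would need {len(backends)} backends, limit is {limit}"
            )
        plan[own] = backends
    return plan


if __name__ == "__main__":
    services = [f"svc-{i}.default" for i in range(4)]  # made-up names
    for node, backends in plan_backends(services).items():
        print(node, "->", backends)
```

With N services in the cluster, each node needs N - 1 backends, so a fully connected setup stops working as soon as the cluster has more than limit + 1 services.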

> Multiple EKS clusters can be part of the same mesh, so all virtual objects must be unique. Doing so would break Kubernetes name resolution, since apps can't be addressed with <name>.<namespace>.svc.cluster.local inside the cluster. If you assume that a mesh is dedicated to a single cluster, then the discovery component should generate multiple virtual services, one for each Kubernetes resolution, e.g. <name>.<namespace>, <name>.<namespace>.svc, <name>.<namespace>.svc.cluster.local.

It should be possible to use AWS Cloud Map integration to alleviate this problem. See https://github.com/aws/aws-app-mesh-examples/tree/master/walkthroughs/howto-k8s-cloudmap for an example.

> Linkerd/SMI doesn't have these issues because in Linkerd the virtual node equivalent is the Kubernetes service. In SMI, two apps in the same namespace can address each other with just the ClusterIP name, while in App Mesh you need at least the namespace prefix to avoid name collisions.

fblgit · 2021-04-02T18:24:27Z

This is just a no-go-mesh.
It can't detect by itself who is already in the mesh, by default no traffic goes through the mesh unless it is specified in the backends, and it doesn't allow wildcards.
The backends are limited to 25. How can you even introduce this element into a fleet without these must-have features?
So to be more precise: if I want to adopt AWS App Mesh in my EKS cluster, I have to get rid of NGINX Ingress, services cannot talk to each other if there are more than 25 interactions, and I have to manually set the backends (that are part of the mesh) within each virtual node... this is absolutely ridiculous.

dastbe · 2021-04-02T20:26:43Z

Hey @fblgit, wanted to respond to a few of your questions and comments.

> It can't detect by itself who is already in the mesh,

Have you taken a look at https://github.com/aws/aws-app-mesh-controller-for-k8s ? It's the way we recommend that customers using Kubernetes integrate with App Mesh, and it is meant to be more idiomatic. If it's not satisfying your needs, we'd like to hear about it.

> by default no traffic goes through the mesh unless it is specified in the backends, and it doesn't allow wildcards.

So this is a conscious decision in App Mesh to require fully modeling connectivity so that there's an explicit definition of who can talk to what. We do recognize that explicitly modeling each service for each node is redundant, and we have roadmap items like #113 to solve it.

> The backends are limited to 25.

Our current limit here is actually 50, and we can increase it for you if you file a service quota increase request via the console. However, there is a limit to the number of individual services we want people modeling on a single node, and we are looking to issues like #113 to solve that.

> I have to get rid of NGINX Ingress

What constraints are forcing this for you? We have customers who are able to use NGINX with App Mesh, as well as others who have migrated to use our Virtual Gateways.

fblgit · 2021-04-04T15:15:06Z

Thanks for your answer.
1 - The controller is OK, and I guess a must.. otherwise, without CRDs on K8s, expecting any change to happen on the AWS UI is impossible.
2 - You make it look like an advantage or a feature.. it is not. It's a limitation, probably from its design... otherwise there would be a flag to alter this behavior. Suggestion, mesh modes: DEFAULT_MESH/DEFAULT_NONMESH. #113 has been there for 18+ months and is not even in the current roadmap scope.
3 - If you set 50 as a limit, there must be a reason, like for any other limit. Crossing that line may bring problems caused by design flaws. Overall this doesn't make sense: backend limits + explicit modeling.. I wonder how you justified this at the design stage.
4 - Your customers have fewer than 50 backends. Those with more have migrated to Virtual Gateways.. which doesn't satisfy West-End scenarios anyways.

The main problem from my perspective is that it's either all or this is worthless.. things like canary in a microservices architecture cannot be accomplished.

I think there is some basic misunderstanding of how products/offerings work.. It's not about the industry adapting to App Mesh. It's about App Mesh adapting to the industry. That starts with understanding that K8s workloads existed long before, and how they can adopt this with less effort. Expecting Modeling+RevampIngress is very pompous relative to what App Mesh brings to the table.

kiranmeduri · 2021-04-12T19:55:08Z

Hi @fblgit, App Mesh leans towards explicit modeling to limit the blast radius of changes happening inside a mesh. For example, a new service joining a mesh should not impact every other service already in the mesh. While we do not have data on when and how Envoy fails as configuration grows, we did want to lean towards safety rather than flexibility. I understand this friction may have created user experience issues. Issue #113 is an effort to address this, but unfortunately we could not prioritize it ahead of other things.

To summarize, the asks here are:

1. Allow fully-connected mesh setup where every service in the Kubernetes cluster can talk to every other service. Publish metrics and failure paths, if any, in adopting this approach.
2. Allow fully-connected mesh setup within a Kubernetes namespace (Allow wildcard entries in VirtualNode backends #60).
3. Allow defining coarser dependencies using backend groups (Feature Request: Backend Groups #113).
4. Increase limits on number of backends to support entire cluster. Similar to (1) above but with explicit modeling.

We appreciate you taking the time to test App Mesh and document your observations. We will take this feedback and look into addressing these.

dastbe assigned shubharao and kiranmeduri Oct 31, 2019

dastbe added the "Roadmap: Awaiting Customer Feedback" label Oct 31, 2019

kiranmeduri assigned M00nF1sh and unassigned shubharao and kiranmeduri Feb 13, 2020

kiranmeduri added "Roadmap: Proposed" and removed "Roadmap: Proposed" labels Feb 13, 2020

kiranmeduri assigned herrhound and unassigned M00nF1sh Apr 12, 2021
