Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Named port environment variables not being created #165

Closed
Strayfe opened this issue Feb 13, 2023 · 13 comments
Closed

Named port environment variables not being created #165

Strayfe opened this issue Feb 13, 2023 · 13 comments
Labels
bug Something isn't working

Comments

@Strayfe
Copy link
Contributor

Strayfe commented Feb 13, 2023

I am raising this as a new issue as I am not sure I am going to get a response to #64

There seems to be a few discrepancies between the way that Kubernetes and Bridge allocate ports from a service that contains multiple named ports.

There are environment variables created by Kubernetes that reference ports by the name metadata, if it exists, instead of by port which are allocated arbitrarily when using Bridge as the OP described.

Say your microservice has a dependency on a service you have no control over, let's use the bitnami RabbitMQ helm chart as an example, which exposes the following service (trimmed for brevity):

apiVersion: v1
kind: Service
metadata:
  name: rabbitmq
  namespace: rabbitmq
spec:
  ports:
  - name: amqp
    port: 5672
    protocol: TCP
    targetPort: amqp
  - name: epmd
    port: 4369
    protocol: TCP
    targetPort: epmd
  - name: dist
    port: 25672
    protocol: TCP
    targetPort: dist
  - name: http-stats
    port: 15672
    protocol: TCP
    targetPort: stats

This service has 4 ports described, all with name metadata so the ports can be referenced by name in other resources like Ingresses, etc.

An example of this would look like:

apiVersion: networking.k8s.io/v1
kind: Ingress
...
spec:
  rules:
  - host: rabbitmq.local
    http:
      paths:
      - backend:
          service:
            name: rabbitmq
            port:
              name: http-stats        <-- port referenced by name, not port number

Now, when we look at one of our RabbitMQ pods in a Kubernetes cluster, we can see environment variables that have been created to reflect each of these ports by name (IP addresses obfuscated to 0.0.0.0):

<<K9s-Shell>> Pod: rabbitmq/rabbitmq-0 | Container: rabbitmq 
I have no name!@rabbitmq-0:/$ printenv | grep RABBITMQ_
RABBITMQ_PORT_5672_TCP_ADDR=0.0.0.0
RABBITMQ_PORT_15672_TCP_ADDR=0.0.0.0
RABBITMQ_PORT_5672_TCP_PORT=5672              <-- (c)
RABBITMQ_PORT_4369_TCP_PORT=4369
RABBITMQ_SERVICE_PORT_AMQP=5672               <-- (d)
RABBITMQ_PORT_25672_TCP_ADDR=0.0.0.0
RABBITMQ_PORT_5672_TCP_PROTO=tcp
RABBITMQ_PORT_4369_TCP=tcp://0.0.0.0:4369
RABBITMQ_PORT_25672_TCP_PORT=25672
RABBITMQ_SERVICE_PORT=5672                    <-- (b)
RABBITMQ_SERVICE_PORT_DIST=25672
RABBITMQ_PORT_25672_TCP_PROTO=tcp
RABBITMQ_PORT_4369_TCP_ADDR=0.0.0.0
RABBITMQ_SERVICE_HOST=0.0.0.0                 <--- (a)
RABBITMQ_PORT=tcp://0.0.0.0:5672
RABBITMQ_SERVICE_PORT_HTTP_STATS=15672
RABBITMQ_PORT_15672_TCP_PROTO=tcp
RABBITMQ_PORT_25672_TCP=tcp://0.0.0.0:25672
RABBITMQ_SERVICE_PORT_EPMD=4369
RABBITMQ_PORT_5672_TCP=tcp://0.0.0.0:5672
RABBITMQ_PORT_15672_TCP_PORT=15672
RABBITMQ_PORT_15672_TCP=tcp://0.0.0.0:15672
RABBITMQ_NODE_NAME=rabbit@rabbitmq-0.rabbitmq-headless.rabbitmq.svc.cluster.local
RABBITMQ_PORT_4369_TCP_PROTO=tcp

So, if I wanted to reference my RabbitMQ instance using these environment variables I would take (a) and ideally (b) in my code and allow Bridge to update the variables accordingly:

image

However, we run into the issue where only the last registered port (in order) will be allocated to the RABBITMQ_SERVICE_PORT variable as seen below:

image

In the above image you can see:
(1) - 4x ports mapped, same as in the service but randomly allocated ports on my dev machine
(2) - the last registered port has been allocated to the RABBITMQ_SERVICE_PORT variable

There are a few issues here:
As you can see from (b), Kubernetes has mapped the RABBITMQ_SERVICE_PORT correctly to the AMQP port (first in the collection) whereas Bridge has allocated the last port in the collection to the variable which is not the AMQP port so I get connection errors in code.

Also, as there is no way to pre-determine what port Bridge is going to allocate to (c) when it creates its proxy ports and thus the names of the variables, we have no way to dynamically add these to our code, one moment it could be port 55070, the next run it could be 55011, etc.

Now, you might have noticed at this point that there is a discrepancy highlighted by the existence of (d) which is a port variable declared in Kubernetes using the name metadata instead of a port which would be easier to track in code as we could just request the RABBITMQ_SERVICE_PORT_AMQP port and then no matter what port Bridge chose to use for its proxies, we could track it.

I am not sure why Bridge does not fetch this variable from the cluster but could it be looked into?

Originally posted by @Strayfe in #64 (comment)

@hsubramanianaks hsubramanianaks added the investigation needed Investigation is needed before a proper triage can be done label Feb 21, 2023
@qpetraroia
Copy link
Collaborator

Hi @Strayfe,

Thanks for the comment, we will try and look into this!

@elenavillamil
Copy link
Contributor

@Strayfe thank you for the great issue and details! To summarize, I think I get two actions items from this, please correct me if they are wrong.

  1. Bridge to preserve/replicate the SERVICE_PORT_ variable.
  2. The SERVICE_PORT variable should map to first entry in config not last.

We will add this to our planning. Just keep in mind we are a small team. As always we also encourage community to contribute :)

@Strayfe
Copy link
Contributor Author

Strayfe commented Feb 21, 2023

@elenavillamil that is correct, providing that point 1. SERVICE_PORT_ would have {PORT_NAME} appended then developers who run without admin privileges or remote sessions would be able to use dependencies out of their control.

As per your note regarding community contributions, I will try to assist where I can!

Looking to see the Kubernetes Go code to see what I might be able to transpose into dotnet, we can see the logic flow for how the first port in a service is allocated to the serviceName_SERVICE_PORT variable for backwards compatibility and then every port is also allocated a named variable afterwards:

image

I have located the code where we will need to update the collection of variables being created but in order to create named ports we will require more context to be passed to this method as the name is not currently available to it.

image

I will try to investigate further to see how this information is being passed around currently but as I am not au fait with this code-base don't expect much 😅 .

@Strayfe
Copy link
Contributor Author

Strayfe commented Feb 22, 2023

Ha, there is actually a comment already to this effect in the code 😛 , it looks like the PortPair class is missing a property to support named ports so that will need adding.

image

I am going to sleep now but I will leave you with my current thoughts with a few todo comments still to work out, though this first draft would fix the 2nd point as long as the collection was being allocated in the correct order already:

public IDictionary<string, string> CreateEnvVariablesForK8s(WorkloadInfo workloadInfo)
{
    var result = new Dictionary<string, string>(workloadInfo.EnvironmentVariables);

    foreach (var endpoint in workloadInfo.ReachableEndpoints)
    {
        if (string.Equals(endpoint.DnsName, DAPR, StringComparison.OrdinalIgnoreCase))
        {
            // Override the DAPR env variables with the real local ports (that might be different if we neeeded to re-allocate them)
            result["DAPR_HTTP_PORT"] = endpoint.Ports[0].LocalPort.ToString(); // TODO (lolodi): this is a hack that relies on the HTTP port to always be the first and GRPC port the second.
            result["DAPR_GRPC_PORT"] = endpoint.Ports[1].LocalPort.ToString(); // We should probably name the port pairs (maybe with the env variable that we want to set with them).
        }                                                                      // Once we do that, we can actually stop assigning the DAPR dns name ot this endpoint and just leave it empty, consistently with how the Remote agent works

        // because we are using dns name instead of service we have to retrieve it by splitting when needed
        // If this ever cause issues we should consider larger refactor where we add serviceName member variable to EndpointInfo class.
        var serviceName = endpoint.DnsName
            .ToUpperInvariant()
            .Split(".")
            .First()
            .Replace("-", "_")
            .Replace(".", "_");

        var host = _useKubernetesServiceEnvironmentVariables || string.Equals(endpoint.DnsName, DAPR, StringComparison.OrdinalIgnoreCase)
            ? endpoint.LocalIP.ToString()
            : endpoint.DnsName;

        // todo: check that the ports in this collection is allocated in the correct order and immutable
        var unnamedPort = _useKubernetesServiceEnvironmentVariables || string.Equals(endpoint.DnsName, DAPR, StringComparison.OrdinalIgnoreCase)
            ? endpoint.Ports[0].LocalPort
            : endpoint.Ports[0].RemotePort;

        if (string.Equals(serviceName, "KUBERNETES", StringComparison.OrdinalIgnoreCase))
            host = _kubernetesClient.HostName;

        // Service Host
        result[$"{serviceName}_SERVICE_HOST"] = host;
        // Service Port: allocate the first port in the service to the backwards-compatible environment variable in keeping with Kubernetes source code
        result[$"{serviceName}_SERVICE_PORT"] = unnamedPort.ToString();

        // All named ports (only the first may be unnamed according to Kubernetes source code)
        foreach (var portPair in endpoint.Ports)
        {
            var port = _useKubernetesServiceEnvironmentVariables || string.Equals(endpoint.DnsName, DAPR, StringComparison.OrdinalIgnoreCase)
                ? portPair.LocalPort
                : portPair.RemotePort;

            // todo: work out named ports, will need to extend the PortPair class to include Name property and populate it accordingly
            var namedPort = string.Empty;

            var protocolUpper = portPair.Protocol.ToUpperInvariant();

            result[$"{serviceName}_PORT"] = $"{portPair.Protocol}://{host}:{port}";
            result[$"{serviceName}_PORT_{port}_{protocolUpper}_PROTO"] = portPair.Protocol;
            result[$"{serviceName}_PORT_{port}_{protocolUpper}"] = $"{portPair.Protocol}://{host}:{port}";
            result[$"{serviceName}_PORT_{port}_{protocolUpper}_PORT"] = port.ToString();
            result[$"{serviceName}_PORT_{port}_{protocolUpper}_ADDR"] = host;
            result[$"{serviceName}_SERVICE_PORT_{protocolUpper}"] = port.ToString();

            // new named port
            result[$"{serviceName}_SERVICE_PORT_{namedPort}"] = port.ToString();

            // if this is managed identity with useKubernetesServiceEnvironmentVariables set to true we have to update ms endpoint variable from dns name to ip:port
            if (_useKubernetesServiceEnvironmentVariables && string.Equals(serviceName, ManagedIdentity.TargetServiceNameOnLocalMachine, StringComparison.OrdinalIgnoreCase))
                result[ManagedIdentity.MSI_ENDPOINT_EnvironmentVariable] = $"http://{host}:{port}/metadata/identity/oauth2/token";
        }
    }

    return result;
}

P.S. I have moved some variables about because I did not see a reason for the same variables to be evaluated multiple times in the inner loop when it was not subject to change as part of the outer loop. This new method does breach cognitive complexity but I will see what we can tidy up after I work out point 1.

@elenavillamil
Copy link
Contributor

Release for this issue should go out next week. Thanks @Strayfe for your contribution!

@Strayfe
Copy link
Contributor Author

Strayfe commented Mar 3, 2023

Thank you, that is amazing news!

@elenavillamil elenavillamil added bug Something isn't working inprogress and removed investigation needed Investigation is needed before a proper triage can be done labels Mar 9, 2023
@elenavillamil
Copy link
Contributor

Apologies, we run into small hipcup on release this week. New ETA for this to be released is next Wednesday.

@Strayfe
Copy link
Contributor Author

Strayfe commented Mar 9, 2023

Thanks for keeping me in the loop!

@garyburgmann
Copy link
Contributor

@Strayfe we have a NodeJS and PHP util lib we share for our services. I was able to successfully mitigate this by offsetting from _SERVICE_PORT, which appears to always be the last defined in the list. Local ports appear to be sequentially defined.

Where we have 3 ports exposed by the service:

            ports:
                - appProtocol: amqp
                    name: amqp
                    port: 5672
                    protocol: TCP
                    targetPort: 5672
                - appProtocol: http
                    name: management
                    port: 15672
                    protocol: TCP
                    targetPort: 15672
                - appProtocol: prometheus.io/metrics
                    name: prometheus
                    port: 15692
                    protocol: TCP
                    targetPort: 15692

Example from PHP

$queuePort = getenv('K8S_QA_RABBITMQ_SERVICE_PORT');
$queuePortOffset = ((int) $queuePort) - 2;

@Strayfe
Copy link
Contributor Author

Strayfe commented Mar 14, 2023

@garyburgmann thanks for the information.

Just in case you may not have noticed I have raised a Pull Request, which has been merged, to update Bridge to Kubernetes to mirror the Kubernetes behaviour which is to use the first port in the service as the _SERVICE_PORT and it will now also create the named variables which makes it easier to reference unambiguous ports.

You can see the changes here: #173

Please keep in mind that as you maintain a library that offsets the port, you will likely need to update your code when this change goes live (hopefully tomorrow according to elenavillamil's recent comments).

For clarity, these ports should now be available as per your example:

Re-ordered to first in the list as per Kubernetes source code
K8S_QA_RABBITMQ_SERVICE_PORT: 5672

Named port variables to allow unambiguous port referencing
K8S_QA_RABBITMQ_SERVICE_PORT_AMQP: 5672
K8S_QA_RABBITMQ_SERVICE_PORT_MANAGEMENT: 15672
K8S_QA_RABBITMQ_SERVICE_PORT_PROMETHEUS: 15692

@garyburgmann
Copy link
Contributor

Yep, saw the comments above re: the delayed release and just thought I'd share my 2 cents.

Thanks a lot for this very handy change @Strayfe 🤘🏻

@garyburgmann
Copy link
Contributor

running this with CLI w/ success

$ cat /tmp/60add30d.sh | grep K8S_QA_RABBITMQ_SERVICE_PORT_
export K8S_QA_RABBITMQ_SERVICE_PORT_AMQP=55216
export K8S_QA_RABBITMQ_SERVICE_PORT_MANAGEMENT=55217
export K8S_QA_RABBITMQ_SERVICE_PORT_PROMETHEUS=55218
$ cat /tmp/60add30d.sh | grep PORTAL_SERVICE_PORT_
export PORTAL_SERVICE_PORT_HTTP=55102
export PORTAL_SERVICE_PORT_GRPC=55103

@elenavillamil
Copy link
Contributor

Hi everyone, just wanted to share quick update here. Fix for this that @Strayfe worked on got deployed today. Please feel free to re-open if you still the issue after upgrading to latest version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Development

No branches or pull requests

5 participants