
Enter drain mode when a pod is terminating and sticky-sessions are enabled #267

Closed
dcowden opened this issue Apr 9, 2018 · 21 comments
Labels
enhancement (Pull requests for new features/feature enhancements), proposal (An issue that proposes a feature request)

Comments

@dcowden

dcowden commented Apr 9, 2018

We have a legacy application (Tomcat/Java) which needs sticky sessions. When we deploy new versions of our applications, we need to stop sending new connections to a server, while still sending bound sessions to the old server. Please note: this is not referring to in-flight requests; we need the active Tomcat sessions to expire, which normally takes a few hours.

This is possible using the nginx drain command, which will keep sending bound connections to the old server but send new ones elsewhere. But in Kubernetes, calling a command on the ingress controller is not part of the deployment flow. To do it with current tools, we would need to add a preStop hook to our application. In that hook, we'd need to access the ingress controller and ask it to drain with an API call. We'd rather not introduce the ability for applications to call APIs on the ingress controller.

When Kubernetes terminates a pod, it enters the TERMINATING status. In nearly all cases, when sticky sessions are enabled, the desired behavior is probably to put the associated pod into drain mode. Is this possible with the nginx-plus ingress controller?

We currently use the Kubernetes-maintained nginx ingress controller. This feature would make it worth the money to use nginx-plus.

Aha! Link: https://nginx.aha.io/features/IC-110

@pleshakov
Contributor

@dcowden Maybe the following approach can work for you?

If you want to drain particular pods, you change the corresponding Ingress resource by adding an annotation that specifies which pods to drain using a label query. For example:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: cafe-ingress
  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.com/drain: "version=0.1"
spec:
  rules:
  - host: "cafe.example.com"
    http:
      paths:
      - path: /tea
        backend:
          serviceName: tea-svc
          servicePort: 80

In this case, the Ingress controller will drain all the pods corresponding to the tea-svc with the label version=0.1. This will allow you to specify which pods to drain during an application upgrade.

Please note that this is not available yet, but we can add it.

@dcowden
Author

dcowden commented Apr 10, 2018

Hmm, that's an interesting approach, but I don't think it would work well for us.

Today we use fairly conventional deployments, in which the deployment controller scales pods up and down. Under the hood it does this with ReplicaSets, I think. We do not re-publish our Ingresses as part of deployments, and this approach would require doing that.

Given our current flow, it would be much more seamless if a pod in TERMINATING status was automatically drained. This would cover several situations:

  • A deployment, in which case a pod is terminating on purpose for shutdown
  • A node drain, in which case we're deleting a pod so we can perform maintenance on a node
  • An OOM kill, when k8s is terminating a pod because it is out of memory.

In reality, if you are running sticky sessions, I can't think of any case where you wouldn't want to drain a pod when it is terminating rather than immediately removing it from service.

In practice there is still other work needed to make this work, because Kubernetes has to have a way to know when it's OK to actually kill the pod. This is accomplished by registering a preStop hook, which runs and waits for all of the active sessions to be gone. If the hook finishes, or the pod kill grace period expires, Kubernetes kills the pod, which will make it fail the health checks and be removed from nginx.
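
For reference, the shape of that hook in the pod spec is roughly the following (just a sketch; the script path, image, and 12-hour grace period are placeholders for whatever your app needs):

apiVersion: v1
kind: Pod
metadata:
  name: legacy-app
spec:
  # Give the pod enough time to bleed off its sessions before it is force-killed.
  terminationGracePeriodSeconds: 43200  # 12 hours; placeholder value
  containers:
  - name: tomcat
    image: tomcat:9  # placeholder image
    lifecycle:
      preStop:
        exec:
          # Hypothetical script that blocks until no active sessions remain,
          # or until it gives up and lets the grace period take over.
          command: ["/bin/sh", "-c", "/opt/app/wait-for-sessions.sh"]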

@pleshakov
Contributor

@dcowden thanks for providing more details.

It looks like it is possible to accomplish session draining through the Ingress controller.

Unfortunately, once a pod enters the terminating state, its endpoint is removed from Kubernetes, which makes the Ingress controller remove that endpoint from the NGINX configuration. Thus, in order to retain that endpoint in the NGINX configuration when the endpoint is being removed, we must make an additional query to the Kubernetes API to check whether the corresponding pod is in the terminating state. If that is the case, we need to drain it instead of removing it. Also, once the pod is successfully removed, we need to make sure that it is also removed from the NGINX configuration.

Do you think the logic above will cover your use case?

I can prepare a PR which implements that logic and share it with you, if you'd like to test it.

@dcowden
Author

dcowden commented Apr 10, 2018

Yes, I think that's the logic... at least as near as I can tell without actually implementing it. I'd be happy to test it. I'm also open to alternate ways of working if they can accomplish the objective with less work.

As a side note, this use case once again validates the decision NOT to use the k8s service abstraction, because it's pretty clear that the endpoint would become inaccessible. IIRC, there's a 'use service=true' flag, which would be incompatible with using this functionality.

@isaachawley isaachawley added the enhancement label Jul 30, 2018
@victor-frag

Hello,

I am facing the same scenario as @dcowden, with a Java app using Tomcat and sticky sessions. Do we have this implemented already?

@tkgregory

I also have this requirement for Tomcat instances that require session affinity. During a deployment, existing bound sessions should still be routed to the same instances, with new sessions routed to the new instances.

Could we get an update on this please?

@dcowden
Author

dcowden commented Feb 18, 2019

@tkgregory @victor-frag we are using https://github.com/jcmoraisjr/haproxy-ingress, which implements this functionality. We've been using it in production for a while -- it's been stable, well supported, and actively updated.

@irizzant

irizzant commented Apr 18, 2019

I'd like to understand this as well.
We have a JBoss AS instance hosting our web application, which has exactly the same problems.
We switched to haproxy and we currently handle rolling updates just fine.

As @pleshakov suggested, it should be possible using the same approach taken for haproxy:

Unfortunately, once a pod enters the terminating state, its endpoint is removed from Kubernetes, which makes the Ingress controller remove that endpoint from the NGINX configuration. Thus, in order to retain that endpoint in the NGINX configuration when the endpoint is being removed, we must make an additional query to the Kubernetes API to check whether the corresponding pod is in the terminating state. If that is the case, we need to drain it instead of removing it. Also, once the pod is successfully removed, we need to make sure that it is also removed from the NGINX configuration.

I'd also add that the above should happen only if session affinity is enabled, and there should be no need for an additional query to the Kubernetes API, since ingress controllers should be automatically notified when pods enter the termination phase.

@amodolo

amodolo commented Jun 26, 2019

@tkgregory @victor-frag we are using https://github.com/jcmoraisjr/haproxy-ingress, which implements this functionality. We've been using it in production for a while -- it's been stable, well supported, and actively updated.

@dcowden, I'm figuring out how to use haproxy to keep the application alive until the sessions terminate. But at the moment, when I deploy a new version of the application, the old pods are terminated and the new ones are started, regardless of whether there are active sessions.
Can you explain how you solved this? What ingress/haproxy configuration did you use?

Thx

@dcowden
Author

dcowden commented Jun 26, 2019

Hi @amodolo,
We set up our Tomcat container with a preStop hook that doesn't return until there are no active sessions left, or until a timeout that's long enough that we're comfortable the session isn't a real user.

In our case, we wrote a small servlet, deployed with the app, that returns the number of sessions left using JMX. There are other ways for sure, but it works OK for us until we can do better.

@amodolo

amodolo commented Jun 26, 2019

I've just implemented your solution and it seems to work like a charm.

Thx a lot (also for the super fast response 😄)

@dcowden
Author

dcowden commented Jun 26, 2019

@amodolo glad it worked for you! FWIW, we have been using this solution in production for about a year now. We run a 24x7 platform-- but humans do not work 24x7. We simply wait for sessions to die, or 12 hours, whichever comes first. When we execute a build, we'll have extra pods out there serving the old workloads for 1/2 day till they die. It works pretty well.

The main negative (and why it's not THE solution) is that it limits your iteration velocity on new code in production to about once a day.

@amodolo

amodolo commented Jun 28, 2019

The main negative aspect of this solution is this: suppose you have one server with 2 active sessions and you are rolling out a new application version. The old pod will enter drain mode until the sessions die (or the grace period expires). Suppose also that the sticky session is based on the cookie generated by HAProxy. In this configuration, if one of the two users logs out of the application, the HAProxy cookie is not removed until the user closes the browser (because it is a session cookie); so if that user logs out and then logs in again (without closing the browser), they will be balanced to the same old pod.
A better approach could be to configure the ingress to use the JSESSIONID cookie generated by the server. In this case, if your application removes the session cookie on logout, the user will be immediately balanced to one of the new pods after logout.
I hope I've explained myself clearly. What do you think?

@dcowden
Author

dcowden commented Jun 28, 2019

Yes, we use haproxy in rewrite cookie mode, and use a separate cookie. I think using JSESSIONID would work too. Another requirement is that when a particular user on an old pod logs out and logs back in, we want to be guaranteed that they switch to a new pod. That ends up being important sometimes.
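
Roughly, the affinity side of our Ingress looks something like this (a sketch from memory; the host, service, and cookie names are examples, and the exact annotation names depend on the haproxy-ingress version, so double-check its docs):

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: legacy-app
  annotations:
    kubernetes.io/ingress.class: "haproxy"
    # Cookie-based affinity; we rewrite a dedicated cookie rather than relying on JSESSIONID.
    ingress.kubernetes.io/affinity: "cookie"
    ingress.kubernetes.io/session-cookie-name: "SERVERID"
spec:
  rules:
  - host: "app.example.com"
    http:
      paths:
      - path: /
        backend:
          serviceName: legacy-app-svc
          servicePort: 8080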

@miclefebvre

@dcowden @amodolo

We set up our Tomcat container with a preStop hook that doesn't return until there are no active sessions left, or until a timeout that's long enough that we're comfortable the session isn't a real user.

In our case, we wrote a small servlet, deployed with the app, that returns the number of sessions left using JMX. There are other ways for sure, but it works OK for us until we can do better.

Do you mind explaining in more detail how you wrote this preStop hook? If I do an exec, I would have to have a script in the same Docker image as my Tomcat; otherwise, if I do an HTTP call that doesn't return, the call will time out.

How did you do it? Did you add a script in your Tomcat container that calls Tomcat? Could it be done with another container in the pod? Thanks

@dcowden
Author

dcowden commented Mar 4, 2020

Hi @miclefebvre

How did you do it? Did you add a script in your Tomcat container that calls Tomcat?

Yes, our script is in the same container as Tomcat, and we hook a drain script to a preStop hook. The script calls a URL provided by Tomcat that responds with the number of user sessions remaining. When there are no more sessions, or when we have reached our timeout, we finish draining.

Here's the important bit of our drain.sh script:

# notional logic:
# wait for ptplace to terminate as long as:
#   the drain endpoint returns a 2XX within 10 seconds AND
#   the drain response contains the word "DRAINING"
# return 0 if we terminated due to a 2xx response that DIDN'T include draining
# otherwise, return 1 (we timed out waiting to drain)

while true
do
    debug_msg "Checking ${DRAIN_URL}, timeout=${DRAIN_URL_TIMEOUT_SECS}s"
    echo "" > $RESULT_FILENAME
    curl -s -f --retry 2  --max-time $DRAIN_URL_TIMEOUT_SECS -o $RESULT_FILENAME --no-buffer "${DRAIN_URL}" 
    
    STATUS_RESULT=$?
    debug_msg "curl result code: $STATUS_RESULT"

    if [ $STATUS_RESULT -ne 0 ]; then
      info_msg "Drain returned non 2xx. Terminating"      
      exit 1
    fi

    info_msg "Received Result::"
    cat $RESULT_FILENAME

    grep -i -c $STILL_DRAINING $RESULT_FILENAME
    if [ $? -ne  0 ]; then
        info_msg "Draining Complete. Terminating."
        exit 0        
    else
        debug_msg "Still Draining. Waiting $DRAIN_URL_INTERVAL_SECS seconds.."
        sleep $DRAIN_URL_INTERVAL_SECS
    fi    

done

It's worth noting that we terminate if we receive a non-2XX from Tomcat -- that's there in case Tomcat has become unresponsive while we're draining, which happened once in production. If Tomcat's already hosed, then the user sessions there don't matter. That might seem unlikely, but in our case sessions last a LONG time (our users are typically active for nearly an entire business day).

Could it be done with another container in the pod? Thanks

Our pod has only one container, but I suppose it would work with multiple containers, because it's based on making an HTTP call into Tomcat.

@miclefebvre

Thanks a lot @dcowden,

I will give it a try. But we are using Jib as the base image, and I'm not sure if there's curl or anything like that in it. I'll see if we can do this in another container or if I should change my base image.

@ogarrett ogarrett added the proposal label Mar 11, 2021
@deepakkhetwal

Hi @dcowden, would you like to share your sticky-session configuration for the HAProxy ingress?

@github-actions

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the stale label May 29, 2021
@pleshakov pleshakov removed the stale label Jun 1, 2021
@alessandroargentieri

Hello, does anyone know if this feature has been added in any NGINX ingress controller implementation, the way HAProxy does it?

@brianehlert
Collaborator

I think this is valuable to keep around as a general-purpose behavior, without any dependency on sticky sessions, since how the back-end/upstream pod shuts down is up to the application developer or operator, and the ingress controller should simply behave consistently whether the upstream takes 2 minutes, 2 hours, or 2 days to bleed off.

I believe the current behavior is to remove the upstream when it is not in the ready state, which is different from the drain behavior of NGINX.

To update this with the current state of the API, we would need to set an upstream to drain when the pod state is terminating according to EndpointSlices:
https://kubernetes.io/docs/concepts/services-networking/endpoint-slices/#conditions
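
For illustration, an EndpointSlice entry for a pod that is shutting down but still able to serve looks roughly like this (a sketch; the names and address are made up):

apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: tea-svc-abc12  # example name
  labels:
    kubernetes.io/service-name: tea-svc
addressType: IPv4
ports:
- name: http
  port: 80
  protocol: TCP
endpoints:
- addresses:
  - "10.1.2.3"
  conditions:
    ready: false       # removed from normal load balancing
    serving: true      # can still handle existing (sticky) sessions
    terminating: true  # the signal that would map to NGINX's drain state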

This way, any preStop hooks or other flows can be executed as outlined here: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination

https://nginx.org/en/docs/http/ngx_http_upstream_module.html#server
Note: this is not available for stream

@nginxinc nginxinc locked and limited conversation to collaborators Jul 4, 2024
@brianehlert brianehlert converted this issue into discussion #5962 Jul 4, 2024

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →
