
Enter drain mode when a pod is terminating and sticky-sessions are enabled #267

Closed
dcowden opened this issue Apr 9, 2018 · 21 comments
Labels
enhancement (Pull requests for new features/feature enhancements), proposal (An issue that proposes a feature request)

Comments

@dcowden

dcowden commented Apr 9, 2018

We have a legacy application (Tomcat/Java) which needs sticky sessions. When we deploy new versions of our applications, we need to stop sending new connections to a server, while still sending bound sessions to the old server. Please note: this is not referring to in-flight requests; we need the active Tomcat sessions to expire, which normally takes a few hours.

This is possible using the nginx drain command, which will keep sending bound connections to the old server but send new ones elsewhere. But in Kubernetes, calling a command on the ingress controller is not part of the deployment flow. To do it with current tools, we would need to add a preStop hook to our application. In that hook, we'd need to access the ingress controller and ask it to drain with an API call. We'd rather not introduce the ability for applications to call APIs on the ingress controller.

When Kubernetes terminates a pod, it enters the TERMINATING status. In nearly all cases, when sticky sessions are enabled, the desired behavior is probably to put the associated pod into drain mode. Is this possible with the nginx-plus ingress controller?

We currently use the Kubernetes-maintained nginx ingress controller. This feature would make it worth the money to use nginx-plus.

Aha! Link: https://nginx.aha.io/features/IC-110

@pleshakov
Contributor

@dcowden Maybe the following approach can work for you?

If you want to drain particular pods, you change the corresponding Ingress resource by adding an annotation that specifies which pods to drain using a label query. For example:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: cafe-ingress
  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.com/drain: "version=0.1"
spec:
  rules:
  - host: "cafe.example.com"
    http:
      paths:
      - path: /tea
        backend:
          serviceName: tea-svc
          servicePort: 80

In this case, the Ingress controller will drain all the pods corresponding to the tea-svc with the label version=0.1. This will allow you to specify which pods to drain during an application upgrade.

Please note that this is not available yet, but we can add it.

@dcowden
Author

dcowden commented Apr 10, 2018

Hmm, that's an interesting approach, but I don't think it would work well for us.

Today we use fairly conventional deployments, in which the deployment controller scales pods up and down. Under the hood it does this with ReplicaSets, I think. We do not re-publish our Ingresses as part of deployments, and this approach would require doing that.

Given our current flow, it would be much more seamless if a pod in TERMINATING status was automatically drained. This would cover several situations:

  • A deployment, in which case a pod is terminating on purpose for shutdown
  • A node drain, in which case we're deleting a pod so we can perform maintenance on a node
  • An OOM kill, when k8s is terminating a pod because it is out of memory.

In reality, if you are running sticky sessions, I can't think of any case where you wouldn't want to drain a pod when it is terminating rather than immediately removing it from service.

In practice there is still other work needed to make this work, because Kubernetes has to have a way to know when it's OK to actually kill the pod. This is accomplished by registering a preStop hook, which runs and waits for all of the active sessions to be gone. If the hook finishes, or the pod kill grace period expires, Kubernetes kills the pod, which will make it fail the health checks and be removed from nginx.
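
For reference, the shape of that hook in the pod spec is roughly the following (just a sketch; the script path, image, and 12-hour grace period are placeholders for whatever your app needs):

apiVersion: v1
kind: Pod
metadata:
  name: legacy-app
spec:
  # Give the pod enough time to bleed off its sessions before it is force-killed.
  terminationGracePeriodSeconds: 43200  # 12 hours; placeholder value
  containers:
  - name: tomcat
    image: tomcat:9  # placeholder image
    lifecycle:
      preStop:
        exec:
          # Hypothetical script that blocks until no active sessions remain,
          # or until it gives up and lets the grace period take over.
          command: ["/bin/sh", "-c", "/opt/app/wait-for-sessions.sh"]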

@pleshakov
Contributor

@dcowden thanks for providing more details.

It looks like it is possible to accomplish session draining through the Ingress controller.

Unfortunately, once a pod enters the terminating state, its endpoint is removed from Kubernetes, which makes the Ingress controller remove that endpoint from the NGINX configuration. Thus, in order to retain that endpoint in the NGINX configuration when the endpoint is being removed, we must make an additional query to the Kubernetes API to check whether the corresponding pod is in the terminating state. If that is the case, we need to drain it instead of removing it. Also, once the pod is successfully removed, we need to make sure that it is also removed from the NGINX configuration.

Do you think the logic above will cover your use case?

I can prepare a PR which implements that logic and share it with you, if you'd like to test it.

@dcowden
Author

dcowden commented Apr 10, 2018

Yes, I think that's the logic... at least as near as I can tell without actually implementing it. I'd be happy to test it. I'm also open to alternate ways of working if they can accomplish the objective with less work.

As a side note, this use case once again validates the decision NOT to use the k8s service abstraction, because it's pretty clear that the endpoint would become inaccessible. IIRC, there's a 'use service=true' flag, which would be incompatible with using this functionality.

@isaachawley isaachawley added the enhancement label Jul 30, 2018
@victor-frag

Hello,

I am facing the same scenario as @dcowden, with a Java app using Tomcat and sticky sessions. Do we have this implemented already?

@tkgregory

I also have this requirement for Tomcat instances that require session affinity. During a deployment, existing bound sessions should still be routed to the same instances, with new sessions routed to the new instances.

Could we get an update on this please?

@dcowden
Author

dcowden commented Feb 18, 2019

@tkgregory @victor-frag we are using https://github.com/jcmoraisjr/haproxy-ingress, which implements this functionality. We've been using it in production for a while -- it's been stable, well supported, and actively updated.

@irizzant

irizzant commented Apr 18, 2019

I'd like to understand this as well.
We have a JBoss AS instance hosting our web application, which has exactly the same problems.
We switched to haproxy and we currently handle rolling updates just fine.

As @pleshakov suggested, it should be possible using the same approach taken for haproxy:

Unfortunately, once a pod enters the terminating state, its endpoint is removed from Kubernetes, which makes the Ingress controller remove that endpoint from the NGINX configuration. Thus, in order to retain that endpoint in the NGINX configuration when the endpoint is being removed, we must make an additional query to the Kubernetes API to check whether the corresponding pod is in the terminating state. If that is the case, we need to drain it instead of removing it. Also, once the pod is successfully removed, we need to make sure that it is also removed from the NGINX configuration.

I'd also add that the above should happen only if session affinity is enabled, and there should be no need for an additional query to the Kubernetes API, since ingress controllers should be automatically notified when pods enter the termination phase.

@amodolo

amodolo commented Jun 26, 2019

@tkgregory @victor-frag we are using https://github.com/jcmoraisjr/haproxy-ingress, which implements this functionality. We've been using it in production for a while -- it's been stable, well supported, and actively updated.

@dcowden, I'm figuring out how to use haproxy to keep the application alive until the sessions terminate. But at the moment, when I deploy a new version of the application, the old pods are terminated and the new ones are started, regardless of whether there are active sessions.
Can you explain how you solved this? What ingress/haproxy configuration did you use?

Thx

@dcowden
Author

dcowden commented Jun 26, 2019

Hi @amodolo,
We set up our Tomcat container with a preStop hook that doesn't return until there are no active sessions left, or until a timeout that's long enough that we're comfortable the session isn't a real user.

In our case, we wrote a small servlet, deployed with the app, that returns the number of sessions left using JMX. There are other ways for sure, but it works OK for us until we can do better.

@amodolo

amodolo commented Jun 26, 2019

I've just implemented your solution and it seems to work like a charm.

Thx a lot (also for the super fast response 😄)

@dcowden
Author

dcowden commented Jun 26, 2019

@amodolo glad it worked for you! FWIW, we have been using this solution in production for about a year now. We run a 24x7 platform-- but humans do not work 24x7. We simply wait for sessions to die, or 12 hours, whichever comes first. When we execute a build, we'll have extra pods out there serving the old workloads for 1/2 day till they die. It works pretty well.

The main negative (and why it's not THE solution) is that it limits your iteration velocity on new code in production to about once a day.

@amodolo

amodolo commented Jun 28, 2019

The main negative aspect of this solution is this: suppose you have one server with 2 active sessions and you are rolling out a new application version. The old pod will enter drain mode until the sessions die (or the grace period expires). Suppose also that the sticky session is based on the cookie generated by HAProxy. In this configuration, if one of the two users logs out of the application, the HAProxy cookie is not removed until the user closes the browser (because it is a session cookie); so if that user logs out and then logs in again (without closing the browser), they will be balanced to the same old pod.
A better approach could be to configure the ingress to use the JSESSIONID cookie generated by the server. In this case, if your application removes the session cookie on logout, the user will be immediately balanced to one of the new pods after logout.
I hope I've explained myself clearly. What do you think?

@dcowden
Author

dcowden commented Jun 28, 2019

Yes, we use haproxy in rewrite cookie mode, and use a separate cookie. I think using JSESSIONID would work too. Another requirement is that when a particular user on an old pod logs out and logs back in, we want to be guaranteed that they switch to a new pod. That ends up being important sometimes.
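
Roughly, the affinity side of our Ingress looks something like this (a sketch from memory; the host, service, and cookie names are examples, and the exact annotation names depend on the haproxy-ingress version, so double-check its docs):

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: legacy-app
  annotations:
    kubernetes.io/ingress.class: "haproxy"
    # Cookie-based affinity; we rewrite a dedicated cookie rather than relying on JSESSIONID.
    ingress.kubernetes.io/affinity: "cookie"
    ingress.kubernetes.io/session-cookie-name: "SERVERID"
spec:
  rules:
  - host: "app.example.com"
    http:
      paths:
      - path: /
        backend:
          serviceName: legacy-app-svc
          servicePort: 8080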

@miclefebvre

@dcowden @amodolo

We set up our Tomcat container with a preStop hook that doesn't return until there are no active sessions left, or until a timeout that's long enough that we're comfortable the session isn't a real user.

In our case, we wrote a small servlet, deployed with the app, that returns the number of sessions left using JMX. There are other ways for sure, but it works OK for us until we can do better.

Do you mind explaining in more detail how you wrote this preStop hook? If I do an exec, I would have to have a script in the same Docker image as my Tomcat; otherwise, if I do an HTTP call that doesn't return, the call will time out.

How did you do it? Did you add a script in your Tomcat container that calls Tomcat? Could it be done with another container in the pod? Thanks

@dcowden
Author

dcowden commented Mar 4, 2020

Hi @miclefebvre

How did you do it? Did you add a script in your Tomcat container that calls Tomcat?

Yes, our script is in the same container as Tomcat, and we hook a drain script to a preStop hook. The script calls a URL provided by Tomcat that responds with the number of user sessions remaining. When there are no more sessions, or when we have reached our timeout, we finish draining.

Here's the important bit of our drain.sh script:

# notional logic:
# wait for ptplace to terminate as long as:
#   the drain endpoint returns a 2XX within 10 seconds AND
#   the drain response contains the word "DRAINING"
# return 0 if we terminated due to a 2xx response that DIDN'T include draining
# otherwise, return 1 (we timed out waiting to drain)

while true
do
    debug_msg "Checking ${DRAIN_URL}, timeout=${DRAIN_URL_TIMEOUT_SECS}s"
    echo "" > $RESULT_FILENAME
    curl -s -f --retry 2  --max-time $DRAIN_URL_TIMEOUT_SECS -o $RESULT_FILENAME --no-buffer "${DRAIN_URL}" 
    
    STATUS_RESULT=$?
    debug_msg "curl result code: $STATUS_RESULT"

    if [ $STATUS_RESULT -ne 0 ]; then
      info_msg "Drain returned non 2xx. Terminating"      
      exit 1
    fi

    info_msg "Received Result::"
    cat $RESULT_FILENAME

    grep -i -c $STILL_DRAINING $RESULT_FILENAME
    if [ $? -ne  0 ]; then
        info_msg "Draining Complete. Terminating."
        exit 0        
    else
        debug_msg "Still Draining. Waiting $DRAIN_URL_INTERVAL_SECS seconds.."
        sleep $DRAIN_URL_INTERVAL_SECS
    fi    

done

It's worth noting that we terminate if we receive a non-2XX from Tomcat -- that's there in case Tomcat has become unresponsive while we're draining, which happened once in production. If Tomcat's already hosed, then the user sessions there don't matter. That might seem unlikely, but in our case sessions last a LONG time (our users are typically active for nearly an entire business day).

Could it be done with another container in the pod? Thanks

Our pod has only one container, but I suppose it would work with multiple containers, because it's based on making an HTTP call into Tomcat.

@miclefebvre

Thanks a lot @dcowden,

I will give it a try. But we are using Jib as the base image, and I'm not sure if there's curl or anything like that in it. I'll see if we can do this in another container or if I should change my base image.

@ogarrett ogarrett added the proposal label Mar 11, 2021
@deepakkhetwal

Hi @dcowden, would you like to share your sticky-session configuration for the HAProxy ingress?

@github-actions

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the stale label May 29, 2021
@pleshakov pleshakov removed the stale label Jun 1, 2021
@alessandroargentieri

Hello, does anyone know if this feature has been added in any NGINX ingress controller implementation, the way HAProxy does it?

@brianehlert
Collaborator

I think this is valuable to keep around as a general-purpose behavior, without any dependency on sticky sessions, since how the back-end/upstream pod shuts down is up to the application developer or operator, and the ingress controller should simply behave consistently whether the upstream takes 2 minutes, 2 hours, or 2 days to bleed off.

I believe the current behavior is to remove the upstream when it is not in the ready state, which is different from the drain behavior of NGINX.

To update this with the current state of the API, we would need to set an upstream to drain when the pod state is terminating according to EndpointSlices:
https://kubernetes.io/docs/concepts/services-networking/endpoint-slices/#conditions
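
For illustration, an EndpointSlice entry for a pod that is shutting down but still able to serve looks roughly like this (a sketch; the names and address are made up):

apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: tea-svc-abc12  # example name
  labels:
    kubernetes.io/service-name: tea-svc
addressType: IPv4
ports:
- name: http
  port: 80
  protocol: TCP
endpoints:
- addresses:
  - "10.1.2.3"
  conditions:
    ready: false       # removed from normal load balancing
    serving: true      # can still handle existing (sticky) sessions
    terminating: true  # the signal that would map to NGINX's drain state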

This way, any preStop hooks or other flows can be executed as outlined here: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination

https://nginx.org/en/docs/http/ngx_http_upstream_module.html#server
Note: this is not available for stream

@nginxinc nginxinc locked and limited conversation to collaborators Jul 4, 2024
@brianehlert brianehlert converted this issue into discussion #5962 Jul 4, 2024

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →
