Make connections sticky to a node instance behind the LB #207

Closed
wants to merge 1 commit into from


@crzidea crzidea commented Jan 3, 2014

Load balancers can use worker id to target a specific node instance behind the LB. Just add a workerId option like this:

var io = eio.attach(httpServer, {workerId: cluster.worker.id});

See also socketio/engine.io-client#220

@defunctzombie
Contributor

Could you explain this PR and why it can't be done any other way except with support in engine.io?

@crzidea
Author

crzidea commented Jan 6, 2014

I have tried many ways of load balancing engine.io sockets, but none of them is reliably supported:

  1. Cookies or sticky sessions can't be used by IE with cross-domain JSONP.
  2. IP address based balancing doesn't work well with proxies and isn't reliable enough when too many clients share the same IP address.

Finally I tried query string based load balancing. It is very easy to configure and is widely supported.
It should be supported in engine.io because all of these steps have to happen right after the socket handshake; otherwise the client can't let the load balancers know which worker (or node instance) should serve this socket. The only way I can do that today is by overriding the handshake method.

@mokesmokes
Contributor

Personally I think this PR is only going half way: if the connection can be made sticky in this manner then the PR should go all the way and support targeting also on the initial request. Then the app developer can have a choice between this targeting scheme and the header based scheme, which must be supported since several Node vendors already implemented header-based LB. As a general note, I'm not sure if a URL param LB scheme is practical across all HTTP methods, so while this may work for engine.io it may not be compatible with the rest of the app - so is this for an edge case?
And why is IP address stickiness not reliable? And could you also load balance on the source port?

@defunctzombie
Contributor

Have you tried DNS round robin or simply having separate DNS names? a.example.tld, b.example.tld ? Load balancing is not a one size fits all solution as there are many ways to attack this problem. It seems that many of the PRs and issues raised on this repo about load balancing have to do with using a single proxy server for the connection that then proxies to various workers on the backend. There are many ways to even have this particular setup (IP sharding, headers, cookies) and they depend on your needs and possibly imposed requirements. It may be useful just to create a wiki page with various examples on how to load balance under different circumstances. I am not confident a library patch is going to solve this.

@mokesmokes
Contributor

I am now actually convinced that query string is the way to go, for the client to hint to the LB what to target. However, the only thing that needs to be done is to be able to set a new query string on open(), and not just in the constructor. Everything else should be done outside of engine.io. So it should be a 2-3 line PR in the client, nothing server side.

@crzidea
Author

crzidea commented Jan 7, 2014

@mokesmokes Could you please explain how the client would know which worker to connect to on the first handshake? And would this approach actually solve your problem?
@defunctzombie In general, engine.io-client must issue 3 GET requests to establish a stable websocket connection. So how do you make sure all 3 requests are handled by the same worker when using DNS round robin?

@defunctzombie
Contributor

@crzidea depends on your setup :) It will always depend on your setup. If you want to hop on #socket.io and ping me about it, I am happy to brainstorm ideas.

@mokesmokes
Contributor

@crzidea very simple - I'm already implementing it in my app:

  1. client issues GET myapp.com/getServerID?p1=qaz&p2=wsx etc. Server replies with {"serverId" : "someTagYourLoadBalancerKnows"}. Note this request is load balanced randomly to one of my servers which does a lookup to get the serverID - I don't care where it goes. I care about my engine.io connection.
  2. engine.io open request with query string "serverId=someTagYourLoadBalancerKnows"
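The two steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names `parseServerId` and `buildOpenQuery` are hypothetical, the HTTP request itself is left out so the snippet runs offline, and the reply body is the example value from step 1.

```javascript
// Step 1 (hypothetical helper): extract the tag from the JSON reply
// of the GET /getServerID request described above.
function parseServerId(body) {
  const reply = JSON.parse(body);
  if (typeof reply.serverId !== "string" || reply.serverId === "") {
    throw new Error("reply has no serverId");
  }
  return reply.serverId;
}

// Step 2 (hypothetical helper): build the query string that the
// engine.io open request would carry, so the LB can route on it.
function buildOpenQuery(serverId) {
  return new URLSearchParams({ serverId }).toString();
}

const serverId = parseServerId('{"serverId": "someTagYourLoadBalancerKnows"}');
const query = buildOpenQuery(serverId);
console.log(query); // serverId=someTagYourLoadBalancerKnows
```

How the query string is handed to the engine.io client depends on the client version in use.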

@crzidea
Author

crzidea commented Jan 7, 2014

That is actually the way many apps are going. But it adds another request to our servers, which we should avoid.
Maybe I should close this PR if no one wants this feature in engine.io.

@mokesmokes
Contributor

@crzidea is the engine.io open request really your first app request? If not, stick this lookup in one of the earlier GET requests and that's it. You can even stick the serverId in your rendered template, etc.

@crzidea
Author

crzidea commented Jan 7, 2014

Yes, the open request really is my first request. I don't even serve any HTML in my case. It sounds strange, but it really is so.

@mokesmokes
Contributor

Well, in this case you may have no choice but to do the extra initial request. But look at the bright side: you have a real opportunity to make an informed decision about which server to target, so your overall performance and scalability may actually be higher.

@mmastrac

mmastrac commented Jan 7, 2014

I've been using the following with haproxy and engine.io 0.3.9 (an older version, but has this changed?) with great success:

stick-table type string len 32 size 400K expire 10m
stick on url_param(uid)

@rauchg
Contributor

rauchg commented Jan 7, 2014

@mmastrac would be cool to have that in a wiki article ("Setting up engine.io with HAProxy")

@mmastrac

mmastrac commented Jan 7, 2014

Definitely... the code is still running today and was benchmarked against 12k+ real-life connections on a major content company's site.

The only thing is that automated UID generation on connections seems to have been removed in this commit, which was essential to making it work properly:

socketio/engine.io-client@04b3da0

I haven't tested recently, but setting opts.query.uid = (something) might be a good replacement.
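A minimal sketch of that idea, assuming a Node.js environment and that the client options are a plain object with an optional `query` field. The helper names `generateUid` and `withUid` are hypothetical, and whether a given engine.io-client version forwards `opts.query.uid` on every request is untested here, as noted above.

```javascript
const crypto = require("crypto");

// Generate a 32-character hex string (16 random bytes), which fits the
// `len 32` string key of the stick-table shown earlier.
function generateUid() {
  return crypto.randomBytes(16).toString("hex");
}

// Return a copy of the options with query.uid set, keeping any query
// parameters that were already present.
function withUid(opts, uid = generateUid()) {
  return { ...opts, query: { ...(opts.query || {}), uid } };
}

const opts = withUid({ query: { token: "abc" } });
console.log(opts.query.uid.length); // 32
```

The uid has to be set before the connection is opened so that every request of the handshake carries the same value.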

@rauchg
Contributor

rauchg commented Jan 7, 2014

Right, the goal was to move it to userland.

@mokesmokes
Contributor

So please accept this one :)
socketio/engine.io-client#221

@crzidea
Author

crzidea commented Jan 8, 2014

@mmastrac That is actually what I want. Thank you so much!

@crzidea crzidea closed this Jan 8, 2014
mokesmokes pushed a commit to mokesmokes/engine.io that referenced this pull request Jan 26, 2014
mokesmokes pushed a commit to mokesmokes/engine.io that referenced this pull request Feb 2, 2014
@peteruithoven

@mmastrac, could you clarify whether your solution (using uid in HAProxy) still works?

Looking at the query string in a browser I don't see this uid anymore, but I do see sid (socket id). Is this something we can use in HAProxy? This sid is received from the server on the first request and is then added to every subsequent request.
If this sid isn't useful, would a user-specific URL parameter work? We already use one in an API for authentication, and it would be great if we could reuse it for sticky sessions.

@mmastrac

mmastrac commented Nov 4, 2014

@peteruithoven -- I mentioned this in the comment above, but automatic UID generation was removed from socket.io core. You'll have to set opts.query.uid to something before your connection is made which will restore the functionality.

@peteruithoven

@mmastrac, is this something the client has to do? Because that's hard to explain to API users.

Since Socket.io depends on sticky sessions it would be great if they could implement this in their socket.io-client. But then again, I understand their reasoning in wanting to define the socket id on the server (to prevent conflicts).

I'll try setting up sticky sessions using the existing sid or our own user specific key (which we also use for authentication) and report back.

@peteruithoven

We are now using our user authentication url parameter (key) for sticky sessions in HAProxy, seems to work great!
Relevant part of the backend config:

# Use url_param for load balancing (same key goes to same server if avail)
# See http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#4.2-balance%20url_param
balance             url_param key

server nodeapp1 127.0.0.1:5000 check
server nodeapp2 127.0.0.1:5001 check
server nodeapp3 127.0.0.1:5002 check
server nodeapp4 127.0.0.1:5003 check
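To illustrate why the same key keeps landing on the same server, here is a rough sketch of hash-based selection. It is not HAProxy's actual algorithm: the hash function below is a simple stand-in, equal server weights are assumed, and `hashKey` and `pickServer` are hypothetical names.

```javascript
// Server names taken from the backend config above.
const servers = ["nodeapp1", "nodeapp2", "nodeapp3", "nodeapp4"];

// Stand-in string hash (not the one HAProxy uses); kept as an
// unsigned 32-bit integer.
function hashKey(key) {
  let h = 0;
  for (let i = 0; i < key.length; i++) {
    h = (h * 31 + key.charCodeAt(i)) >>> 0;
  }
  return h;
}

// Map a key onto one of the currently running servers. The result is
// stable for a given key as long as the list of running servers is unchanged.
function pickServer(key, running) {
  return running[hashKey(key) % running.length];
}

const first = pickServer("user-key-123", servers);
const second = pickServer("user-key-123", servers);
console.log(first === second); // true
```

Note that in this sketch a change in the set of running servers can remap keys, which matches the "if avail" caveat in the config comment.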

darrachequesne pushed a commit that referenced this pull request May 8, 2020
6 participants