Consider adding more docs on RethinkDB Proxy #962
Comments
The main documentation is the (short) section called Running a proxy node in the "Scaling, sharding and replication" document; that's also linked under the Proxy nodes heading in "Optimizing query performance," from the end of the "Command line options" document, and from Scaling considerations in the main Changefeed documentation. That section is relatively short, and if there's stuff that you think it's missing we can add it. (For instance, it doesn't address "what the network config needs to look like.") And if there are other places we need to link it, we can. Unless there's a lot missing I'm not sure whether this really needs to be in its own document, but if you don't think it belongs under "Scaling" maybe there's a case for a better location. |
OK, cool! I actually missed that while looking for docs somehow. I think once rethinkdb/rethinkdb#5138 is in we should update that section with information on how to automatically start a proxy node (and probably also add some information about the network configuration while we're in there). Assigning to myself in the meantime. |
@mlucy This is currently assigned to you. Are you still planning to write something up for this or should we reassign? |
I'm going to go ahead and reassign this to myself in order to get some details together. @mlucy Please complain if you have started writing anything up for this. |
@danielmewes -- I never did anything on this. We should probably fix rethinkdb/rethinkdb#5138 while we're at this though. |
@chipotle Here's a write-up of some of the things that we should probably cover:
What is a RethinkDB proxy?
A RethinkDB proxy is a RethinkDB server that doesn't store any persistent state, but
A typical use case is running a RethinkDB proxy instance locally on each application
TODO: Maybe insert a figure like the following one that compares a setup without and with proxies.
More precisely, a proxy:
However, a RethinkDB proxy still provides the following features:
To start a RethinkDB proxy, you can run
When should I use a RethinkDB proxy?
Primary use cases include:
We will see in the next section how proxy servers can achieve these objectives.
How can a proxy improve performance?
With a proxy running locally on each application
Since proxies also perform certain query processing steps themselves, they can also help
Changefeeds
In addition to regular queries, a proxy can also be a very powerful tool for scaling
A proxy will manage changefeeds locally, and reduce the overhead (RAM, CPU and network)
This even works if the clients are listening to different selections on the table, such
How do I run it?
A proxy has fewer requirements to the system's hardware compared to a regular database
That being said, proxies still benefit from fast CPUs, and require enough RAM to process
Note that in contrast to a regular client that can connect to a single server, a proxy
If all requirements are met, you can run a proxy through the
TODO: Mention/Describe rethinkdb/rethinkdb#5138 once available
TODO: Maybe describe a few "best practice" scenarios in more detail. E.g. |
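As a rough illustration of the "How do I run it?" part above, here is a minimal Python sketch that assembles a `rethinkdb proxy` command line. It assumes the `rethinkdb` binary is on the PATH and that the proxy is pointed at cluster nodes with `--join host:port`, where 29015 is the default cluster port. The helper name `proxy_command` and the `db1.example.com` / `db2.example.com` hosts are made up for this example and do not come from the write-up.

```python
import shlex
from typing import List, Sequence


def proxy_command(join: Sequence[str], binary: str = "rethinkdb") -> List[str]:
    """Build the argument list for starting a RethinkDB proxy.

    `join` holds "host:port" addresses of existing cluster nodes
    (hypothetical values in this sketch).
    """
    if not join:
        raise ValueError("a proxy needs at least one cluster node to join")
    args = [binary, "proxy"]
    for address in join:
        # One --join option per cluster node the proxy should contact.
        args += ["--join", address]
    return args


if __name__ == "__main__":
    # Hypothetical host names; replace with your own cluster nodes.
    cmd = proxy_command(["db1.example.com:29015", "db2.example.com:29015"])
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.Popen` as is. In production, a service manager that restarts the process is probably a better fit than launching it by hand.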
This is fantastic!! Things to maybe expand on:
|
Thanks for the feedback @williamstein . Very useful. We'll try to incorporate that. I think connection pools are still useful, because the proxy is still going to be able to utilize multiple cores better with multiple client connections. |
What about multiple client connections enables better cpu usage in a proxy? |
@hamiltop Yeah that's what I meant. :-) |
@danielmewes Sorry, I was asking a question. Why does that enable better cpu usage? What aspect of multiple client connections leads to more cpu usage? |
Ah, sorry. Each incoming client connection is assigned to one CPU core randomly (or actually round robin I think). A lot of work for any query run through that connection is going to happen on that core. So by using multiple connections and spreading queries across them, you can better utilize multiple CPU cores on the proxy. |
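To make the point about spreading queries across connections concrete, here is a minimal Python sketch of a round-robin connection pool. The class `RoundRobinPool` and its `acquire` method are hypothetical names for this example, and the `connect` argument stands in for whatever your driver uses to open a connection to the proxy. Nothing here is part of the RethinkDB driver API.

```python
import itertools
import threading
from typing import Callable, Generic, List, TypeVar

C = TypeVar("C")


class RoundRobinPool(Generic[C]):
    """Open a fixed number of connections and hand them out in turn."""

    def __init__(self, connect: Callable[[], C], size: int) -> None:
        if size < 1:
            raise ValueError("size must be at least 1")
        # Open all connections up front; each one may land on a
        # different CPU core on the proxy.
        self._conns: List[C] = [connect() for _ in range(size)]
        self._cycle = itertools.cycle(self._conns)
        self._lock = threading.Lock()

    def acquire(self) -> C:
        """Return the next connection in round-robin order."""
        # The lock only guards the choice of connection, not its use.
        with self._lock:
            return next(self._cycle)

    def __len__(self) -> int:
        return len(self._conns)
```

With a real driver, `connect` would be a small function that opens a connection to the local proxy on the driver port (28015 by default), and each query would run on `pool.acquire()`. Whether one connection may be used from several threads at once depends on the driver, so check that before sharing a pool across threads.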
Interesting. Is that true for normal cluster connections? (not just proxies) |
This is true for normal servers as well. However, on a normal server, there are more tasks that are not depending on
|
I think most of the information is here. Handing over to @chipotle . |
It would also be great to see some recommendations on starting the proxy and restarting it if the process dies or if the server is rebooted. Looking at rethinkdb/rethinkdb#5138, there's a suggestion that this can be done by editing the init script. Not being an expert on this, I wasn't confident enough to jump into /etc/init.d/rethinkdb and start messing around, so I ended up using Upstart instead of altering the init script. I wrote up some notes, as I couldn't find an explanation anywhere of how to do this. I'd welcome feedback on the approach; I'm certainly not experienced with this. |
Thanks for sharing your notes on this @brucepom ( https://medium.com/@brucepomeroy/running-a-rethinkdb-proxy-on-ubuntu-68f8cd308b7b ). While we should fix this more generally in the mid-term (rethinkdb/rethinkdb#5138), it might be nice to mention in our docs how to add the Upstart script in the meantime. @chipotle do you think that's something we could incorporate? |
I set up a RethinkDB proxy on a separate node rather than running the proxy on the app server itself, so my app contacts the proxy node, which in turn fetches the data from the RethinkDB cluster. Is there any way to tell whether the query processing is actually happening on the proxy machine? From netstat I see that, apart from the cluster nodes, my proxy node is connected on port 28015 to an unknown IP (one I didn't use or configure anywhere in the network). |
Hi, |
Here's my attempt to start RethinkDB as a proxy node using systemd in Ubuntu 16.04. Feel free to add to it...
Other useful things... Check the status
Tail the log
|
@bbar Could you have a pull request for this? |
Yes and yes
|
I feel like I've been asked about it a lot in the last month, and as far as I can tell we only talk about it as an aside in the changefeed docs. It might be worth adding a page that talks more about the subject (I'm not quite sure where it would go in our existing scheme).
We should probably mention:
rethinkdb proxy
on system startup. Once rethinkdb#5138 is resolved we should add docs on how to start a proxy node on system startup too.
@chipotle, any thoughts on whether this is worthwhile and where it should go?