Should be able to have clusters with dedicated brokers and data nodes #1934

Closed
pauldix opened this issue Mar 12, 2015 · 8 comments · Fixed by #2175

@pauldix
Member

pauldix commented Mar 12, 2015

Currently, all servers in a cluster come up as both a broker and a data node. We should be able to spin up servers only as brokers and have other servers come up only as data nodes.

@corylanou corylanou self-assigned this Mar 12, 2015
@beckettsean beckettsean modified the milestones: 0.9.0, Next Release Mar 14, 2015
@pauldix pauldix modified the milestones: 0.9.0, Next Point Release Mar 14, 2015
@toddboom toddboom modified the milestone: 0.9.0 Mar 14, 2015
@pauldix pauldix added this to the 0.9.0 milestone Mar 25, 2015
@otoolep
Contributor

otoolep commented Mar 25, 2015

We should consider completely redoing the current code in this area. The use of directories to indicate what "role" a node is in has caused lots of problems in the implementation. I think we should consider a simple file on disk that states "I'm a broker, I'm a data node, or I'm both". In other words, write the "role" to disk, and switch on the contents of that file on start-up. If the file does not exist, assume "combined".

Right now the code in this area is pretty unclear. An explicit "role" file may be the answer.

@jwilder
Contributor

jwilder commented Apr 1, 2015

Based on #1426, this is a proposed design change that we'd like feedback on.

Currently we have Data, Broker, Snapshot and Admin ports, which are all bound on a common bind address. There is also a mix of public API endpoints (/query, /write, etc.) commingled with cluster communication endpoints (/data_nodes_index, /data_nodes_create, etc.) that are served over the current Data port. This makes it difficult to limit access to cluster communication endpoints while allowing more open access to the public API. In addition, it may be desirable to segregate public API and internal cluster communication on separate network interfaces for performance and additional security. This is currently not possible.

We propose the following changes:

  • Bind-Address - Default bind address used by listeners. This can be overridden for each port below to support cluster communication only on an internal interface but allow API traffic on a public interface.
  • Cluster Port [bind-address]:8085 - The port used for all inter-node communication. This would include the:
    • Broker: /raft, /messaging
    • Data Node: /data_nodes_*, /metastore, /process_continuous_queries
  • Admin Port [bind-address]:8083
    • Current Admin UI
    • Snapshot: (currently a separate snapshot port)
  • API Port [bind-address]:8086
    • Data Node: /write, /query, /ping, /dump

With this proposal, we get the following:

  • All cluster communication occurs over a dedicated port (default 8085).
  • All public API endpoints are exposed over a dedicated port (default 8086).
  • Ability to separate cluster and public API traffic on different network interfaces
  • Ability to run all endpoints on the same interface and port if desired
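
The bind-address override described above could be sketched in Go roughly as follows. This is an illustration under assumptions, not the project's config code: the `Config` and `Listener` types, their field names, and the example IP addresses are hypothetical; only the default ports (8085, 8083, 8086) and the "default bind address, overridable per port" rule come from the proposal.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// Listener describes one of the three proposed ports (hypothetical type).
type Listener struct {
	BindAddress string // optional override; empty means "use the default"
	Port        int
}

// Config holds the default bind address plus the three listeners.
type Config struct {
	BindAddress string
	Cluster     Listener // inter-node traffic: /raft, /messaging, /data_nodes_*, ...
	Admin       Listener // admin UI and snapshots
	API         Listener // public API: /write, /query, /ping, ...
}

// Addr resolves the host:port a listener should bind to, applying the
// per-listener override when one is set.
func (c Config) Addr(l Listener) string {
	host := c.BindAddress
	if l.BindAddress != "" {
		host = l.BindAddress
	}
	return net.JoinHostPort(host, strconv.Itoa(l.Port))
}

func main() {
	cfg := Config{
		BindAddress: "0.0.0.0",
		// Keep cluster traffic on an internal interface (example address).
		Cluster: Listener{BindAddress: "10.0.0.5", Port: 8085},
		Admin:   Listener{Port: 8083},
		API:     Listener{Port: 8086},
	}
	fmt.Println("cluster:", cfg.Addr(cfg.Cluster))
	fmt.Println("admin:  ", cfg.Addr(cfg.Admin))
	fmt.Println("api:    ", cfg.Addr(cfg.API))
}
```

Giving all three listeners the same port and no override would collapse them onto a single address, which corresponds to the last bullet above.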

cc @pauldix @benbjohnson @otoolep

@benbjohnson
Contributor

I'd rather support the simple common use case first and allow users to separate out ports as needed. Most people will probably stand up an influxd server behind a firewall and just let internal servers hit it. In that case it makes more sense to have one single port that users can open to select servers. Then they can break off individual services on different ports as needed. e.g. start everything on :8086 and let them break off from there.

I do like the separation of cluster port, admin port, & API port though. That seems like a reasonable separation.

@otoolep
Contributor

otoolep commented Apr 1, 2015

I too see the need for separate ports, and like the ideas above. However, I don't see why we have the Admin UI on a different port. It should be on the API port, as far as I can see. I don't see any advantages to having it on a different port.

@jwilder
Contributor

jwilder commented Apr 2, 2015

If we remove the admin port, the admin interface will need to be served from /admin or a similar path. @toddboom any concerns about that?

@pauldix
Member Author

pauldix commented Apr 2, 2015

@jwilder I think this design makes a lot of sense. I've actually had issue #1426 open about this for a while. The only change I'd make is that I'd put the /dump endpoint in the admin group. That one can end up with a ton of data and it's probably not something they'd want in the publicly available API.

jwilder added a commit that referenced this issue Apr 2, 2015
This is a pre-requisite for #1934.  When running separate
broker and data nodes, you currently need to know what role
a host is performing.  This complicates cluster setup in
that you must configure separate broker URLs and data node
URLs.

This change allows a broker-only node to redirect data node endpoints
to a valid data node and a data-only node to redirect broker
endpoints to a valid broker.
@toddboom
Contributor

toddboom commented Apr 3, 2015

@jwilder no issues for me on the admin interface. I think it would make some people happier.

@jwilder
Contributor

jwilder commented Apr 3, 2015

Ok. We'll use this as the target design for this issue.

jwilder added a commit that referenced this issue Apr 4, 2015
This is a pre-requisite for #1934.  When running separate
broker and data nodes, you currently need to know what role
a host is performing.  This complicates cluster setup in
that you must configure separate broker URLs and data node
URLs.

This change allows a broker-only node to redirect data node endpoints
to a valid data node and a data-only node to redirect broker
endpoints to a valid broker.