Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server,config,cli: offer a separate port for SQL clients #39305

Merged
merged 3 commits into from
Aug 8, 2019

Conversation

knz
Copy link
Contributor

@knz knz commented Aug 4, 2019

(First two commits from #39452)
Fixes #30828.
Fixes #5816.

his patch introduces the ability to split off the SQL server into a
separate port, using the new command-line flag --sql-addr.

The remainder of this commit message details:

  • the motivation for the change
  • some usage documentation for users of the new feature
  • how to introduce this feature in existing clusters
  • reading recommendations to reviewers of this change
  • release note

Motivation

This is a long- and oft-requested feature, aimed at facilitating
deployments in professional networks that use firewalls to fence off
"internal" (server-side) traffic from external (client) traffic.

Usage

How it works (simplified):

  • the flag --sql-addr indicates on which host/port to listen to for
    SQL connections.
  • it is possible to specify --sql-addr to be equal to
    --listen-addr, in which case both will share a single TCP
    connection (internally: using cmux).
  • in fact, the default for --sql-addr is to be equal to
    --listen-addr, for compatibility with previous versions
    of CockroachDB.
  • when --sql-addr and --listen-addr are different, then
    the server does not accept SQL connections any more on the
    --listen-addr address.
  • the flag --advertise-sql-addr complements --sql-addr in
    the same way as --advertise-addr complements --listen-addr.

In addition, the output of cockroach node status (and the contents
of crdb_internal.gossip_nodes) is extended to display the SQL
address.

Note (intermediate): the new flags enables both using separate ports
on the same host address (e.g. listening on 127.0.0.1 with separate
ports for SQL and RPC), and also using the same port on separate
host addresses (e.g. listening on 127.0.0.1:26257 for SQL, and
192.168.2.123:26257 for RPC). Both use cases are legitimate
and have seen demand in the wild.

Note (advanced): the computation of defaults is performed separately
for the host and port part of the flag. The logic is the same as that
used for --http-addr, using default port 26257. For example,
--sql-addr=localhost (without port number) is equivalent to
--sql-addr=localhost:26257.

Note (advanced): the computation of defaults for
--advertise-sql-addr is a bit non-trivial as it pulls its address
and port part separately from --sql-addr, --listen-addr and
--advertise-addr. Effort was made to ensure sane defaults when some
flags but not all are omitted. This is best explained by examples, see
the unit tests in flags_test.go for details.

Upgrading a cluster to use separate ports

If a cluster is already online and the need arises to split the ports,
the flag can be introduced as follows:

  • when keeping port 26257 for SQL:

    1. restart all nodes in a rolling fashion, updating the --join
      flag on each node to add the new RPC address.

    2. restart all nodes in a rolling fashion, adding
      --sql-addr=:26257 and setting --listen-addr=xxx to the new
      RPC address. The previous address with port 26257 can also be
      removed from --join in the same step.

  • when keeping port 26257 for RPC, introducing a new SQL addr/port:

    1. if load balancers are in use, extend the load balancer
      config to also attempt connections on the new SQL address.
      If load balancers are not in use, temporarily add one
      that accepts clients on the old addr/port and redirects the
      connection on both the new addr/port and the old.

    2. restart all nodes in a rolling fashion, updating the --sql-addr
      flag.

Review notes

This change was constructed as follows:

  1. introducing new fields in base.Config
  2. implementing the address validation logic
    in addr_validation.go and corresponding unit tests.
  3. picking up the new fields to listen separately
    in (s *Server) startListenRPCAndSQL()
  4. adding the command line flag parsing and default
    logic in flags.go and corresponding unit tests.
  5. manually testing that indeed the server can
    listen separately on separate ports.
  6. to ensure that most unit tests also exercise
    the split ports, make TestServer/TestCluster
    set the new flags SplitListenSQL
  7. extend the TestServer interface to make the SQL
    address available alongside the RPC address.
  8. update all the tests using a SQL connection
    to use the SQL address
  9. update the CLI test logic from cli_test.go
    to configure the --host flag to the RPC
    or SQL address depending on the command
    being invoked.

At this point all tests except for TestZip would pass
successfully. The remainder of the work is for the benefit of
cockroach zip, which needs to discover the reachable address of
every node in a cluster from the address of just 1 of them, doing so
by inspecting all the node descriptors. cockroach zip is also
peculiar in that it uses both the RPC and SQL interfaces.

  1. equip the node descriptor with a field for the SQL address
    and ensure it gets populated.
  2. make the zip logic fetch the SQL address of the primary
    node by a Node() status request over RPC.
  3. for the loop over all nodes, use the SQL address
    in each node's descriptor for SQL queries separately
    from the RPC logic.

Release note (cli change): CockroachDB now recognizes a flag
--sql-addr which makes it possible to accept connections by clients
on a separate TCP address and/or port number from the one used
for intra-cluster (node-node) connections. This is aimed to enable
firewalling client traffic from server traffic.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@knz knz force-pushed the 20190803-conn branch 2 times, most recently from 3808943 to f2bd945 Compare August 5, 2019 04:34
@knz knz changed the title [DNM/WIP] server start refactor/improvements server,config,cli: make it possible to use a separate port for SQL clients Aug 5, 2019
@knz knz changed the title server,config,cli: make it possible to use a separate port for SQL clients server,config,cli: offer a separate port for SQL clients Aug 5, 2019
@knz knz requested review from mberhault and bdarnell August 5, 2019 04:35
@bdarnell
Copy link
Contributor

bdarnell commented Aug 5, 2019

However, for backward compatibility with pre-19.2 clients, we cannot
both implement option A and make --sql-addr=:5432 the new default
configuration. This would force every client to change the port number
they use to connect.

I think we'd want to keep our one IANA-registered port the public-facing sql interface; we're going to have a lot of docs specifying :26257. So the backwards-compatibility issues are not with the sql port, but the grpc port (I'm not sure what port we'd use for grpc; we're very unlikely to be able to get another one from IANA. I'll say 26258 for now). This has a reasonable transition path:

  1. Restart all nodes with join flags that specify both :26257 and :26258 for the join targets.
  2. Restart all nodes with --sql-addr=:26257 --listen-addr=:26258 --advertise-addr=$(hostname):26258
  3. Optionally and lazily remove :26257 from the join flags

Contrary to what I just said at lunch, I don't think we need a backwards-compatibility mode that accepts sql connections on two ports at a time.

@knz
Copy link
Contributor Author

knz commented Aug 6, 2019

I think we'd want to keep our one IANA-registered port the public-facing sql interface; we're going to have a lot of docs specifying :26257. So the backwards-compatibility issues are not with the sql port, but the grpc port (I'm not sure what port we'd use for grpc; we're very unlikely to be able to get another one from IANA. I'll say 26258 for now).

Good idea. Done. Since the default is to use the same port number, there is no need to provide a default for the separate port - the user will specify the value every time.

This has a reasonable transition path: [...]

Agreed. Updated the commit message accordingly.

@knz knz force-pushed the 20190803-conn branch 4 times, most recently from c94e358 to 40dd076 Compare August 7, 2019 23:35
@knz knz requested a review from tbg August 7, 2019 23:36
@knz
Copy link
Contributor Author

knz commented Aug 7, 2019

@mberhault please review the two commits, especially the shuffling around of the network code and the additional comments.

@tbg @bdarnell I have made my best to explain what's going on in the commit message, I hope this makes the review slightly more approachable.

@tbg
Copy link
Member

tbg commented Aug 8, 2019 via email

@knz
Copy link
Contributor Author

knz commented Aug 8, 2019

@tbg we haven't chat about this. I'll extract the first 2 commits in a separate PRs so that it's more approachable by Marc. Then for the main/last commit here I'm thinking about approaching MattJ (unless ben makes time), what do you think?

knz added 2 commits August 8, 2019 17:11
the `netutil.Server` is really a multi-purpose thing. This
patch extends the comment to explain what it does in more details.

Release note: None
This patch performs the following cleanups and refactors:

- adds explanatory comments;
- factors the call to net.Listen and starting the server
  for the admin UI into a new method `(*Server).startServeUI()`.
- factors the call to net.Listen and starting the RPC server
  into a new method `(*Server).startServeRPC()`.
- factors the start of the SQL server into a new method
  `(*Server).startServeSQL()`.
- extracts the TCP keepalive configuration function
  into a top-level function (was a closure).

Release note: None
Copy link
Contributor

@bdarnell bdarnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r1, 8 of 8 files at r4, 33 of 67 files at r7.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell, @knz, @mberhault, and @tbg)


pkg/cli/flags.go, line 652 at r7 (raw file):

	serverCfg.SplitListenSQL = cmd.Flags().Lookup(cliflags.ListenSQLAddr.Name).Changed

	// Fill in the defaults for --sql-advertise-addr.

There's no longer a --sql-advertise-addr flag, right?

Unfortunately we might need one. I think it's going to be common for the sql and rpc listeners to be on separate network interfaces (that's what MSO would do I think). It might work OK in practice because the RPC listener would be on a specific interface and the SQL listener could be on all interfaces, but someone might need the flexibility (maybe we wait to introduce the flag until we have an actual indication of demand?)

I thought after our last conversation we were going to keep listening for SQL on the RPC port and that would avoid the need for a separate SQLAdvertiseAddr. Why did you come back to advertising this address?

@knz
Copy link
Contributor Author

knz commented Aug 8, 2019

There's no longer a --sql-advertise-addr flag, right?

There is! --advertise-sql-addr it's defined above.
The default logic you're looking at is used when that flag is not specified, but it's available.

Copy link
Contributor

@bdarnell bdarnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 34 of 67 files at r7.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz, @mberhault, and @tbg)


pkg/cli/flags.go, line 652 at r7 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

There's no longer a --sql-advertise-addr flag, right?

Unfortunately we might need one. I think it's going to be common for the sql and rpc listeners to be on separate network interfaces (that's what MSO would do I think). It might work OK in practice because the RPC listener would be on a specific interface and the SQL listener could be on all interfaces, but someone might need the flexibility (maybe we wait to introduce the flag until we have an actual indication of demand?)

I thought after our last conversation we were going to keep listening for SQL on the RPC port and that would avoid the need for a separate SQLAdvertiseAddr. Why did you come back to advertising this address?

Ack. So just s/sql-advertise/advertise-sql/ here.

This patch introduces the ability to split off the SQL server into a
separate port, using the new command-line flag `--sql-addr`.

The remainder of this commit message details:
- the motivation for the change
- some usage documentation for users of the new feature
- how to introduce this feature in existing clusters
- reading recommendations to reviewers of this change
- release note

**Motivation**

This is a long- and oft-requested feature, aimed at facilitating
deployments in professional networks that use firewalls to fence off
"internal" (server-side) traffic from external (client) traffic.

**Usage**

How it works (simplified):

- the flag `--sql-addr` indicates on which host/port to listen to for
  SQL connections.
- it is possible to specify `--sql-addr` to be equal to
  `--listen-addr`, in which case both will share a single TCP
  connection (internally: using `cmux`).
- in fact, the default for `--sql-addr` is to be equal to
  `--listen-addr`, for compatibility with previous versions
  of CockroachDB.
- when `--sql-addr` and `--listen-addr` are different, then
  the server does not accept SQL connections any more on the
  `--listen-addr` address.
- the flag `--advertise-sql-addr` complements `--sql-addr` in
  the same way as `--advertise-addr` complements `--listen-addr`.

In addition, the output of `cockroach node status` (and the contents
of `crdb_internal.gossip_nodes`) is extended to display the SQL
address.

Note (intermediate): the new flags enables both using separate ports
on the same host address (e.g. listening on 127.0.0.1 with separate
ports for SQL and RPC), and also using the same port on separate
host addresses (e.g. listening on 127.0.0.1:26257 for SQL, and
192.168.2.123:26257 for RPC). Both use cases are legitimate
and have seen demand in the wild.

Note (advanced): the computation of defaults is performed separately
for the host and port part of the flag. The logic is the same as that
used for `--http-addr`, using default port 26257. For example,
`--sql-addr=localhost` (without port number) is equivalent to
`--sql-addr=localhost:26257`.

Note (advanced): the computation of defaults for
`--advertise-sql-addr` is a bit non-trivial as it pulls its address
and port part separately from `--sql-addr`, `--listen-addr` and
`--advertise-addr`. Effort was made to ensure sane defaults when some
flags but not all are omitted. This is best explained by examples, see
the unit tests in flags_test.go for details.

**Upgrading a cluster to use separate ports**

If a cluster is already online and the need arises to split the ports,
the flag can be introduced as follows:

- when keeping port 26257 for SQL:

  1. restart all nodes in a rolling fashion, updating the `--join`
     flag on each node to add the new RPC address.

  2. restart all nodes in a rolling fashion, adding
     `--sql-addr=:26257` and setting `--listen-addr=xxx` to the new
     RPC address. The previous address with port 26257 can also be
     removed from `--join` in the same step.

- when keeping port 26257 for RPC, introducing a new SQL addr/port:

  1. if load balancers are in use, extend the load balancer
     config to also attempt connections on the new SQL address.
	 If load balancers are not in use, temporarily add one
	 that accepts clients on the old addr/port and redirects the
	 connection on both the new addr/port and the old.

  2. restart all nodes in a rolling fashion, updating the `--sql-addr`
     flag.

**Review notes**

This change was constructed as follows:

1. introducing new fields in `base.Config`
2. implementing the address validation logic
   in `addr_validation.go` and corresponding unit tests.
3. picking up the new fields to listen separately
   in `(s *Server) startListenRPCAndSQL()`
4. adding the command line flag parsing and default
   logic in `flags.go` and corresponding unit tests.
5. manually testing that indeed the server can
   listen separately on separate ports.
6. to ensure that most unit tests also exercise
   the split ports, make TestServer/TestCluster
   set the new flags `SplitListenSQL`
7. extend the TestServer interface to make the SQL
   address available alongside the RPC address.
8. update all the tests  using a SQL connection
   to use the SQL address
9. update the CLI test logic from `cli_test.go`
   to configure the `--host` flag to the RPC
   or SQL address depending on the command
   being invoked.

At this point all tests except for `TestZip` would pass
successfully. The remainder of the work is for the benefit of
`cockroach zip`, which needs to discover the reachable address of
every node in a cluster from the address of just 1 of them, doing so
by inspecting all the node descriptors. `cockroach zip` is also
peculiar in that it uses both the RPC and SQL interfaces.

1. equip the node descriptor with a field for the SQL address
   and ensure it gets populated.
2. make the zip logic fetch the SQL address of the primary
   node by a Node() status request over RPC.
3. for the loop over all nodes, use the SQL address
   in each node's descriptor for SQL queries separately
   from the RPC logic.

Release note (cli change): CockroachDB now recognizes a flag
`--sql-addr` which makes it possible to accept connections by clients
on a separate TCP address and/or port number from the one used
for intra-cluster (node-node) connections. This is aimed to enable
firewalling client traffic from server traffic.
Copy link
Contributor Author

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell, @mberhault, and @tbg)


pkg/cli/flags.go, line 652 at r7 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Ack. So just s/sql-advertise/advertise-sql/ here.

Done.

@knz
Copy link
Contributor Author

knz commented Aug 8, 2019

TFYR!

bors r+

craig bot pushed a commit that referenced this pull request Aug 8, 2019
39305: server,config,cli: offer a separate port for SQL clients r=knz a=knz

(First two commits from #39452)
Fixes #30828.
Fixes #5816.

his patch introduces the ability to split off the SQL server into a
separate port, using the new command-line flag `--sql-addr`.

The remainder of this commit message details:
- the motivation for the change
- some usage documentation for users of the new feature
- how to introduce this feature in existing clusters
- reading recommendations to reviewers of this change
- release note

**Motivation**

This is a long- and oft-requested feature, aimed at facilitating
deployments in professional networks that use firewalls to fence off
"internal" (server-side) traffic from external (client) traffic.

**Usage**

How it works (simplified):

- the flag `--sql-addr` indicates on which host/port to listen to for
  SQL connections.
- it is possible to specify `--sql-addr` to be equal to
  `--listen-addr`, in which case both will share a single TCP
  connection (internally: using `cmux`).
- in fact, the default for `--sql-addr` is to be equal to
  `--listen-addr`, for compatibility with previous versions
  of CockroachDB.
- when `--sql-addr` and `--listen-addr` are different, then
  the server does not accept SQL connections any more on the
  `--listen-addr` address.
- the flag `--advertise-sql-addr` complements `--sql-addr` in
  the same way as `--advertise-addr` complements `--listen-addr`.

In addition, the output of `cockroach node status` (and the contents
of `crdb_internal.gossip_nodes`) is extended to display the SQL
address.

Note (intermediate): the new flags enables both using separate ports
on the same host address (e.g. listening on 127.0.0.1 with separate
ports for SQL and RPC), and also using the same port on separate
host addresses (e.g. listening on 127.0.0.1:26257 for SQL, and
192.168.2.123:26257 for RPC). Both use cases are legitimate
and have seen demand in the wild.

Note (advanced): the computation of defaults is performed separately
for the host and port part of the flag. The logic is the same as that
used for `--http-addr`, using default port 26257. For example,
`--sql-addr=localhost` (without port number) is equivalent to
`--sql-addr=localhost:26257`.

Note (advanced): the computation of defaults for
`--advertise-sql-addr` is a bit non-trivial as it pulls its address
and port part separately from `--sql-addr`, `--listen-addr` and
`--advertise-addr`. Effort was made to ensure sane defaults when some
flags but not all are omitted. This is best explained by examples, see
the unit tests in flags_test.go for details.

**Upgrading a cluster to use separate ports**

If a cluster is already online and the need arises to split the ports,
the flag can be introduced as follows:

- when keeping port 26257 for SQL:

  1. restart all nodes in a rolling fashion, updating the `--join`
     flag on each node to add the new RPC address.

  2. restart all nodes in a rolling fashion, adding
     `--sql-addr=:26257` and setting `--listen-addr=xxx` to the new
     RPC address. The previous address with port 26257 can also be
     removed from `--join` in the same step.

- when keeping port 26257 for RPC, introducing a new SQL addr/port:

  1. if load balancers are in use, extend the load balancer
     config to also attempt connections on the new SQL address.
	 If load balancers are not in use, temporarily add one
	 that accepts clients on the old addr/port and redirects the
	 connection on both the new addr/port and the old.

  2. restart all nodes in a rolling fashion, updating the `--sql-addr`
     flag.

**Review notes**

This change was constructed as follows:

1. introducing new fields in `base.Config`
2. implementing the address validation logic
   in `addr_validation.go` and corresponding unit tests.
3. picking up the new fields to listen separately
   in `(s *Server) startListenRPCAndSQL()`
4. adding the command line flag parsing and default
   logic in `flags.go` and corresponding unit tests.
5. manually testing that indeed the server can
   listen separately on separate ports.
6. to ensure that most unit tests also exercise
   the split ports, make TestServer/TestCluster
   set the new flags `SplitListenSQL`
7. extend the TestServer interface to make the SQL
   address available alongside the RPC address.
8. update all the tests  using a SQL connection
   to use the SQL address
9. update the CLI test logic from `cli_test.go`
   to configure the `--host` flag to the RPC
   or SQL address depending on the command
   being invoked.

At this point all tests except for `TestZip` would pass
successfully. The remainder of the work is for the benefit of
`cockroach zip`, which needs to discover the reachable address of
every node in a cluster from the address of just 1 of them, doing so
by inspecting all the node descriptors. `cockroach zip` is also
peculiar in that it uses both the RPC and SQL interfaces.

1. equip the node descriptor with a field for the SQL address
   and ensure it gets populated.
2. make the zip logic fetch the SQL address of the primary
   node by a Node() status request over RPC.
3. for the loop over all nodes, use the SQL address
   in each node's descriptor for SQL queries separately
   from the RPC logic.

Release note (cli change): CockroachDB now recognizes a flag
`--sql-addr` which makes it possible to accept connections by clients
on a separate TCP address and/or port number from the one used
for intra-cluster (node-node) connections. This is aimed to enable
firewalling client traffic from server traffic.

Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
@craig
Copy link
Contributor

craig bot commented Aug 8, 2019

Build succeeded

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

server: Three ports to rule them all cli+server: binding to multiple specific interfaces
5 participants