Listen on the first free port from 9009
Partially reverts the backport of #21818 in 0.6.1. Fixes #24722.

Revert client_socket_reuse
amitmurthy committed Nov 27, 2017
1 parent 3522df1 commit 6736f45
Showing 5 changed files with 14 additions and 9 deletions.
3 changes: 2 additions & 1 deletion NEWS.md
@@ -259,7 +259,8 @@ This section lists changes that do not have deprecation warnings.
rather than from environment variables ([#19636]).
* Workers now listen on an ephemeral port assigned by the OS. Previously workers would
- listen on the first free port available from 9009 ([#21818]).
+ listen on the first free port available from 9009 ([#21818]). Version 0.6.1 only.
+ Reverted in 0.6.2
Library improvements
2 changes: 1 addition & 1 deletion base/distributed/cluster.jl
@@ -153,7 +153,7 @@ function start_worker(out::IO, cookie::AbstractString)
init_worker(cookie)
interface = IPv4(LPROC.bind_addr)
if LPROC.bind_port == 0
- (port, sock) = listenany(interface, UInt16(0))
+ (port, sock) = listenany(interface, UInt16(9009))
LPROC.bind_port = port
else
sock = listen(interface, LPROC.bind_port)
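The difference between the two port hints in this hunk can be seen in isolation. The following is a minimal sketch, not part of the commit: it assumes a recent Julia where `listenany` lives in the `Sockets` standard library (in 0.6 it was available from `Base`), and the names `port_a`, `port_b`, and `port_c` are illustrative only.

```julia
using Sockets

loopback = IPv4("127.0.0.1")

# Hint 9009: scan upward from 9009 and take the first port that can be bound.
port_a, server_a = listenany(loopback, UInt16(9009))

# While server_a is still listening, a second call has to skip its port.
port_b, server_b = listenany(loopback, UInt16(9009))

# Hint 0: let the OS pick a port from its ephemeral range.
port_c, server_c = listenany(loopback, UInt16(0))

println("hint 9009: ", port_a, " and ", port_b, "; hint 0: ", port_c)

foreach(close, (server_a, server_b, server_c))
```

With the 9009 hint the chosen ports are predictable enough to write a firewall rule for, which is what the documentation change further down relies on.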
6 changes: 5 additions & 1 deletion base/distributed/managers.jl
@@ -489,7 +489,11 @@ function bind_client_port(s)
end

function connect_to_worker(host::AbstractString, port::Integer)
- s = socket_reuse_port()
+ # Revert support for now. Client socket port number reuse
+ # does not play well in a scenario where worker processes are repeatedly
+ # created and torn down, i.e., when the new workers end up reusing
+ # a previous listen port.
+ s = TCPSocket()
connect(s, host, UInt16(port))

# Avoid calling getaddrinfo if possible - involves a DNS lookup
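For illustration, the connection pattern this hunk returns to (a fresh, unbound `TCPSocket` that the OS assigns a local port to at connect time) might look like the sketch below. It is a standalone approximation rather than the code in `managers.jl`: it assumes a recent Julia with the `Sockets` standard library, and the echo server standing in for a worker is hypothetical.

```julia
using Sockets

loopback = IPv4("127.0.0.1")

# Stand-in for a worker: listen on the first free port from 9009.
port, server = listenany(loopback, UInt16(9009))

# Accept one connection in the background and echo a single line back.
acceptor = @async begin
    conn = accept(server)
    write(conn, readline(conn), "\n")
    close(conn)
end

# Stand-in for the connecting side: no explicit bind and no port reuse,
# so the OS chooses the local port when connect is called.
s = TCPSocket()
connect(s, loopback, port)
write(s, "ping\n")
reply = readline(s)

local_ip, local_port = getsockname(s)
println("reply=", reply, ", local client port=", local_port)

close(s)
wait(acceptor)
close(server)
```

Because the client side is never bound explicitly, its local port is independent of any listen port a previous worker may have used.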
9 changes: 4 additions & 5 deletions doc/src/manual/parallel-computing.md
@@ -1231,8 +1231,8 @@ as local laptops, departmental clusters, or even the cloud. This section covers
requirements for the inbuilt `LocalManager` and `SSHManager`:

* The master process does not listen on any port. It only connects out to the workers.
- * Each worker binds to only one of the local interfaces and listens on an ephemeral port number
- assigned by the OS.
+ * Each worker binds to only one of the local interfaces and listens on the first free port starting
+ from `9009`.
* `LocalManager`, used by `addprocs(N)`, by default binds only to the loopback interface. This means
that workers started later on remote hosts (or by anyone with malicious intentions) are unable
to connect to the cluster. An `addprocs(4)` followed by an `addprocs(["remote_host"])` will fail.
@@ -1250,9 +1250,8 @@ requirements for the inbuilt `LocalManager` and `SSHManager`:
authenticated via public key infrastructure (PKI). Authentication credentials can be supplied
via `sshflags`, for example ```sshflags=`-i <keyfile>` ```.

- In an all-to-all topology (the default), all workers connect to each other via plain TCP sockets.
- The security policy on the cluster nodes must thus ensure free connectivity between workers for
- the ephemeral port range (varies by OS).
+ Note that worker-worker connections are still plain TCP and the local security policy on the remote
+ cluster must allow for free connections between worker nodes, at least for ports 9009 and above.

Securing and encrypting all worker-worker traffic (via SSH) or encrypting individual messages
can be done via a custom ClusterManager.
3 changes: 2 additions & 1 deletion test/distributed_exec.jl
@@ -53,7 +53,8 @@ if is_unix()
s = TCPSocket(delay = false)
is_linux() && Base.Distributed.bind_client_port(s)
if ccall(:jl_tcp_reuseport, Int32, (Ptr{Void},), s.handle) == 0
- reuseport_tests()
+ # Client reuse port has been disabled in 0.6.2
+ # reuseport_tests()
else
info("SO_REUSEPORT is unsupported, skipping reuseport tests.")
end