Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sockets convert out-of-range port numbers % 2**16 #68357

Open
KurtRose mannequin opened this issue May 12, 2015 · 7 comments
Open

sockets convert out-of-range port numbers % 2**16 #68357

KurtRose mannequin opened this issue May 12, 2015 · 7 comments
Labels
extension-modules C modules in the Modules dir type-feature A feature request or enhancement

Comments

@KurtRose
Copy link
Mannequin

KurtRose mannequin commented May 12, 2015

BPO 24169
Nosy @bitdancer

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2015-05-12.18:50:58.457>
labels = ['type-feature']
title = 'sockets convert out-of-range port numbers % 2**16'
updated_at = <Date 2015-05-13.18:01:42.646>
user = 'https://bugs.python.org/KurtRose'

bugs.python.org fields:

activity = <Date 2015-05-13.18:01:42.646>
actor = 'Kurt.Rose'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = []
creation = <Date 2015-05-12.18:50:58.457>
creator = 'Kurt.Rose'
dependencies = []
files = []
hgrepos = []
issue_num = 24169
keywords = []
message_count = 6.0
messages = ['242987', '242992', '242999', '243002', '243027', '243109']
nosy_count = 2.0
nosy_names = ['r.david.murray', 'Kurt.Rose']
pr_nums = []
priority = 'normal'
resolution = None
stage = 'needs patch'
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue24169'
versions = ['Python 3.5']

@KurtRose
Copy link
Mannequin Author

KurtRose mannequin commented May 12, 2015

This appears to affect all versions of Python. In a behavior inherited from C, TCP ports that are > 2 bytes are silently truncated.

Here is a simple reproduction:

>>> socket.create_connection( ('google.com', 2**16 + 80) )
<socket.socket object, fd=408, family=2, type=1, proto=0>

Needs more investigation, but one likely place to make the fix is here:
https://hg.python.org/cpython/file/9d2c4d887c19/Modules/socketmodule.c#l1535

        if (!PyArg_ParseTuple(args, "O&i:getsockaddrarg",
                              idna_converter, &host, &port))

Instead of parsing port with i, use H. This is a deep change to the behavior, but I think it is very unlikely that users intentionally mean to pass a TCP port > 2**16. More likely, this is silently swallowing errors.

There no indication that the passed port parameter is not being used for the actual TCP communication (which is physically impossible because TCP only has a 2 byte port field).

In fact, the socket will continue to "lie" to the user and obfuscate the actual port being used if getsockname() is called:

>>> socket.create_connection( ('google.com', 2**16 + 80) ).getsockname()
('10.225.89.86', 54899)

@KurtRose KurtRose mannequin added the type-bug An unexpected behavior, bug, or error label May 12, 2015
@KurtRose
Copy link
Mannequin Author

KurtRose mannequin commented May 12, 2015

I was incorrect -- the result of getsockname() appears to be some garbage port:

>>> socket.create_connection( ('google.com', 2**16 + 80) ).getsockname()
('10.225.89.86', 56446)
>>> socket.create_connection( ('google.com', 2**16 + 80) ).getsockname()
('10.225.89.86', 56447)
>>> socket.create_connection( ('google.com', 2**16 + 80) ).getsockname()
('10.225.89.86', 56448)
>>> socket.create_connection( ('google.com', 2**16 + 80) ).getsockname()
('10.225.89.86', 56449)
>>> socket.create_connection( ('google.com', 2**16 + 80) ).getsockname()
('10.225.89.86', 56450)
>>> socket.create_connection( ('google.com', 2**16 + 80) ).getsockname()
('10.225.89.86', 56451)
>>> socket.create_connection( ('google.com', 2**16 + 80) ).getsockname()
('10.225.89.86', 56452)

Java's stdlib gives a proper error message:

>>> java.net.Socket("google.com", 2**16 + 80)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
        at java.net.InetSocketAddress.<init>(Unknown Source)
        at java.net.Socket.<init>(Unknown Source)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)

    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Sou

rce)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
at org.python.core.PyReflectedConstructor.constructProxy(PyReflectedCons
tructor.java:210)

java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: port out
of range:65616

The .NET runtime also rejects invalid ports:

>>> System.Net.IPEndPoint(0x7F000001, 2**16 + 80)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Specified argument was out of the range of valid values.
Parameter name: port

IronPython by extension rejects the invalid port:

>>> socket.create_connection( ('google.com', 2**16 + 80) )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Specified argument was out of the range of valid values.
Parameter name: port

However, Jython recreates the behavior of CPython:

>>> socket.create_connection( ('google.com', 2**16 + 80) ).getsockname()
(u'10.225.89.86', 63071)

@KurtRose
Copy link
Mannequin Author

KurtRose mannequin commented May 12, 2015

Sorry, dumb mistake on my part. I should have been calling getpeername(), not getsockname()

In that case the result is 80:
>>> socket.create_connection( ('google.com', 2**16 + 80) ).getpeername()
('74.125.239.41', 80)

The "random" ports were the client-side ephemeral ports.

@bitdancer
Copy link
Member

You pegged it when you said the behavior is inherited from C. Technically this isn't a bug in Python, since the socket module is a set of (mostly) thin wrappers around the C.

Enhancing CPython to do the check is not a bad suggestion, especially since it seems that other languages and implementations do so. We won't fix this in a maintenance release, though, since it would pointlessly break working code (even if that code is technically buggy).

@bitdancer bitdancer added type-feature A feature request or enhancement and removed type-bug An unexpected behavior, bug, or error labels May 12, 2015
@KurtRose
Copy link
Mannequin Author

KurtRose mannequin commented May 12, 2015

Totally agree this needs to be managed carefully. My goal here was just to raise awareness and see if there is consensus that the behavior should be changed.

I came across this because an upstream process had a bug which led to impossible TCP ports being specified. So, for my use case it would have been better if create_connection() had rejected the invalid data.

If we are talking about enhancements to the socket module, it would also be nice if errors included the address :-)

@KurtRose
Copy link
Mannequin Author

KurtRose mannequin commented May 13, 2015

I think this may in fact be a bug. There are other places in the socket module where port is checked, create_connection() just seems to have been missed.

create_connection() and socket.connect() have different behavior:

>>> socket.create_connection( ('google.com', 0x10000 + 80) )
<socket._socketobject object at 0x028B3F48>
>>> socket.socket().connect( ('google.com', 0x10000 + 80) )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
OverflowError: getsockaddrarg: port must be 0-65535.

https://hg.python.org/cpython/file/712f246da49b/Modules/socketmodule.c#l1395

        if (port < 0 || port > 0xffff) {
            PyErr_SetString(
                PyExc_OverflowError,
                "getsockaddrarg: port must be 0-65535.");
            return 0;
        }

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@iritkatriel iritkatriel added the extension-modules C modules in the Modules dir label Dec 1, 2023
@CendioHalim
Copy link

I ran into this issue when converting some code from asyncore to asyncio. When entering an invalid port to asyncore.dispatcher.bind(), the following exception is raised:

Traceback (most recent call last):
  File "~/ascore.py", line 51, in <module>
    server.bind_and_listen()
  File "~/ascore.py", line 47, in bind_and_listen
    self.bind(self.server_address)
  File "/usr/lib64/python3.11/asyncore.py", line 331, in bind
    return self.socket.bind(addr)
           ^^^^^^^^^^^^^^^^^^^^^^
OverflowError: bind(): port must be 0-65535.

When calling asyncio.get_event_loop().create_server(), or socket.create_connection(), the port number seems to get truncated by the call to getaddrinfo(), which means that the address supplied to sock.connect(sa) won't crash.

cpython/Lib/socket.py

Lines 809 to 839 in 4acf825

def create_connection(address, timeout=_GLOBAL_DEFAULT_TIMEOUT,
source_address=None, *, all_errors=False):
"""Connect to *address* and return the socket object.
Convenience function. Connect to *address* (a 2-tuple ``(host,
port)``) and return the socket object. Passing the optional
*timeout* parameter will set the timeout on the socket instance
before attempting to connect. If no *timeout* is supplied, the
global default timeout setting returned by :func:`getdefaulttimeout`
is used. If *source_address* is set it must be a tuple of (host, port)
for the socket to bind as a source address before making the connection.
A host of '' or port 0 tells the OS to use the default. When a connection
cannot be created, raises the last error if *all_errors* is False,
and an ExceptionGroup of all errors if *all_errors* is True.
"""
host, port = address
exceptions = []
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
af, socktype, proto, canonname, sa = res
sock = None
try:
sock = socket(af, socktype, proto)
if timeout is not _GLOBAL_DEFAULT_TIMEOUT:
sock.settimeout(timeout)
if source_address:
sock.bind(source_address)
sock.connect(sa)
# Break explicitly a reference cycle
exceptions.clear()
return sock

If socket.create_connection() is called with the kwarg source_adddress, the invalid port number will be passed to socket.bind() instead of the truncated port from getaddrinfo(), and a proper exception is raised:

invalid_address = ('127.0.0.1', 65535 + 1)
socket.create_connection(invalid_address, source_address=invalid_address)
Traceback (most recent call last):
  File "~/ascore.py", line 34, in <module>
    socket.create_connection(invalid_address, source_address=invalid_address)
  File "/usr/lib64/python3.11/socket.py", line 835, in create_connection
    sock.bind(source_address)
OverflowError: bind(): port must be 0-65535.

In the case of asyncio.get_event_loop().create_server(), it seems like the function eventually calls socket.getaddrinfo().

Calling socket.bind() or socket.connect() directly with an invalid port results in an OverflowError being raised, and it looks like there was some effort in socket.create_connection() to handle these exceptions.

I tested the following scenarios:

import asyncio
import asyncore
import socket

async def asyncore_test(address):
  async_server =  asyncore.dispatcher()
  async_server.create_socket(socket.AF_INET, socket.SOCK_STREAM)

  async_server.connect(address)   # OverflowError: connect_ex(): port must be 0-65535.
  async_server.bind(address)      # OverflowError: bind(): port must be 0-65535.

async def asyncio_test(address):
  # OverflowError: connect(): port must be 0-65535
  await asyncio.get_event_loop().create_connection(asyncio.Protocol, *address)
  await asyncio.open_connection(*address)
  await asyncio.get_event_loop().create_server(asyncio.Protocol, *address)

  # port will overflow if host is None and port is specified
  await asyncio.get_event_loop().create_server(asyncio.Protocol, port=address[1])

def socket_test(address):
  s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

  # OverflowError: connect(): port must be 0-65535.
  s.bind(address)
  s.connect(address)

  # port will overflow
  socket.create_connection(address)   

async def main():
  invalid_address = ('127.0.0.1', 65536 + 10000)
  await asyncore_test(invalid_address)
  await asyncio_test(invalid_address)
  await socket_test(invalid_address)

if __name__ == "__main__":
  asyncio.run(main())

This issue seems to already have been discussed (#74895, #2435, #74896, #2436), but I still feel like something should be done.
According to RFC 6335, a valid service name MUST contain at least 1 letter [A-Za-z], which means input validation can be done before calling the system's getaddrinfo(), which on my system just truncates the port to an unsigned short (original bug report). In my example, the inconsistency between using the kwarg port vs. not using it caused a bug. I feel that the behaviour should be the same across all function calls.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
extension-modules C modules in the Modules dir type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

3 participants