
TLS ABC PEP draft #1

Open
wants to merge 73 commits into master

Conversation

Lukasa
Owner

@Lukasa Lukasa commented Oct 17, 2016

This pull request tracks my work on the PEP for the TLS ABCs. This PEP is focused on defining a limited interface that can be implemented by any TLS backend for Python, allowing Python network implementations that don't have specific low-level requirements to obtain support for multiple TLS backends.

I've put this up as a GitHub pull request to allow anyone who is interested to provide feedback before I post this PEP to python-dev. My intended flow for this PEP is as follows:

  1. Review by individuals (on this PR).
  2. Submit to the Python Security SIG for review.
  3. Submit to Python-Dev for review.
  4. Merge
  5. Implement (!)

Anyone who likes may leave inline comments on this diff. Shout if you'd like to be able to use GitHub's review feature and I can add you as a collab on this repository to make it a bit easier.

@Lukasa Lukasa changed the title SSL ABC PEP draft TLS ABC PEP draft Oct 17, 2016
pep-0xxx.rst Outdated
class in the ``ssl`` module.

While it is technically possible to define (2) in terms of (3), for the sake of
simplicity it is easier to define these as two separate ABCs. Impelementations

typo “Impelementations”

pep-0xxx.rst Outdated
class Context(metaclass=ABCMeta):
@abstractmethod
def register_certificates(self,
certificates: str,
@hynek hynek Oct 20, 2016

When designing a new API, we should take into account, that people might want to use TLS libs with HSMs and avoid loading keys into Python’s memory.

pep-0xxx.rst Outdated
TLSBufferObject = Union[TLSWrappedSocket, TLSWrappedBuffer]
ServerNameCallback = Callable[[TLSBufferObject, Optional[str], Context], Any]

class Context(metaclass=ABCMeta):

Is everyone 100% sure we should have only one Context class? Or would it be better to have a ClientContext and ServerContext and make them same for the time being? I don’t really know any other TLS API than OpenSSL but it seems thinkable that a lib might exist that makes a distinction? Also makes docstrings less weird and more concrete.

Owner Author

I'm definitely happy to have a client context and a server context. It's not clear to me that their APIs would differ at all though. If that's the case, I'm not sure how much value there is in doing it. But I'd like to see additional opinions here.

Owner Author

Hrm, actually, there's another good reason to want to do this: some TLS implementations may only want to offer one side of that connection. Like, SecureTransport can technically be a server, but is missing a whole bunch of hooks and functions to be able to do it effectively. So a SecureTransport-based implementation may just want to provide ClientContext, to signal this limitation.

pep-0xxx.rst Outdated
both of ALPN or NPN.

``protocols`` should be a list of ASCII strings for ALPN tokens,
such as ``['h2', 'http/1.1']``, ordered by preference. The

I don’t know a lot about ALPN but strings are terrible interfaces.

Owner Author

ALPN is stringy. There's really no other way around it.


That doesn’t mean our API has to be stringy though, does it? ISTM it’s about a few well-known strings?

Owner Author

The ALPN registry can be regularly updated. I'm really reluctant to put us in a position where we have a well-typed API that requires CPython to keep up with an ICANN registry of tokens.


How about having one generic type that eats a string? Like:

protocols = [H2, HTTP_1_1, Custom("whatever")]

Having proper objects for well-known protocols seems like a good way to avoid typo errors and normalization issues (like 'h2 ' or 'H2').

Owner Author

I can get behind that I suppose.
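One way the generic-type idea could look, purely as a sketch (all names, including `Custom`, are the hypothetical ones from the comment above; the normalization policy is an assumption):

```python
class Protocol:
    """An ALPN token. Well-known tokens are module constants; Custom
    covers anything else in (or beyond) the IANA registry, so the
    stdlib never has to chase registry updates."""

    def __init__(self, token: str) -> None:
        # Strip stray whitespace once, at construction, so backends
        # never see un-normalized tokens like 'h2 '.
        self.token = token.strip()

    def __eq__(self, other: object) -> bool:
        return isinstance(other, Protocol) and self.token == other.token

    def __hash__(self) -> int:
        return hash(self.token)

    def __bytes__(self) -> bytes:
        # The raw ASCII bytes a backend puts on the wire.
        return self.token.encode('ascii')


H2 = Protocol('h2')
HTTP_1_1 = Protocol('http/1.1')


class Custom(Protocol):
    pass


# Usage as proposed in the comment above:
protocols = [H2, HTTP_1_1, Custom('whatever')]
```

This keeps the API stringy underneath (as ALPN requires) while giving callers named objects for the common cases.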

pep-0xxx.rst Outdated
pass

@abstractmethod
def set_ciphers(self, ciphers: List[Ciphers]) -> None:

JFTR, we have a fairly generic solution to this in Twisted you may or may not have a look at.

Owner Author

Yeah, so my suspicion is that there are two ways to get generic support here. Probably the easiest, honestly, is to have the list of Ciphers be basically a fancy enumerated type that uses the full, formal names for cipher suites like the ones provided here. Those names should map to their TLS integer, and can then be de-mapped as needed by the individual implementations.

How does that sound?


I have a database of all known cipher suites, their hex ids and aliases for different TLS libs, https://raw.githubusercontent.com/tiran/tlsdb/master/tlsdb.json


I guess we should use IANA/RFC as the canonical one and offer easy ways to translate including a way to extend so people aren’t ever bound to what’s inside?

Owner Author

Yeah, that would be my guess. Not sure what "easy way to extend" means though: some kind of wacky custom enum?
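A sketch of the "fancy enumerated type" approach: IANA names mapped to their registered two-byte code points, with each backend keeping its own translation table. The three suites and their values are real registered code points; the selection and the `OPENSSL_NAMES` table are illustrative only:

```python
from enum import IntEnum


class CipherSuite(IntEnum):
    """A handful of IANA-registered suites keyed by their TLS code
    points (a real implementation would generate the full list)."""
    TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 = 0xC02B
    TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 = 0xC02F
    TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 = 0xC030


# Each backend de-maps to its own naming scheme; OpenSSL's names as an
# example. An SChannel backend would instead have to decompose each
# suite into its component ALG_IDs, per the discussion below.
OPENSSL_NAMES = {
    CipherSuite.TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256: 'ECDHE-RSA-AES128-GCM-SHA256',
    CipherSuite.TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384: 'ECDHE-RSA-AES256-GCM-SHA384',
}
```

"Easy way to extend" could then mean accepting plain integers alongside enum members, since `IntEnum` members compare equal to their values.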

pep-0xxx.rst Outdated
until more data is available in the input buffer.
"""
pass


Is this all the errors we need? Any reason why there’s no validation errors? I’d like a whole hierarchy of ValidationError > ExpiredError/ServiceIdentityError/…

Owner Author

The main reason I haven't defined a more complete exception hierarchy is because I'm reluctant to constrain what applications can provide. We could certainly provide an abstract validation error though, and then allow implementations to provide more specific subclasses if they are able.

Does that sound sensible?


I like to have one common base class for SSL exceptions. In the future ssl.CertificateError should become a subclass of ssl.SSLError.
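Putting the two comments together, the hierarchy could look like the sketch below: a common base, the want-read/want-write signals already in the draft, and an abstract validation error that backends subclass. All names are hypothetical:

```python
class TLSError(Exception):
    """Common base class for all errors from any TLS backend."""


class WantWriteError(TLSError):
    """Buffered data must be flushed to the transport before the
    operation can make progress."""


class WantReadError(TLSError):
    """More data must arrive from the transport before the operation
    can make progress."""


class ValidationError(TLSError):
    """Abstract base for certificate validation failures; backends
    that can distinguish causes raise more specific subclasses."""


# A backend able to report specifics might then define, e.g.:
class ExpiredCertificateError(ValidationError):
    pass
```

Applications catch `ValidationError` generically; backends that cannot distinguish causes simply raise it (or their own subclass) directly.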

pep-0xxx.rst Outdated
* It's annoying that there's no type signature for fileobj. Do I really have to
define one as part of this PEP? Otherwise, how do I define the types of the
arguments to ``wrap_buffers``?
* Do we need ways to disable specific TLS versions?

yes

Owner Author

Yeah, I thought so.


OpenSSL 1.1.0 has deprecated all version-specific protocols. For the Python stdlib I'd like to move to PROTOCOL_TLS_CLIENT (client-side only method) and PROTOCOL_TLS_SERVER (server-side only method) with auto-negotiation of the highest protocol number.

To set the minimum and maximum TLS version I'd like to introduce a set_version_range(self, min=None, max=None) function. None means the minimum resp. maximum supported version.

Owner Author

Yeah, so I have no intention of exposing OpenSSL's weird handshake version thing. I literally mean a function call that would translate to setting OP_NO_*. Your set_version_range would likely be fine.
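A sketch of the set_version_range idea with the None semantics described above; the `TLSVersion` enum, its values, and the attribute names are all assumptions for illustration:

```python
from enum import Enum


class TLSVersion(Enum):
    # Ordered lowest to highest; the values are illustrative.
    TLSv1 = 1
    TLSv1_1 = 2
    TLSv1_2 = 3


class VersionConfigMixin:
    # What this hypothetical backend supports.
    SUPPORTED_MIN = TLSVersion.TLSv1
    SUPPORTED_MAX = TLSVersion.TLSv1_2

    def set_version_range(self, min=None, max=None):
        # None means "whatever the backend supports" at that end. A
        # concrete OpenSSL backend would translate the resulting range
        # into the matching OP_NO_* option flags rather than store it.
        self._min_version = min if min is not None else self.SUPPORTED_MIN
        self._max_version = max if max is not None else self.SUPPORTED_MAX
```

The range formulation also sidesteps the problem of enumerating versions a given backend has never heard of.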

pep-0xxx.rst Outdated
arguments to ``wrap_buffers``?
* Do we need ways to disable specific TLS versions?
* Do we need ways to control hostname validation?
* Do we need ways to disable certificate validation altogether?

I’m afraid so. It’s impracticable to do otherwise. Even I had to disable it in one case in production because of third parties.

Owner Author

Ok. To do this I have to define some level of optionality on this feature, because some TLS stacks may not be able to have their validation disabled (at least not easily).

pep-0xxx.rst Outdated
define one as part of this PEP? Otherwise, how do I define the types of the
arguments to ``wrap_buffers``?
* Do we need ways to disable specific TLS versions?
* Do we need ways to control hostname validation?

Yes, I’ve encountered the wildest things in practice. One EPP registry for example uses a service name instead of host name.

Owner Author

Ok, so when we say "control hostname validation", what kind of control were you wanting? What kind of interface would that look like?

For example, SecureTransport can only do both SNI and hostname validation, or neither SNI nor hostname validation. So would it be sufficient to have a binary "verify hostname" property or switch, and then a way to extract relevant data from the cert? Or just the cert in DER form?


Yeah I guess. Maybe a callback even?

Owner Author

A callback might be a bit tricky, but I suppose it's the best way to handle it.

Owner Author

So if we had a callback, does it take an x509 certificate of some kind? A list of SAN fields?
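One conceivable shape for such a callback, offered only as a strawman since the question above is open; every name here is hypothetical:

```python
from typing import Callable, Optional

# The callback receives the peer's leaf certificate as DER bytes plus
# the hostname that was requested, and returns True to accept it.
HostnameValidator = Callable[[bytes, Optional[str]], bool]


def accept_any_hostname(der_cert: bytes, hostname: Optional[str]) -> bool:
    # The "disable hostname validation" escape hatch discussed above,
    # expressed as just another validator.
    return True
```

Passing DER keeps the interface implementable on stacks like SecureTransport that cannot do partial validation themselves, at the cost of making callers parse the certificate (e.g. with cryptography.x509) to get at the SAN fields.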

pep-0xxx.rst Outdated
* Do we need ways to disable certificate validation altogether?
* Do we need to support getpeercert? Should we always return DER instead of the
weird semi-structured thing?
* How do we load certs from locations on disk?

This seems like a subproblem of the HSM issue I’ve mentioned before. We need a generic solution to this. I hope someone who actually uses HSMs can help us out here.

Owner Author

Yup, I think the various HSM questions here I shall defer to either @tiran, @alex, or @reaperhulk, all of whom have vastly more experience with HSMs than I do (which is to say, any at all).

"platform-native" experiences.


Problems

I think this section should highlight the fact that the ssl module is not an abstract API for implementing TLS (heh), but rather exposes OpenSSL implementation details as its own.

Proposal
========

This PEP proposes to introduce a few new Abstract Base Classes in Python 3.7 to

And 2.7.x!


Also you probably want to indicate that the ssl module will implement these interfaces :-)

pep-0xxx.rst Outdated
TLSBufferObject = Union[TLSWrappedSocket, TLSWrappedBuffer]
ServerNameCallback = Callable[[TLSBufferObject, Optional[str], Context], Any]

class _BaseContext(metaclass=ABCMeta):

Don't subclass things, it's bad for your health.


This API doesn't contain anything about session tickets/resumption or 0-RTT handshakes for TLS 1.3

Owner Author

Do we want to force that into the API? Or do we want to say that concrete implementations should wrap those details into their ClientContext or ServerContext objects?

pep-0xxx.rst Outdated

class _BaseContext(metaclass=ABCMeta):

@property

Should this really just be a boolean property? I feel like if one was really approaching this API from scratch, a TLS client/server would take a PeerCertificateValidator, which is its own interface, which could be implemented in many different ways.

Owner Author

Do you have a good example of how one would do that with OpenSSL, or any idea of what that interface would be like? I don't have a good handle on what your mental picture of this interface looks like right now.

pep-0xxx.rst Outdated
pass

@abstractmethod
def register_certificates(self,

Register doesn't seem like a great name to me.

pep-0xxx.rst Outdated
def register_certificates(self,
certificates: str,
key=None: Optional[str],
password=None: Optional[Callable[[], Union[AnyStr, bytearray]]]) -> None:

Why are these strings instead of objects?

Owner Author

The certificates? Mostly so I could avoid defining an X509Certificate object that I would then have to define an interface to. I suppose I could bootstrap an interface like that from cryptography.x509. Accepting your bias regarding that interface for a moment (:wink:), does that sound like a better approach?

pep-0xxx.rst Outdated
"""

@abstractmethod
def wrap_buffers(self, incoming: Any, outgoing: Any,

incoming and outgoing can't really be Any, can they?

Owner Author

I don't have a good type definition for them, and IIRC the typing module doesn't yet have good support for interface-based typing.

pep-0xxx.rst Outdated

class ClientContext(metaclass=ABCMeta):
@abstractmethod
def wrap_socket(self, socket: socket.socket,

Why do these APIs implement socket-specific bits at all? It seems like they could be written purely in terms of buffers, and then there could be a single implementation, shared across all TLS implementations, which turns the buffer-API into one that handles sockets.

Owner Author

Yeah, that's unquestionably true. I suppose the main reason was to allow for the possibility that implementations may have their own socket-specific logic that they'd like to use. But I'm not 100% convinced there is value there.

pep-0xxx.rst Outdated

@context.setter
@abstractmethod
def context(self, value: Context) -> None:

This seems like a bad API. It's clearly inherited from OpenSSL's approach to SNI (IIRC), but it should not exist in a platonic ideal TLS API.

Owner Author

The context-swapping API? Yeah, I can get behind removing that. I guess the question becomes: how do you request a change to the TLS settings based on the SNI parameter? Should you be able to do that?

pep-0xxx.rst Outdated

The definitions of the errors are below::

class TLSError(Exception):

This seems like a very imprecise way to specify errors.

Owner Author

Yup, it is. But I was worried about getting too specific here and constraining individual implementations. In general I'd suggest we only want to break out errors for things that applications could meaningfully recover from: otherwise they're all equally fatal.


Related discussion happened here: python-tls/tls#112 (review)

pep-0xxx.rst Outdated
"""

@abstractmethod
def set_ciphers(self, ciphers: List[Ciphers]) -> None:


SChannel's cipher configuration is a bit weird in a way that I don't think will work with an explicit list (see palgSupportedAlgs here).

Owner Author

I think it'll still be fine: the SChannel implementation will just need to transform the list into the appropriate "pointer to array".


The issue is that I'm not sure what the appropriate transform would even be. The ALG_ID doesn't identify a cipher, but rather individual algorithms. You can turn on or off ECDHE or SHA 256 for example, but not specifically TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384.

Owner Author

Oh god I hadn't noticed that.

I don't think we can come up with a general interface to this without requiring that the SChannel backend work out how to make this work. It should be fairly possible to work out the union of algorithms that can be used for most cipher strings, I think.


I have some thoughts written down on a possible way to support some kind of cipher filtering in a similar project: sfackler/rust-native-tls#4 (comment).

The usefulness does seem to be somewhat limited compared to full cipher selection, but it might still be worth it to be able to do things like disable non-ephemeral key exchange, for example.

Owner Author

Yup, so I'd say the alternative here is the same as you outline in your linked issue: construct this enum like SChannel and have OpenSSL and friends do the mapping instead. I think I'm not sure that moving to the lowest-common-denominator here is a good idea, especially given the inertia of what is already present in the system, but I'm prepared to be convinced otherwise.


It looks like Windows's APIs do have the ability to explicitly enumerate cipher suites, but it looks like the configuration is global: https://msdn.microsoft.com/en-us/library/windows/desktop/bb870930(v=vs.85).aspx

attempt to export their system trust stores in some form.

Relying on `certifi`_ is less than ideal, as most system administrators do
not expect to receive security-critical software updates from PyPI.

😟

Owner Author

😟 indeed.

pep-0xxx.rst Outdated
continue. Other return values signal errors. Attempting to control
what error is signaled by the underlying TLS implementation is not
specified in this API, but is up to the concrete implementation to
handle.

I assume the intention here is that the callback does the same sslbuffer.context switcheroo as we see in the ssl module? If so that should be made explicit (it's not obvious that an implementation needs to support changing SSLContext objects midstream!). I do wonder if there's a better option, also (is there any case where the new context differs in any way besides which certificate it presents? could we just like... return the certificate instead?).


Also, if you're splitting the context types then this should be part of ServerContext, not _BaseContext.


...it looks like Go used to only support switching certificates in their equivalent of this callback, but in the next release is adding a new callback that allows arbitrary configuration to be changed. So... that seems like strong evidence that I'm wrong, and arbitrary reconfiguration is actually important.

Owner Author

So, this is a really good question. We definitely want to support at least swapping SSLContext object mid-stream, but it's worth considering whether a more explicit API would be better. I'm genuinely not sure.

pep-0xxx.rst Outdated
@abstractmethod
def wrap_socket(self, socket: socket.socket, server_side=False: bool,
auto_handshake=True: bool,
server_hostname=None: Optional[str]) -> TLSWrappedSocket:

Compared to the stdlib module, this drops the suppress_ragged_eofs and session arguments. The whole suppress_ragged_eofs thing is a bit weird, but I guess it should either be supported or else it should be explicit that this API acts as if it were False. session seems important; I'm guessing it just got missed b/c the stdlib only added it in 3.6?


Also I guess you should decide whether you want ServerContext/ClientContext to be split off, b/c right now you have inconsistent definitions of wrap_socket and wrap_buffers in the same document :-).

Owner Author

The real question is whether we need to doff our cap at the ssl module at all. If it didn't exist, would you care that suppress_ragged_eofs wasn't on this function?

As to session, right now I don't think that users should be in charge of session reuse. I think that error exists in the ssl module API due to historical quirks, and we shouldn't replicate it here.


Well, the suppress_ragged_eofs thing is clearly not a necessary feature. But in stdlib ssl, the default is to ignore unclean shutdowns (suppress_ragged_eofs=True), which is a security problem in some cases. So we should at the least provide an option to act like suppress_ragged_eofs=False. The other option is to only act like suppress_ragged_eofs=False. This means everyone implementing common protocols like HTTPS will need to catch and ignore the ragged EOF exceptions, and complicate porting from stdlib ssl (people are probably depending on suppress_ragged_eofs=True without realizing it). So... either an option or only-False seem reasonable to me, but either way I think the spec needs to explicitly say what it's doing :-).

Regarding session: I thought it needs to be exposed because e.g. servers may need to stick the session ticket in a database? I guess the alternative you're proposing is that each TLS implementation would hard-code some in-memory LRU ticket cache, maybe with some extension to tune the cache size etc.? This is beyond my TLS knowledge depth, just wanted to make sure the issue didn't get missed :-)

Owner Author

So... either an option or only-False seem reasonable to me, but either way I think the spec needs to explicitly say what it's doing :-).

I remain unconvinced of the need to do this up here. I can see wanting to provide an implementers note, but ultimately this seems like a really weird note to give. It's essentially saying: "be aware that the connection may be terminated in an unclean manner, and that may be treated as an error condition". This document really shouldn't become a "this is how TLS works" primer.

We absolutely should consider adding a section ("Porting Notes"?) that covers gotchas that may be encountered when moving from the stdlib ssl module to this one. But I don't think annotating each method that differs from stdlib ssl in any minor way about exactly how it differs is a sensible way of presenting the documentation of the interface.

For session, the big risk here is that in an abstract API persisting session information is a very tricky thing to do. This is because it is clearly not enough to persist an abstract session ID, as that would provide little value across different processes. You need to persist all relevant information about the Session, and what exactly that data is and how it gets turned into a Session that a concrete implementation can restore is extremely esoteric. Essentially, if you're using the generic API, the idea of doing cross-process session caching is something you need to give up on, because you simply cannot meaningfully assume that your OpenSSL-based session can be quietly inserted into a GnuTLS-based process.

This is a problem for concrete APIs, and they can address it directly in their own implementations.

pep-0xxx.rst Outdated
no behaviour. They exist simply to signal certain common behaviours. Backends
should subclass these exceptions in their own packages, but needn't define any
behaviour for them. These exceptions should *never* be thrown directly, they
should always be subclassed.

...Why is throwing them directly a problem? I mean, WantWrite is WantWrite, there's not much more to say...

Owner Author

The short answer is because I'd like people to behave like these classes are abstract, even though they aren't. =P


Such enterprise, wow

(But seriously, in order to be caught these exceptions will have to live in some concrete well-known package and implementations will have to use exceptions that inherit from that well-known package's concrete implementations, which is all totally fine but I don't see much benefit in pretending they're abstract?)

Owner Author

That's fair enough. I'd say the other reason to encourage the throwing of subclasses is to clarify what package is in use when an error occurs, but we can change the prohibition to a strong encouragement to do that instead.

@njsmith

njsmith commented Jan 11, 2017

Am I correct that as currently written, this API doesn't have any way to do clean shutdown, so it can't be used for protocols that are vulnerable to truncation attacks? (In the stdlib IIUC the unwrap method does this, which is pretty confusing and poorly documented but at least it exists.)

@Lukasa
Owner Author

Lukasa commented Jan 12, 2017

@njsmith That's correct. Probably worth adding an optional abstract method to allow unwrap.

pep-0xxx.rst Outdated

class TLSWrappedBuffer(metaclass=ABCMeta):
@abstractmethod
def read(self, len=1024: int, buffer=None: Any) -> Union[bytes, int]:

It feels a bit weird to have the return type change based on whether a buffer is passed or not. How about two functions?

class TLSWrappedBuffer(metaclass=ABCMeta):
    @abstractmethod
    def read(self, len: int = 1024) -> bytes:
        pass

    @abstractmethod
    def read_into(self, buffer: Any, len: int = 1024) -> int:
        pass

Also I don't like the len name here since it shadows len().

Copy link
Owner Author


Two functions is fine.

"""

@abstractmethod
def shutdown(self) -> None:
Copy link


I assume this is to take the place of the write_eof() function in MemoryBio, but my question then becomes: should we also expose an is_shutdown boolean?

Copy link
Owner Author


I'm not sure. What value would it provide?

Copy link


CPython's ssl module has a shutdown method with a different meaning. It is used to shut down a connection on socket level, too.

https://docs.python.org/3/library/ssl.html#ssl-sockets
https://docs.python.org/3/library/socket.html#socket.socket.shutdown

Copy link


Well, I was just looking at the current MemoryBio implementation; it has a property to determine whether the buffer has been shut down or not. I've never actually used the memory BIO so I don't know, but presumably it'd be useful to prevent getting an exception if you've already closed the buffer, or as a signal to other things that this is finished. I could imagine something like:

tls_buffer = ...
while not tls_buffer.is_shutdown:
    data_to_write = a_write_queue.get(block=True)
    if data_to_write is AShutdownSigil:
        tls_buffer.shutdown()
    else:
        tls_buffer.write(data_to_write)

There are other ways to write this of course (a while True loop with a break comes to mind) but it feels useful to me?

Copy link
Owner Author


CPython's ssl module has a shutdown method with a different meaning.

True. I should note that this shutdown isn't on the socket object, it's on the buffer object. The socket object does not define a shutdown. That said, I can see that there is potential for confusion here.

Copy link
Owner Author


There are other ways to write this of course (a while True loop with a break comes to mind) but it feels useful to me?

Useful, sure. But ultimately, I want to balance that utility against the size of this API. Each extra thing that goes into it is more code that we have to convince people to write and support. I'm not sure the utility of this justifies adding it.

Copy link


Gotcha! It confused me already. :)

To increase general level of confusion, let's not forget that ssl.SSLObject implements unwrap to call shutdown of _ssl._SSLSocket, which internally calls SSL_shutdown().

Copy link


write_eof is probably a slightly better name?

Also, this is effectively a variant of write, so it should be documented to potentially raise the same exceptions, I think?

pep-0xxx.rst Outdated

::

NextProtocol = namedtuple('NextProtocol', ['name'])
Copy link


This feels like it should be an enum.Enum, like:

import enum

class NextProtocol(enum.Enum):
    H2 = b"h2"
    H2C = b"h2c"
    HTTP1 = b"http/1.1"
    WEBRTC = b"webrtc"
    C_WEBRTC = b"c-webrtc"
    FTP = b"ftp"
    STUN = b"stun.nat-discovery"
    TURN = b"stun.turn"

I would still allow passing arbitrary bytes though.

Copy link
Owner Author


Yeah, can't hurt.

pep-0xxx.rst Outdated
::

class TLSVersion(Enum):
MINIMUM_SUPPORTED
Copy link


I feel like this is a bit of a footgun. Maybe it would make sense to also add a SECURE_MINIMUM and have that be the default value for TLS version minimum versions?

Copy link
Owner Author


I am very nervous about doing that. Adding anything with "secure" in the name basically forces python-dev into a position where they are once again tasked with making security-critical updates to a package. I would vastly prefer that this module didn't even pretend that it was making security decisions for you, such that we don't have to sell the idea of regularly updating the module.

We can still have a SECURE_MINIMUM that is the default, it just shouldn't be the exposed. While I'm here I should note that create_default_context could easily operate on the abstract context type, which would give us a way to enforce this in one place.

Copy link


I think that pretending python-dev isn't going to be in a position where they are tasked with making security-critical updates to a package is the same kind of short-sightedness that led to the ssl module never getting updated for new security considerations because it was possible to use it securely. We're going to have a default minimum and it's (hopefully) going to be a "secure" minimum, so we're going to need to update it regardless of whether we expose it or not. All exposing it does is allow someone to revert back to a secure minimum without having to hardcode that in their application.


I'm definitely opposed to publicly exposing a "secure minimum" because the definition of secure is too flexible with regard to TLS.

However, what's the scenario where a user would legitimately want to ask for MINIMUM_SUPPORTED without being capable of typing TLSv1 (or, maybe for evil fuzzing purposes, SSLv3) themselves?

Copy link
Owner Author


So I can probably warm up to removing MINIMUM_SUPPORTED. It's mostly just there for symmetry right now.

pep-0xxx.rst Outdated
"""
Creates a Certificate object from a byte buffer. This byte buffer
may be either PEM-encoded or DER-encoded. If the buffer is PEM
encoded it *must* begin with the standard PEM preamble (a series of
Copy link


What about the fairly common case where you have multiple certificates and a private key in the same file? In this case the private key might be first.

pep-0xxx.rst Outdated
@abstractclassmethod
def from_buffer(self,
buffer: bytes,
password: Optional[Union[Callable[[], Union[AnyStr, bytearray]], AnyStr, bytearray]] = None) -> Certificate:
Copy link


I think you meant to have this return a PrivateKey?

Copy link
Owner Author


Yup.

pep-0xxx.rst Outdated
"""
Creates a PrivateKey object from a byte buffer. This byte buffer
may be either PEM-encoded or DER-encoded. If the buffer is PEM
encoded it *must* begin with the standard PEM preamble (a series of
Copy link


Same question as above, what happens in the common case of certificates + private key in the same file, the certificate might come first?

Copy link
Owner Author


Good question. I'm not honestly sure. I'm not even honestly sure that we care that much.

The risk is that simply defining a function that takes a path/buffer and returns a list of Certificate and a PrivateKey forces our Certificate and PrivateKey objects to become concretized: that is, they need to be backed by actual X509 objects, because they can't just all point at the same file anymore. This hurts some possible optimisations (e.g. having Certificate objects actually just be a wrapper to the path to the cert for OpenSSL).

Ultimately, I think this "common case" is actually an OpenSSL special (I don't know that other backends support this, and some definitely don't e.g. SecureTransport), so I'm inclined to say that it doesn't belong in the generic implementation.

pep-0xxx.rst Outdated
private key. It will only be called if the private key is encrypted
and a password is necessary. It will be called with no arguments,
and it should return a string, bytes, or bytearray. If the return
value is a string it will be encoded as UTF-8 before using it to
Copy link


This feels kind of wrong to me. Should we just mandate bytes or bytearray? If the calling code has a str, it's pretty trivial for them to do a value.encode("utf8") on their own without special-casing UTF-8 here (which could possibly do the wrong thing if someone meant to, say, encode it in shift-jis).

Copy link
Owner Author


The failure mode for getting this wrong here is really obvious and causes no threat. I don't think it hurts us to just tolerate this easy mistake.

Copy link


I think the failure mode is entirely non-obvious. It'll get exposed as your password not working and I think the typical end user is going to first double check they typed it correctly then they're going to try it in other applications and see it works there. I don't think it's obvious at all that they accidentally passed a str instead of a bytes.

Copy link
Owner Author


They're going to try it in other applications that will encode it using SHIFT-JIS? Really? What application is that?

Are there in fact any applications that encode a password in any encoding other than UTF-8, Latin-1, or ASCII?

Copy link


My assumption is that openssl isn't going to do any encoding at all and is going to just pull bytes off of stdin (or whatever). If their system is configured to use SHIFT-JIS then that will work, and Python will happily read() that into a str by default.

Copy link


To be clear, I mean the openssl CLI, but I assume the C api doesn't either.

Copy link
Owner Author


So the error case here is that the user has a non-UTF8 or ASCII locale, has typed a password using non-BMP characters, and then uses sys.stdin.read() or the equivalent?

I am extremely unsure that that error case will ever come up, but I guess it's not literally impossible either.

Copy link


@Lukasa As I understand it, and I could be wrong, Shift-JIS is popular in Japan and the users who use it tend to have their locales set up for it. It doesn't seem that far out of line to assume that someone in Japan, using a Shift-JIS locale, might use a Japanese letter in their password. For example:

PASSWORD = "学校"

assert PASSWORD.encode("utf8") != PASSWORD.encode("shift-jis")

Copy link
Owner Author


It's ok, I've already conceded this point. ;) It's in my task tracker as a TODO to remove this.

pep-0xxx.rst Outdated
@abstractclassmethod
def from_file(self,
path: Union[pathlib.Path, bytes, str],
password: Optional[Union[Callable[[], Union[AnyStr, bytearray]], AnyStr, bytearray]] = None) -> Certificate:
Copy link


As above, I think you meant to return a PrivateKey here.

Copy link
Owner Author


Yup.

pep-0xxx.rst Outdated
Cipher Suites
~~~~~~~~~~~~~

Todo
Copy link


Two ideas:

The OpenSSL cipher spec is not entirely unreasonable here; maybe we could make a few objects that basically implement it, but in a typed way, so that instead of:

ciphers = "ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:!aNULL:!MD5:!DSS"

You would get

ciphers = (
    (
        ((Ciphers.ECDH + Ciphers.AESGCM) & (Ciphers.DH + Ciphers.AESGCM)) |
        ((Ciphers.ECDH + Ciphers.ChaCha20) & (Ciphers.DH + Ciphers.ChaCha20))
    ) &
    (Ciphers.ECDH + Ciphers.AES256) &
    (Ciphers.DH + Ciphers.AES256) &
    (Ciphers.ECDH + Ciphers.AES128) &
    (Ciphers.DH + Ciphers.AES) &
    (Ciphers.RSA + Ciphers.AESGCM) &
    (Ciphers.RSA + Ciphers.AES) &
    ~Ciphers.aNULL &
    ~Ciphers.eNULL &
    ~Ciphers.MD5 &
    ~Ciphers.DSS
)

The above has a quick sketch of the possibility to define "equivalent" ciphers through the use of grouping and the | statement. Obviously if the implementation doesn't support "equivalent" (do any of them? I know OpenSSL doesn't but I really wish it did...) they could just treat this as a flat list.

The other idea I had is using something like enum.Flag.
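As a rough illustration of the enum.Flag idea (the member names below are hypothetical, and this flattens the +/& distinction from the sketch above into plain set membership):

```python
import enum

class Ciphers(enum.Flag):
    # Hypothetical members; a real catalogue would be far larger.
    ECDH = enum.auto()
    RSA = enum.auto()
    AESGCM = enum.auto()
    AES256 = enum.auto()
    MD5 = enum.auto()

# Flag members compose with |, &, and ~ out of the box, so a backend
# can test membership without ever parsing an OpenSSL cipher string.
preferred = Ciphers.ECDH | Ciphers.RSA | Ciphers.AESGCM
```

A backend would then check `Ciphers.AESGCM in preferred` rather than matching substrings of a cipher-string DSL; ordering and "equivalent cipher" grouping would still need a separate representation.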

@njsmith
Copy link

njsmith commented Feb 6, 2017

...Also, I looked into NSS a little bit, and it seems like it could support buffer-mode just fine? It wants to work on a PRFileDesc object, which is an OO abstraction over a file, so if someone wanted to implement a BytesIO-like in-memory PRFileDesc then NSS could happily use it.

In fact AFAICT you have to do something like this anyway if you want to implement the TLSWrappedSocket abstraction, because when working with a non-blocking socket, the NSS send/recv functions just return a generic EWOULDBLOCK equivalent, without specifying whether this is a WantRead or WantWrite. The only way I can see to extract this information is to implement a custom PRFileDesc that sits in between the NSS layer and the real socket and keeps track of whether the last socket operation NSS attempted was to recv or to send. (Example)
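The tracking shim described here can be sketched in pure Python; the names are illustrative only, since real NSS code would implement this as a custom PRFileDesc layer in C:

```python
class DirectionTrackingTransport:
    """Sits between a TLS engine and the real transport and records
    whether the engine last tried to receive or to send, so that a
    generic EWOULDBLOCK can later be translated into the appropriate
    WantReadError or WantWriteError."""

    def __init__(self, sock):
        self._sock = sock
        self.last_op = None  # "read" or "write"

    def recv(self, nbytes):
        self.last_op = "read"
        return self._sock.recv(nbytes)

    def send(self, data):
        self.last_op = "write"
        return self._sock.send(data)

class FakeSocket:
    """Trivial transport used only to demonstrate the tracking."""
    def recv(self, nbytes):
        return b""
    def send(self, data):
        return len(data)
```

When the TLS engine reports a would-block condition, the wrapper consults `last_op` to decide which Want* exception to raise.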

@Lukasa
Copy link
Owner Author

Lukasa commented Feb 6, 2017

I mean, I think the difficulties are pretty minor. It's a trivial runtime check in the library/application: if backend.wrapped_buffer is None: raise WhateverError("This TLS backend does not support wrapped buffers").

I am not proposing that we wouldn't implement the wrapped buffers for SecureTransport, OpenSSL, SChannel and friends. I am just proposing that we wouldn't go so far in this PEP as to say "and here's a concrete TLSWrappedSocket object that operates in terms of TLSWrappedBuffer". We could unquestionably provide such code in supporting documentation, and provide encouragement to users to implement in that way, but it does not necessarily seem prudent to assume that we want to provide a concrete implementation in the standard library. Though again, I'm willing to be talked around on this issue.

As to the keeping track of NSS thing, that's not an enormous surprise: I'll have to do the same for SecureTransport.

@njsmith
Copy link

njsmith commented Feb 6, 2017

OK, right, for me the potential value-add would be if we could drop TLSWrappedSocket from the PEP altogether, and shift it into the stdlib as one particular user of the API. The motivation here is that generally when you have an API that has to be implemented multiple times, then it's good to reduce the surface area of that API as much as possible, to minimize the cost of implementing it (since this gets multiplied by the number of implementations) and minimize the number of interoperability bugs. I mean, stdlib code is not particularly malleable either, but it's still better than a multiply-implemented specification written in English in a PEP.

TLSWrappedSocket has a huge surface area – the PEP currently glosses over it, but all of the semantics of socket are incorporated by reference, which makes it substantially larger than the entire rest of the spec, and very poorly specified. The edge cases around socket objects are incredibly complicated.

Go here and search for the string "changed in", then consider that if this PEP had been active at the time, then each of those lines represents a separate change that you would have to evaluate, and then potentially update the PEP and coordinate with all implementations to get the changes in... (Concrete example: currently socket objects have a relative timeout that applies to each call individually. Golang sockets have a way better idea, which is to have an absolute deadline after which calls start timing out. Python really should steal this feature. But this PEP as currently written would make it substantially harder to improve the socket API in the future.)

Certainly providing a piece of common code for implementations to use could potentially help mitigate this. But there's still a huge difference in principle between "a piece of code in the stdlib maintained by python-dev and updated on their cycle" and "a formal specification with multiple implementations across different projects, some of which might happen to share some code". Writing a good spec is hard, or put another way, specs are a problem. Sometimes they're a necessary evil, but we should try to make them as small as possible :-).

OTOH, as far as I know the only advantages we get from putting TLSWrappedSocket in the spec are (a) that it might allow us to support some TLS implementations that can't handle the buffer interface and (b) it might have some performance benefits. The counter-argument to (a) is that since NSS appears to allow buffers and kTLS doesn't exactly exist, there aren't any actual examples of such implementations to point to? And the counter-argument to (b) is that it's a lot easier to add optimizations later than to remove premature optimizations :-).

I get the feeling that you perceive some additional advantages to TLSWrappedSocket that I'm not understanding, which is fine – I often miss things :-), and I'm sure you'll make a good decision. So all that said, the one thing I absolutely insist on happening is that the PEP gain some rationale section laying out these different options/trade-offs and explaining the reason why you ultimately decided on whatever you decide on.


As a re-negotiation is possible at any time, a call to
``readinto()`` can also cause write operations.
"""
Copy link


Speaking of minimizing spec surface area – read and readinto seem redundant, since read can be straightforwardly implemented in terms of readinto. Is there some significant performance advantage to supporting both, or would it make sense to drop one?

Copy link
Owner Author


Perversely, readinto is not always trivial to implement in a useful way. In particular, CFFI frequently struggles with doing readinto because it is often reluctant to allow you to pass a buffer object into C code thanks to PyPy's GC. With many CFFI tools, the reverse is what's done: readinto is implemented in terms of read.

And given that it is really easy to write both by just writing one and then implementing the other in terms of it, there is basically zero cost in specifying that the implementation must provide both, rather than requiring every implementer in the world who wants read instead of readinto to write their read function. =)
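To illustrate how cheap that symmetry is, here is a sketch of read implemented on top of readinto (the class and method signatures are assumptions for demonstration, not the PEP's final interface); the same trick works in the other direction for CFFI-based backends:

```python
class BufferedSource:
    """Toy data source implementing readinto(), with read() layered
    on top of it, to show that providing both costs almost nothing."""

    def __init__(self, data):
        self._data = data

    def readinto(self, buffer, amt=1024):
        # Copy up to amt bytes into the caller-supplied buffer and
        # return how many bytes were actually written.
        n = min(amt, len(buffer), len(self._data))
        buffer[:n] = self._data[:n]
        self._data = self._data[n:]
        return n

    def read(self, amt=1024):
        # read() in terms of readinto(): allocate, fill, trim.
        buf = bytearray(amt)
        n = self.readinto(buf, amt)
        return bytes(buf[:n])
```

A backend that natively has only read() would instead implement readinto() by calling read() and copying the result into the caller's buffer.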

@Lukasa
Copy link
Owner Author

Lukasa commented Feb 6, 2017

So I can totally understand the desire to want to write a single TLSWrappedSocket implementation. My biggest concerns here are that in some cases we may be introducing dramatically more complexity to the stack by requiring support for buffers rather than allowing objects to just support sockets if they want to.

Returning to NSS, while I believe it is possible to write an NSS implementation that works on buffers, I'd be disinclined to call it "easy" until I've seen one. However, it's always going to be easier to write an NSS implementation that wraps a socket because that's the default mode in which it is deployed by almost all applications.

The same is true for SecureTransport: while it's definitely possible to implement it in terms of wrapped buffers, it's not trivial, in part because ST has a few assumptions predicated on the idea that the read/write callbacks will talk directly to FDs instead of to buffers. All of these can be coped with, of course, but they're non-trivial.

Regardless, I'm going to keep tweaking this ST prototype I'm working on to see if we can get to a good place with the wrapped socket to use as a discussion point.

@glyph
Copy link

glyph commented Feb 6, 2017

I'm not sure which way I'm going to advocate here, but, one question is:

Is providing a sans-IO layer a required part of compliance this specification, or only optional?

If blocking is required but sans-IO is optional, then that's explicitly relegating asynchronous I/O to a second class citizen. (Which, granted, libnss has totally done; but, as I understand it, that's one of the things that derailed various attempts to get rid of OpenSSL in favor of it).

If sans-IO is required, then what's the point in doubling implementation effort for every provider? For many backends, the "wrap a socket" library is the more heavily exercised code path, and may expose a superficially more convenient API; however, if an in-memory implementation is possible, then eschewing the native (potentially buggier, edge-case-unfriendly, needlessly different in behavior) implementations and having one common Python one seems like the way to go.

@Lukasa
Copy link
Owner Author

Lukasa commented Feb 6, 2017

Neither is required: both are optional. It wouldn't be very helpful to provide neither, of course. ;) In practice, I think the stdlib solutions would all choose to do both.

@Lukasa
Copy link
Owner Author

Lukasa commented Feb 7, 2017

Ok, so here is what I think is my suggestion for a compromise position.

Why don't we keep the ABC for the socket, but also provide a concrete implementation that backends can use if they see fit? That way backends that want to do their own thing are free to do so, but backends that are happy to have their I/O implemented in terms of a Python socket object can get that done as well. I think I've convinced myself that I'm happy with the ability to write such a socket by playing about with SecureTransport, so I'm open to moving forward with this PEP by simply noting that such an implementation will be provided as a part of the implementation of this PEP.

The major people who have expressed an interest in this are @glyph, @njsmith, and @tiran. How does that compromise position sit with the three of you?

@glyph
Copy link

glyph commented Feb 7, 2017

That's what I thought the original position was :).

I'll try to clarify my own position: I think this is a great opportunity to send the message that it's not the TLS library's job to do I/O, that this is the sort of task best delegated to a high-level language rather than C, and avoid a bunch of potential bugs in the process.

Synchronous I/O should be the second class citizen, not because I prefer asynchronous (surprise, I do) but because synchronous I/O is inherently an abstraction on top of a sans-IO API.

However, I understand that this message is somewhat "against the grain", that some libraries might be insufficient to the task.

Worth raising at this point, I think, is s2n. While it is my understanding that PRFileDesc has all the various callbacks that one would need to turn libnss inside-out, s2n has no notion of a sans-IO layer at all. This is a bug that they're already aware of, but if someone wanted to make an s2n backend today they would be unable to do so without an ABC defining the blocking interface.

My perspective is that the clear message should be "s2n is insufficient for a python backend at this time, get on fixing 358 and when it has an abstraction layer it can be integrated". The key question, though, is whether this work wants to be prescriptive or descriptive: if it's designed to record the state of the art without too much editorializing, then providing both ABCs is necessary. If it's an opportunity to promote better design (which, I should stress, almost all TLS libraries are already capable of achieving even if it's the road less traveled) then we should only provide the one.

@njsmith
Copy link

njsmith commented Feb 8, 2017

My biggest concerns here are that in some cases we may be introducing dramatically more complexity to the stack by requiring support for buffers rather than allowing objects to just support sockets if they want to.

I think I see where you're coming from here, but I've been trying to think through the cases and I'm not convinced that it's true.

So first, there are cases like s2n (thanks @glyph!) that must do their own I/O. If we require support for buffers, then these libraries simply won't be supported by the standard Python TLS interface. This is somewhat unfortunate, but it definitely doesn't increase the complexity of anything – libraries that don't exist are always simpler than libraries that do :-). And I agree with glyph that there's no need to be held hostage by such libraries; the vast majority of TLS implementations do support buffers, including all the ones we want in the stdlib, and it sounds like s2n is planning to add this soon anyway.

Next, there are cases like SecureTransport where you're saying that supporting buffers is non-trivial but doable. The SecureTransport backend has to support the sans-IO API though, because otherwise Twisted can't do TLS on MacOS – that's a feature-driven decision, it's irrelevant what the PEP says. So really the two options here are: either SecureTransport supports the buffer API and then uses generic code to convert that into a wrapped socket, or else SecureTransport supports the buffer API and also has its own private wrap_socket implementation. The latter option obviously has higher complexity, given that it involves redundant implementations of the same thing. (In practice I assume you aren't actually planning to implement a private socket wrapper in your SecureTransport code, but rather re-use the generic one?) So here, again, allowing backends to wrap sockets directly doesn't help reduce complexity at all.

Finally, there are cases like NSS: it probably could support buffers, but if it's supported at all it'll be by some third-party wrapper on PyPI, and maybe the author of that wrapper happens to not care about Twisted and only needs the wrapped-socket support. If so, and if the PEP allows them the choice of between implementing the sans-IO API and then using a concrete socket wrapper, or else implementing the socket wrapper directly, they can pick whichever one makes life easier. And arguably:

it's always going to be easier to write an NSS implementation that wraps a socket because that's the default mode in which it is deployed by almost all applications.

But I'm not so sure this is true? Sure, if we just look at the size of the code implementing the core talking-to-NSS functionality, then obviously the version that uses NSS's existing socket APIs will be simpler than the version that implements a buffer interface. But that's not the whole story. For the buffer interface, once you implement the handful of methods described in the PEP, you're done. For the socket-wrapper, you then need to reimplement the entire Python socket API from scratch on top of the NSPR abstractions. This is seriously non-trivial, because the Python socket API is a beast:

>>> len(dir(ssl.SSLSocket))
89

And then there's one more source of complexity to consider: realistically, you're not going to perfectly emulate the Python socket API; there's going to be weird corner cases you get wrong. (Like, "when this particular set of events occurs with this timing then in the regular wrapper implementation you get an OSError with errno X but in this implementation you get errno Y instead", that kind of thing.) So we also should count the complexity cost of downstream users who have to discover and come up with workarounds for these kinds of issues. In the buffers-only approach, this isn't an issue, because there's only one wrapper implementation that downstream users have to test against.

So taking a holistic perspective and just looking at complexity alone, I don't see any cases where allowing backends to skip implementing the buffers interface is a clear win.

@Lukasa
Copy link
Owner Author

Lukasa commented Feb 8, 2017

@njsmith I feel like you wrote a very long post that didn't address my most recent suggestion, which is to keep the TLSWrappedSocket ABC but also provide a concrete default implementation.

I understand and sympathise with @glyph's argument here, and but for NSS I think I'd be willing to bite. But I think NSS is a use-case we need to support: it's fairly widely deployed, and is useful to get stakeholder support. What I want to know is: how much does it hurt to do that? If the answer is "a lot" then I'm inclined to want to give NSS an escape hatch, whereas if the answer is "only a bit" then I'm much more willing to say "oh well, nvm then, we can make NSS contort to fit".

I'm reluctant to be too ideological here: this PEP is ultimately about pragmatically attempting to solve the problem of Python distribution rather than a treatise on good system design, and so I'm entirely happy to compromise on the sans-IO approach in order to build a bigger tent. s2n doesn't bother me (it's not heavily used), but NSS does, and so if getting NSS to work with buffers is going to double or triple the workload required to get it done then I'm disinclined to impose that cost.

I think someone needs to step up to investigate @njsmith's suggestion for how to make NSS be sans-IO'y, and to see if it actually works.

@hynek
Copy link

hynek commented Feb 8, 2017

I have to agree with Cory here: while I love sans-IO and async, I really hope we can keep these ideological considerations out of this PEP. Realistically speaking, most people care only about I/O-based wrappers (at least for now). Forcing an interface on them that treats this approach as a second-class citizen (which usually means a performance penalty too) would just undermine the relevance and adoption of this PEP. And I really want this PEP to succeed, for it seems like a viable way out of Python's valley of TLS sadness.

IOW: define both, mandate neither. Let people pick what’s best for them and their TLS implementations.

@Lukasa
Owner Author

Lukasa commented Feb 8, 2017

I'm even inclined to go a bit further than what @hynek wants and to provide a socket-in-terms-of-buffer object, and encourage implementations that want to to use it, but to just not slam the door in the face of those that want to just provide the wrapped-I/O approach.
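For what it's worth, the socket-in-terms-of-buffer idea can be sketched quite compactly. This is only an illustration, not the PEP's API: `NullTLSBuffers` stands in for a hypothetical buffer-based TLS object (it just shuttles bytes through unencrypted), and all of the method names here (`peek_outgoing`, `consume_outgoing`, `receive_from_network`) are invented for the sketch. The point is that the socket wrapper itself is a generic pump loop that never needs to know anything about the backend:

```python
import socket


class NullTLSBuffers:
    """Stand-in for a hypothetical buffer-based TLS object. It passes
    bytes through unencrypted; a real backend would encrypt/decrypt and
    raise WantReadError/WantWriteError as appropriate."""

    def __init__(self):
        self._outgoing = bytearray()  # "ciphertext" waiting to hit the wire
        self._incoming = bytearray()  # "ciphertext" received from the wire

    def write(self, data):
        # Application data in; ciphertext lands in the outgoing buffer.
        self._outgoing += data
        return len(data)

    def read(self, amt):
        # Application data out, decrypted from the incoming buffer.
        data = bytes(self._incoming[:amt])
        del self._incoming[:amt]
        return data

    def peek_outgoing(self, amt):
        return bytes(self._outgoing[:amt])

    def consume_outgoing(self, amt):
        del self._outgoing[:amt]

    def receive_from_network(self, data):
        self._incoming += data


class SocketOverBuffers:
    """Concrete wrapped socket implemented purely in terms of the buffer
    object: a generic pump loop moves ciphertext between the buffers and
    the real socket, so every backend gets socket support for free."""

    def __init__(self, sock, buffers):
        self._sock = sock
        self._buffers = buffers

    def send(self, data):
        n = self._buffers.write(data)
        # Flush whatever ciphertext the TLS object produced.
        pending = self._buffers.peek_outgoing(65536)
        while pending:
            sent = self._sock.send(pending)
            self._buffers.consume_outgoing(sent)
            pending = self._buffers.peek_outgoing(65536)
        return n

    def recv(self, amt):
        data = self._buffers.read(amt)
        while not data:
            chunk = self._sock.recv(65536)
            if not chunk:
                return b""  # peer closed
            self._buffers.receive_from_network(chunk)
            data = self._buffers.read(amt)
        return data
```

A real implementation would also need to cover timeouts, non-blocking sockets, and the handshake, which is exactly the hairy part the concrete default would encapsulate once.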

@glyph

glyph commented Feb 8, 2017

I think NSS is a use-case we need to support: it's fairly widely deployed, and is useful to get stakeholder support. What I want to know is: how much does it hurt to do that? If the answer is "a lot" then I'm inclined to want to give NSS an escape hatch, whereas if the answer is "only a bit" then I'm much more willing to say "oh well, nvm then, we can make NSS contort to fit".

Having reviewed the NSS docs a bit now, I can't say for sure, but it seems like there are two answers.

From the perspective of "how hard is it for someone who doesn't really know NSS to do", all of the example code for NSS does use blocking sockets, and so that would be the easiest to find some sample to crib from.

However, NSS depends heavily on NSPR, and NSPR has a robust and well-documented I/O abstraction layer. If you have a good grasp of how these layers interact with each other, then doing the work to put together a buffered I/O abstraction would be only very slightly more difficult (as you'd have to implement a few callbacks to populate your buffer). So while it's a little harder to find an example, the implementation work is nearly equivalent.

@njsmith

njsmith commented Feb 9, 2017

@Lukasa: I'm confused about why you're confused :-).

My understanding is: you propose that we allow (but don't require) backends to take full responsibility for implementing a socket wrapper object, and make the buffer-based interface optional. Your main motivation for this is that for backends like NSS (or maybe... just NSS), you think this might be less complex than requiring them to implement the buffer-based interface.

My concern about this isn't ideological; it's that the Python socket interface is extremely complicated and constantly evolving, so in practice backends are going to get it wrong and get out of sync with the stdlib, and user code is going to get stuck debugging broken socket APIs. As Glyph can tell you, socket APIs are already a mess to work with without us adding more broken wrappers :-). Plus, it provides only partial functionality compared to the buffers approach. So I think we should only allow this (even as a "compromise") if we really are getting a benefit from it. And you claim that benefit is in terms of complexity, so my message was an analysis of complexity :-).

And the main point is that I disagree with your assessment: AFAICT, even for NSS it actually isn't going to be easier to implement a socket wrapper directly. Sure, there are better examples for how to use the NSS API in the basic direct-blocking-I/O mode, but that's only half the battle. If you aren't going to use the buffers-to-socket-wrapper adapter, then you're also stuck implementing the whole Python socket API from scratch. Those examples don't show how to use the NSPR API to correctly emulate Python's semantics for sendmsg in all possible socket states (disconnected/non-blocking/blocking-with-timeout/...).

Am I correctly understanding your argument? If so, does that make sense as a critique?

@Lukasa
Owner Author

Lukasa commented Feb 9, 2017

Am I correctly understanding your argument? If so, does that make sense as a critique?

Yes, and yes. What I remain unsure about is whether we really want to go so far as to say "and therefore we will not provide a wrapped socket ABC but only a concrete implementation".

I buy the argument that says "wrapping sockets is hard, don't do it yourself", but I don't know that I buy the extension that says "and therefore we will not spec it". Still, perhaps I'm just wrong on this and haven't fully internalised that yet.

@Lukasa
Owner Author

Lukasa commented Feb 9, 2017

Ok, I'm going to put aside my own misgivings for a moment. I've reworked the PEP to include the concrete wrapped socket. I'm going to run this past the security-SIG in that form once more to sanity-check before I take it to python-dev.

@reaperhulk

✨ to @Lukasa for doing all this work so far.

@njsmith

njsmith commented Feb 10, 2017

Regarding the problem of making a buffer-based wrapper for NSS, I spent a few minutes grepping the code. Figured I might as well write it down here in case it saves someone some time in the future... As far as I can tell, these are the socket callbacks that you have to implement (i.e., the ones that NSS will actually call at some point during normal usage):

  • send, recv: obviously.
  • connect: Called by ssl_SecureConnect. I'm not sure if this gets called on the normal path, but worst case we can define a connect that does nothing and claims success.
  • shutdown: This can get called in a few places, including during handshake failure. Specifically, if a server is configured to require valid certificates from clients, and a client doesn't provide one, then instead of trusting regular error reporting NSS will immediately shut down the socket to make sure that the server notices the problem. So... that's a thing. Fortunately this is not too hard to emulate (just set an internal flag that makes future send/recv calls fail).
  • getpeername (!?): This is used to (a) check if a socket is connected, (b) get the ip+port to use as keys into the internal session cache. It looks like these are only used if opts.noCache is set, though, so a wrapper could set this and then return arbitrary values here. Or if you wanted to get fancy your TLSBufferedWrapper could expose an extension method to set the remote host/ip.
  • close: to free resources, of course.

There also appears to be some code that blocks the handshake while it goes off to do an OCSP check, but I believe that's disabled by default.

So you have to implement the buffering logic and of course all the TLS configuration and such, and then beyond that there are a few extra fiddly bits, but it doesn't look too bad.
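The fiddly bits from the list above — the shutdown flag, the stubbed getpeername, the no-op connect — can be sketched as pure-Python state. To be clear about assumptions: the real callbacks would be C functions registered in an NSPR I/O layer, and nothing below is actual NSS/NSPR API; this only shows the semantics a buffer-backed layer would need to emulate:

```python
class FakeNSPRLayer:
    """Pure-Python sketch of the state a buffer-backed NSPR I/O layer
    would track, per the callback list above. Illustrative names only."""

    def __init__(self):
        self._incoming = bytearray()   # bytes fed in from the network
        self._outgoing = bytearray()   # bytes NSS has asked us to send
        self._shut_down = False

    def recv(self, amt):
        if self._shut_down:
            raise BrokenPipeError("layer was shut down by NSS")
        data = bytes(self._incoming[:amt])
        del self._incoming[:amt]
        return data

    def send(self, data):
        if self._shut_down:
            raise BrokenPipeError("layer was shut down by NSS")
        self._outgoing += data
        return len(data)

    def shutdown(self, how):
        # NSS calls this itself, e.g. when a client fails to present a
        # required certificate; emulate by flagging the layer so that
        # future send/recv calls fail.
        self._shut_down = True

    def getpeername(self):
        # Only consulted for connectedness checks and the session cache;
        # with noCache set, a fixed placeholder address suffices.
        return ("0.0.0.0", 0)

    def connect(self, addr):
        # In-memory buffers have no real connection; report success.
        return 0

    def close(self):
        # Free resources; nothing to do for plain buffers.
        self._incoming.clear()
        self._outgoing.clear()
```

The interesting engineering then lives entirely in wiring `recv`/`send` to the application's incoming/outgoing buffers, which is the same pump-loop shape any buffer-based backend needs.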
