Steganography and encryption
Cloak works fundamentally by shaping whatever traffic is sent through it as HTTPS traffic. This would be a trivial task if Cloak simply tunnelled the traffic through TCP port 443 and added TLS headers. Doing so is enough to trick simple classifiers like the one Wireshark uses, which look only for fixed byte patterns and magic numbers; this is perhaps what most commercial firewalls and ISPs in non-authoritarian countries do. However, since our state-level adversaries have been known to employ sophisticated deep-packet-inspection techniques that look for "fingerprints" throughout an entire network session, defeating these measures is a non-trivial task.
In order to be ready to serve as a transparent proxy between a non-Cloak visitor and a normal website, Cloak must be able to establish the visitor's identity without any round trip, i.e. upon the reception of the first packet.
To do this, Cloak's server first generates a static elliptic-curve Diffie-Hellman (ECDH) key pair and distributes the public key to potential clients. Cloak's client then generates another, ephemeral ECDH key pair. The pre-distributed static public key and the ephemeral private key are used to generate a shared secret, with which the client's secret identifier (UID), along with some miscellaneous information (such as the replay prevention timestamp), is symmetrically encrypted. The public part of the ephemeral key and the encrypted UID are sent together to the server in the first packet.
Upon the reception of the first packet, Cloak's server first does a simple check on its format. Then it calculates the shared secret from the private part of the pre-generated key and the client-generated public part of the ephemeral key. With this shared secret, the UID is decrypted. The UID is then queried in a database to see if it's authorised. The connection initiator's identity is therefore established.
A session key is then generated by the server, and encrypted with an authenticated encryption scheme using the shared secret calculated above as key. This encrypted session key is then sent to the client. Since the encryption is authenticated, the client knows that successful decryption of this must mean that the server was in possession of the shared secret. Since the shared secret could only be calculated with the static private key, the server must be in possession of the static private key. This validates the identity of the server to the client.
- Port number correlation. For instance, TCP/UDP 1194 is the default port for OpenVPN, and rarely used by anything else. Firewalls can simply block these ports with little collateral damage.
- Simple protocol recognition, such as the handshake protocol of SOCKS5.
- Deep packet inspection. Analysing packet contents with more sophisticated methods to identify the protocol in use.
- Traffic pattern recognition. Correlating metadata such as packet timing and packet size with specific proxy protocols.
- Probing. Sending carefully constructed packets to suspected proxy servers to trigger specific responses for positive identification 1.
- Traffic tampering. In cases where the proxy protocol does not use authenticated encryption (such as early versions of Shadowsocks), a MITM may alter bytes at particular positions (such as the bytes representing content length) to trigger identifiable reactions from the proxy server 2.
Cloak aims to address all of these attack vectors.
A 16-byte long UID is generated by the server and distributed to one user through a secure channel.
A Curve25519 key pair is generated by the server. Its two halves will be known as the static public key and the static private key. The static private key is kept secret on the server. The static public key is distributed publicly.
The redirection address is an IP address with a TCP port. If the server determines that an incoming connection does not belong to an authorised Cloak user, the server will serve as a transparent proxy between this incoming connection and the redirection address.
The user decides the proxy method they wish to use, which is an ASCII string of at most 12 characters. This is normally the name of the underlying proxy protocol the client wishes to use, such as "openvpn".
The user decides the encryption method they wish to use. In the configuration file this can be `plain`, `aes-gcm` or `chacha20-poly1305`. This is parsed as a single byte; the client and the server have a hard-coded agreement on which value represents which algorithm.
The user decides the server name, which is a domain name. This will be transmitted in plaintext over the wire, and it is what the client would like the firewall to believe the client is visiting. Therefore it must be an innocent, unblocked domain name.
A 32-bit unsigned integer session id is generated.
One or more TCP connections are established with the Cloak server. The number of TCP connections is determined by client-side configuration. Each TCP connection undergoes its own handshake.
Another Curve25519 key pair is generated by the client. Its two halves will be known as the ephemeral public key and the ephemeral private key.
The client computes the scalar multiplication of the ephemeral private key and the static public key, generating a 32-byte long shared secret.
An authentication data byte stream is then constructed:
UID | Proxy Method | Encryption Method | Timestamp | Session Id | Flag | reserved |
---|---|---|---|---|---|---|
16 bytes | 12 bytes | 1 byte | 8 bytes | 4 bytes | 1 byte | 6 bytes |
Currently there is only one flag option, `UNORDERED_FLAG`, which is used when the proxy client relies on UDP. Authentication data is 48 bytes long in total.
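The layout in the table above can be sketched with Python's `struct` module. This is only an illustration of the field sizes: the big-endian byte order, the NUL padding of the proxy method, the Unix-seconds timestamp and the numeric value of `UNORDERED_FLAG` are assumptions, not something this page specifies.

```python
import struct
import time

UNORDERED_FLAG = 0x01  # hypothetical value; the text only names the flag


def pack_auth_data(uid, proxy_method, encryption_method, session_id,
                   flag=0, timestamp=None):
    """Assemble the 48-byte authentication data (layout as in the table)."""
    if len(uid) != 16:
        raise ValueError("UID must be exactly 16 bytes")
    method = proxy_method.encode("ascii")
    if len(method) > 12:
        raise ValueError("proxy method must be at most 12 ASCII characters")
    if timestamp is None:
        timestamp = int(time.time())  # assumed: Unix time in seconds
    # 16s UID | 12s proxy method (struct pads short values with NUL bytes)
    # B encryption method | Q timestamp | I session id | B flag | 6x reserved
    # ">" selects big-endian with no alignment padding (assumed byte order).
    return struct.pack(">16s12sBQIB6x", uid, method, encryption_method,
                       timestamp, session_id, flag)
```

For example, `pack_auth_data(b"\x01" * 16, "openvpn", 1, 0xDEADBEEF)` yields a 48-byte string whose bytes 16 to 27 hold `openvpn` followed by five NUL bytes.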
The ephemeral public key is then marshalled into a 32-byte long representation.
Authentication data is then encrypted using `AES-GCM` (`AES-GCM` is used regardless of the encryption method), with the first 12 bytes of the marshalled ephemeral public key as the nonce and the 32-byte shared secret as the key. This produces a 48-byte long authentication ciphertext followed by a 16-byte long authentication tag.
A standard TLS 1.3 `ClientHello` message is composed. Since different applications and browsers add slightly different data to their `ClientHello` messages, some particular fields (such as `Cipher Suites`) can be used for "fingerprinting" the application. Cloak will imitate the "fingerprint" of Chrome and Firefox.
The `Random` field of `ClientHello` is substituted with the marshalled ephemeral public key.
The `Session ID` field of `ClientHello` is substituted with the first 32 bytes of authentication ciphertext.
The "TLS extension" field `Server Name Indication` is generated according to the server name set in the configuration file.
The extension field `Key Share` will have a `Key Share Entry` for Group `x25519` (the identifiers are specified in RFC 8446). The `Key Exchange` field is substituted with the remaining 16 bytes of authentication ciphertext followed by the 16-byte authentication tag.
This `ClientHello` message is sent off to the server through this TCP connection.
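The way the 32-byte ephemeral public key, the 48-byte ciphertext and the 16-byte tag are spread over the `ClientHello` fields can be sketched as follows. The function and dictionary key names are hypothetical; the sketch only covers the byte slicing, not the encryption itself or the construction of a real `ClientHello`.

```python
def auth_nonce(ephemeral_public):
    """The AES-GCM nonce is the first 12 bytes of the marshalled public key."""
    return ephemeral_public[:12]


def embed_in_client_hello(ephemeral_public, ciphertext, tag):
    """Map the handshake material onto the three ClientHello fields."""
    if len(ephemeral_public) != 32:
        raise ValueError("ephemeral public key must be 32 bytes")
    if len(ciphertext) != 48 or len(tag) != 16:
        raise ValueError("expected 48-byte ciphertext and 16-byte tag")
    return {
        "random": ephemeral_public,               # Random field
        "session_id": ciphertext[:32],            # first 32 ciphertext bytes
        "key_share_x25519": ciphertext[32:] + tag,  # last 16 bytes + tag
    }
```

Each of the three fields ends up exactly 32 bytes long, which matches the sizes a genuine TLS 1.3 `ClientHello` with an x25519 key share would carry.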
The resulting `ClientHello` looks like this:
TLSv1.3 Record Layer: Handshake Protocol: Client Hello
Content Type: Handshake (22)
Version: TLS 1.0 (0x0301)
Length: 512
Handshake Protocol: Client Hello
Handshake Type: Client Hello (1)
Length: 508
Version: TLS 1.2 (0x0303)
Random: 08c87c4bd3f392d7954c4d6444d5260a75a657c52943f08c… # marshalled ephemeral public key
Session ID Length: 32
Session ID: 622aa6f1c4b74920b5014eb2f6cbf1fb70166d325b9e8e1a… # the first 32 bytes of authentication ciphertext
Cipher Suites Length: 36
Cipher Suites (18 suites)
Compression Methods Length: 1
Compression Methods (1 method)
Extensions Length: 399
Extension: server_name (len=29)
Type: server_name (0)
Length: 29
Server Name Indication extension
Server Name list length: 27
Server Name Type: host_name (0)
Server Name length: 24
Server Name: www.google-analytics.com # Determined by Server Name configuration field
Extension: extended_master_secret (len=0)
Extension: renegotiation_info (len=1)
Extension: supported_groups (len=14)
Extension: ec_point_formats (len=2)
Extension: session_ticket (len=0)
Extension: application_layer_protocol_negotiation (len=14)
Extension: status_request (len=5)
Extension: key_share (len=107)
Type: key_share (51)
Length: 107
Key Share extension
Client Key Share Length: 105
Key Share Entry: Group: x25519, Key Exchange length: 32
Group: x25519 (29)
Key Exchange Length: 32
Key Exchange: 4cd2b4f9fae08192dfd15d8cac3467ba09948fcc0811a896… # remaining 16 bytes of authentication ciphertext followed by the 16 byte authentication tag
Key Share Entry: Group: secp256r1, Key Exchange length: 65
Extension: supported_versions (len=9)
Extension: signature_algorithms (len=24)
Extension: psk_key_exchange_modes (len=2)
Extension: record_size_limit (len=2)
Extension: padding (len=134)
The server will attempt to read a full TLS message according to the length specified in its "record layer". If the data does not have a valid "record layer", or if the server is waiting too long to receive the full length of data, the server will close the connection.
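As an illustration, reading one record could look like the sketch below. A TLS record header is 5 bytes: content type (1 byte), version (2 bytes) and length (2 bytes, big-endian). What exactly counts as a "valid" record layer here is an assumption (a known content type and major version 3), and the timeout handling is left to the caller's socket settings.

```python
import struct

RECORD_HEADER_LEN = 5
KNOWN_CONTENT_TYPES = (20, 21, 22, 23)  # CCS, alert, handshake, app data


def parse_record_header(header):
    """Return (content_type, length) or raise ValueError if not TLS-like."""
    if len(header) != RECORD_HEADER_LEN:
        raise ValueError("short record header")
    content_type, major, _minor, length = struct.unpack(">BBBH", header)
    # Assumed validity rule: known content type and major version 3.
    if content_type not in KNOWN_CONTENT_TYPES or major != 3:
        raise ValueError("not a TLS record layer")
    return content_type, length


def read_record(stream):
    """Read one full record (header + body) from a file-like object."""
    header = stream.read(RECORD_HEADER_LEN)
    _content_type, length = parse_record_header(header)
    body = stream.read(length)
    if len(body) != length:
        raise ValueError("truncated record")
    return header + body
```

A caller would treat any `ValueError` (or a read timeout) as the signal to close the connection.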
The server will attempt to parse the first TLS message as a `ClientHello` message. If this message is not a `ClientHello`, or if the message is malformed, the server will send the full TLS message to the redirection address and turn into a transparent proxy between the redirection address and the connection originator. Such action will be referred to as "rejecting" below. Rejection will always send the full first TLS message to the redirection address first, regardless of when the rejection happens. If the message is a well-formed `ClientHello`, it is unmarshalled into an object.
The `Random` field of the `ClientHello` message is looked up in the server's used random cache. If there is a hit, the `ClientHello` message has been replayed, and the connection is rejected. Otherwise this `Random` field is saved in the used random cache.
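A minimal sketch of such a cache is shown below. The class name, the expiry policy and the default lifetime are assumptions made for illustration; the text above only says that seen `Random` values are remembered.

```python
import time


class UsedRandomCache:
    """Remembers Random fields so that a replayed ClientHello is detected."""

    def __init__(self, ttl_seconds=360.0):  # hypothetical entry lifetime
        self._ttl = ttl_seconds
        self._seen = {}  # Random bytes -> time first seen

    def check_and_store(self, random_field, now=None):
        """Return True on a hit (replay); otherwise store and return False."""
        now = time.time() if now is None else now
        # Drop entries older than the lifetime so the cache stays bounded.
        self._seen = {r: t for r, t in self._seen.items()
                      if now - t < self._ttl}
        if random_field in self._seen:
            return True
        self._seen[random_field] = now
        return False
```

Expiring entries is a design choice of this sketch rather than something stated here; it keeps memory bounded at the cost of forgetting very old values.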
The `Random` field is then unmarshalled as the ephemeral public key. The server computes the scalar multiplication of its static private key and the ephemeral public key to derive the 32-byte shared secret.
The server reconstructs the authentication ciphertext from the `Session ID` field and the first 16 bytes of the Group `x25519` entry in the "TLS extension" field. If the said extension field is absent, the connection is rejected. Otherwise it then retrieves the authentication tag from the last 16 bytes of the Group `x25519` entry.
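The reconstruction can be sketched as plain byte slicing. The function name is hypothetical, and a `ValueError` stands in for "rejecting" the connection; the nonce is taken from the first 12 bytes of the `Random` field, since that field carries the marshalled ephemeral public key.

```python
def extract_auth_fields(random_field, session_id, key_share_x25519):
    """Recover (nonce, ciphertext, tag) from the ClientHello fields.

    key_share_x25519 is None when the x25519 key share entry is absent.
    """
    if key_share_x25519 is None:
        raise ValueError("x25519 key share absent: reject")
    if (len(random_field) != 32 or len(session_id) != 32
            or len(key_share_x25519) != 32):
        raise ValueError("unexpected field length: reject")
    nonce = random_field[:12]                     # first 12 bytes of the key
    ciphertext = session_id + key_share_x25519[:16]  # 32 + 16 = 48 bytes
    tag = key_share_x25519[16:]                   # last 16 bytes
    return nonce, ciphertext, tag
```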
The authentication ciphertext and the authentication tag are decrypted and authenticated using `AES-GCM`. If the authentication fails, the connection is rejected. Otherwise the authentication data is obtained in plaintext.
The authentication data is unmarshalled according to its construction method mentioned above. This gives the server the UID, proxy method, encryption method, timestamp and session id.
If the timestamp is outside of server's time plus or minus a tolerance, which is by default 3 minutes, the connection is rejected as a replay.
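The tolerance check amounts to a single comparison. In this sketch the timestamps are assumed to be Unix seconds and the boundary is treated as inclusive; neither detail is specified above.

```python
TOLERANCE_SECONDS = 3 * 60  # default tolerance of 3 minutes


def timestamp_within_tolerance(client_timestamp, server_now,
                               tolerance=TOLERANCE_SECONDS):
    """True if the client's timestamp is within server time +/- tolerance."""
    return abs(server_now - client_timestamp) <= tolerance
```

A connection for which this returns `False` would be rejected as a replay.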
The proxy method string is checked against the proxy book entry in the server's configuration. If there isn't an exact string match, the connection is rejected.
The encryption method value is checked against a list of values the server understands and allows. If it's an unknown value, the connection is rejected.
A 32-byte random session key is then generated. If the encryption method is not "plain", then an AEAD cipher corresponding to the encryption method is initialised with this session key. If it is "plain", the AEAD cipher object is `nil`. The session key and a reference to the AEAD cipher are saved in a new obfuscator object.
The UID is checked against a list of allowed UIDs. If there is no match, the connection is rejected. Otherwise, if the user has another session currently actively connected to the server, the reference to an active user object representing this user is retrieved. If the user is not currently active, an active user object representing this user is created, and its reference is saved.
The server can now be confident that the connection originator is an authorised Cloak user, and its critical data has not been altered over the wire.
The active user object maintains a list of active sessions belonging to this user. The received session id is looked up in this list. If there isn't a match, this particular incoming TCP connection belongs to a new session. A new session object is created and its reference is stored in the current active user object. A reference to the previously created obfuscator object is saved in this new session object. If there is a match, this particular incoming TCP connection belongs to an existing session. A reference to this session object is retrieved and the reference to the current TCP connection is added to the session to be used. The previously generated session key and obfuscator are discarded and will no longer be used, and the session key is retrieved from the session object instead.
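The lookup logic above can be sketched with two small classes. The class and method names are hypothetical, and the obfuscator object is reduced to just its session key to keep the example short.

```python
class Session:
    def __init__(self, session_id, session_key):
        self.session_id = session_id
        self.session_key = session_key  # stands in for the obfuscator
        self.connections = []


class ActiveUser:
    def __init__(self, uid):
        self.uid = uid
        self.sessions = {}  # session id -> Session

    def attach(self, session_id, connection, fresh_session_key):
        """Add a connection to its session, creating the session if new."""
        session = self.sessions.get(session_id)
        if session is None:
            # Unseen session id: the freshly generated key is adopted.
            session = Session(session_id, fresh_session_key)
            self.sessions[session_id] = session
        # For an existing session the fresh key is simply discarded and
        # the key already stored in the session keeps being used.
        session.connections.append(connection)
        return session
```

The caller would then use `session.session_key`, not the freshly generated key, when building the reply for this connection.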
A 12-byte long random nonce is generated. The session key, either previously generated or retrieved from an existing session, is encrypted using `AES-GCM` (`AES-GCM` is used regardless of the choice of encryption method) with this nonce and with the shared secret as key. This produces a 32-byte long encrypted session key with a 16-byte long authentication tag.
A standard TLS 1.3 `ServerHello` message is composed.
The `Random` field is substituted with the 12-byte long nonce followed by the first 20 bytes of encrypted session key.
The `Session ID` field is the same as the `Session ID` field in the original `ClientHello` message.
The extension field `Key Share` will have a `Key Share Entry` for Group `x25519`. Its entry is substituted with the remaining 12 bytes of encrypted session key, followed by the 16-byte authentication tag, followed by 4 random bytes.
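The placement of the nonce, the encrypted session key and the tag in the `ServerHello` can be sketched as follows. The function and key names are hypothetical, and the encryption step itself is not shown.

```python
import os


def embed_in_server_hello(nonce, encrypted_session_key, tag):
    """Map nonce, encrypted session key and tag onto ServerHello fields."""
    if len(nonce) != 12 or len(encrypted_session_key) != 32 or len(tag) != 16:
        raise ValueError("expected 12-byte nonce, 32-byte key, 16-byte tag")
    return {
        # 12-byte nonce + first 20 bytes of the encrypted session key
        "random": nonce + encrypted_session_key[:20],
        # remaining 12 bytes + 16-byte tag + 4 random padding bytes
        "key_share_x25519": (encrypted_session_key[20:] + tag
                             + os.urandom(4)),
    }
```

Both fields come out at 32 bytes, the sizes expected of a `Random` field and an x25519 key exchange value.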
A standard `ChangeCipherSpec` record is also composed. Then some random bytes are generated as an "`EncryptedCertificate`" message. There is no plaintext field indicating that it is an `EncryptedCertificate` message. These random bytes are wrapped in a standard `ApplicationData` record layer.
The server sends off `ServerHello`, `ChangeCipherSpec` and "`EncryptedCertificate`". The server has now completed its part of the handshake process and is ready to accept another TCP connection.
The resulting `ServerHello` looks like this:
TLSv1.3 Record Layer: Handshake Protocol: Server Hello
Content Type: Handshake (22)
Version: TLS 1.2 (0x0303)
Length: 122
Handshake Protocol: Server Hello
Handshake Type: Server Hello (2)
Length: 118
Version: TLS 1.2 (0x0303)
Random: 50d772e2b56169c855e5220ef0d6dc505974fb6f88ff0591… # 12-byte long nonce followed by the first 20 bytes of encrypted session key
Session ID Length: 32
Session ID: 622aa6f1c4b74920b5014eb2f6cbf1fb70166d325b9e8e1a… # same as the Session ID field in the original ClientHello message
Cipher Suite: TLS_AES_128_GCM_SHA256 (0x1301)
Compression Method: null (0)
Extensions Length: 46
Extension: key_share (len=36)
Type: key_share (51)
Length: 36
Key Share extension
Key Share Entry: Group: x25519, Key Exchange length: 32
Group: x25519 (29)
Key Exchange Length: 32
Key Exchange: e7202cba5e8a9a704468a5df3b1dc5b05d2d4d50e3954094… # remaining 12 bytes of encrypted session key, followed by the 16-byte authentication tag, followed by 4 random bytes
Extension: supported_versions (len=2)
Followed by ChangeCipherSpec
TLSv1.3 Record Layer: Change Cipher Spec Protocol: Change Cipher Spec
Content Type: Change Cipher Spec (20)
Version: TLS 1.2 (0x0303)
Length: 1
Change Cipher Spec Message
Followed by an ApplicationData
TLSv1.3 Record Layer: Application Data Protocol: http-over-tls
Opaque Type: Application Data (23)
Version: TLS 1.2 (0x0303)
Length: 182
Encrypted Application Data: 3df418f47fba754f0e01a7eab3b522fd11519d648abcf707…
The client reconstructs the encrypted session key and the authentication tag from the `Random` and Group `x25519` fields. They are decrypted and authenticated to produce the session key in plaintext. If the authentication fails, the client will close the TCP connection with the server.
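On the client side the reconstruction is the inverse of the server's field layout (12-byte nonce and 20 key bytes in `Random`; 12 key bytes, 16-byte tag and 4 padding bytes in the key share). The function name below is hypothetical and the decryption step is not shown.

```python
def extract_session_key_fields(random_field, key_share_x25519):
    """Recover (nonce, encrypted_session_key, tag) from the ServerHello."""
    if len(random_field) != 32 or len(key_share_x25519) != 32:
        raise ValueError("unexpected field length")
    nonce = random_field[:12]
    encrypted_session_key = random_field[12:] + key_share_x25519[:12]
    tag = key_share_x25519[12:28]  # the last 4 bytes are random padding
    return nonce, encrypted_session_key, tag
```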
The client will wait for the `ServerHello` from every TCP connection with the server to be received and decrypted.
If the encryption method is not "plain", then an AEAD cipher corresponding to the encryption method is initialised with this session key. If it is "plain", the AEAD cipher object is `nil`. The session key and a reference to the AEAD cipher are saved in a new obfuscator object.
A new session object is initialised with a reference to the obfuscator object. References to all TCP connections with finished handshake are added to this session object.
The client will listen for connections from the proxy software client, and create a new stream each time the proxy client establishes a connection with the Cloak client. That is, streams are one-to-one mappings of proxy software connections. Each time a new stream is established, the Cloak server establishes a connection with the proxy server endpoint and pipes data from the stream through this connection to the proxy server. There is no explicit message for notifying the Cloak server of the opening of new streams. A new stream is implicitly opened once the server receives a frame with an unseen stream id.
A frame is the unit of data transfer over the wire. Frames are in the form of TLS `ApplicationData`.
This is the binary structure of a frame:
Stream ID | Sequence | Closing Flag | Overhead Length | Payload | Overhead |
---|---|---|---|---|---|
4 bytes | 8 bytes | 1 byte | 1 byte | Variable | Variable |
Stream ID, sequence, closing flag and overhead length together form the stream header (also referred to as the frame header further below).
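The four fixed-size fields add up to a 14-byte header, which can be sketched with `struct`. The big-endian byte order and the function names are assumptions made for illustration.

```python
import struct

# I: stream id (4 bytes) | Q: sequence (8 bytes)
# B: closing flag (1 byte) | B: overhead length (1 byte)
HEADER_FORMAT = ">IQBB"  # assumed big-endian, no alignment padding
HEADER_LEN = struct.calcsize(HEADER_FORMAT)


def pack_frame_header(stream_id, sequence, closing, overhead_len):
    """Serialise the header fields into their 14-byte wire form."""
    return struct.pack(HEADER_FORMAT, stream_id, sequence, closing,
                       overhead_len)


def unpack_frame_header(header):
    """Return (stream_id, sequence, closing, overhead_len)."""
    return struct.unpack(HEADER_FORMAT, header)
```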
First, the overhead length is determined. This depends on the encryption method used. If it is `aes-gcm` or `chacha20-poly1305`, the overhead length is 16 bytes. If the encryption method is `plain` and the payload is shorter than 8 bytes, the overhead length is 8 minus the length of the payload; otherwise, if the payload is 8 bytes or longer, the overhead length is 0. This guarantees that payload and overhead together are always at least 8 bytes long.
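The rule can be written as a small function. The string method names mirror the configuration values; the function name itself is hypothetical.

```python
def overhead_length(encryption_method, payload_len):
    """Overhead length in bytes for a frame with the given payload size."""
    if encryption_method in ("aes-gcm", "chacha20-poly1305"):
        return 16  # size of the AEAD authentication tag
    if encryption_method == "plain":
        # Pad short payloads so payload + overhead is at least 8 bytes.
        return max(0, 8 - payload_len)
    raise ValueError("unknown encryption method: " + encryption_method)
```

For instance, a 3-byte payload under `plain` gets 5 bytes of overhead, while a 100-byte payload gets none.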
If the encryption method isn't `plain`, the payload is encrypted and a 16-byte authentication tag is generated with the specified encryption method, using the previously agreed session key as key and the stream header as nonce. The authentication tag is appended to the encrypted payload to form the encrypted frame data. Otherwise, if the encryption method is `plain`, random bytes of overhead length are generated and appended to the payload to form the encrypted frame data.
The frame header is then encrypted using Salsa20, with the previously agreed session key as key and the last 8 bytes of encrypted frame data as nonce.
The encrypted frame header is then concatenated with the encrypted frame data. An appropriate TLS `ApplicationData` record layer is then composed. The record layer, followed by the encrypted parts, is then sent off over the wire to the other side of Cloak.