Tagging and Signalling

Sergey Frolov edited this page May 21, 2020 · 1 revision

Moved from the old wiki, last edit was on Dec 20, 2018.


The current TapDance signalling layer is unidirectional: the server can send signals to the client, but the client sends data to the server as is, without any headers.

The tag is located at a fixed offset of 252 bytes from the end, so the Host and URI can now be used freely, and the request is (still) paddable and extensible.

Requests may be Incomplete (no \r\n\r\n at the end) or Complete.


TapDance initial request

GET /anything HTTP/1.1
Host: anyhost.tld
Header1: foo
X-Proto: <IV for Protobuf Encryption><Encrypted Protobuf>
Header2: bar
X-T: <Padding><Base64 encoded tag>

The TapDance initial request contains the tag, which the station locates at a fixed offset from the end of the first application-layer packet. The tag begins with a representative of the client's public key, transformed with Elligator. The station uses the representative to decrypt the payload, which contains, among other things, the TLS session data needed to decrypt the whole HTTP request and the rest of the connection.

Size (bytes)  Field
variable      HTTP request Beginning
32            Representative
129           Payload

HTTP request Beginning

The request may have any URI and Host, and will generally just request the decoy's /.

The request may include any headers; the decoy will read them only if the request is Complete.

One header that the station will look for is "X-Proto", whose value is base64-style encoded. Note that the station first has to decrypt the payload, and then the TLS connection, before it can see this header. The decoded header value starts with a 12-byte IV, followed by the encrypted ClientToStation protobuf. The encryption key is the same one that was used to decrypt the payload.
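As a rough illustration of the layout just described, the sketch below splits an already-decoded X-Proto value into its IV and ciphertext. The function name split_x_proto is hypothetical, and the base64-style decoding and the decryption itself are deliberately left out.

```python
from typing import Tuple

IV_LEN = 12  # IV length stated in the text


def split_x_proto(decoded: bytes) -> Tuple[bytes, bytes]:
    """Split a decoded X-Proto value into (iv, encrypted_protobuf).

    Hypothetical helper: assumes the header value has already been
    base64-style decoded by the caller.
    """
    if len(decoded) <= IV_LEN:
        raise ValueError("X-Proto value too short to hold an IV and ciphertext")
    return decoded[:IV_LEN], decoded[IV_LEN:]


# Made-up input: 12 IV bytes followed by placeholder ciphertext.
iv, encrypted_protobuf = split_x_proto(bytes(range(12)) + b"encrypted-protobuf")
```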

Representative

The representative is the client's public key, transformed by Elligator. Elligator is a way to hide a public key (which is a point on a curve) so that its representation looks random and is thus indistinguishable to a censor.

The client uses its own private key and the station's public key to encrypt the payload. The station uses the representative to recover the client's public key, combines it with the station's private key, and decrypts the payload.

Read the paper about Elligator: https://cr.yp.to/elligator/elligator-20130527.pdf

Payload

The decrypted payload contains the following fields:

Field                 Size      Comment
Flags bitfield        1 byte    3 bits are currently assigned
Unassigned            1 byte
Ciphersuite ID        2 bytes   ciphersuite negotiated with decoy
Master key            48 bytes  used to encrypt/decrypt further app data
Server Random         32 bytes  generated by server during TLS handshake
Client Random         32 bytes  generated by client during TLS handshake
Remote Connection Id  16 bytes  used for reconnection
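
The field list can be read as a fixed-layout record. The following is a minimal parsing sketch that uses exactly the field order and sizes from the table. The names Payload and parse_payload are hypothetical, and big-endian byte order for the Ciphersuite ID is an assumption that the table does not state.

```python
import struct
from typing import NamedTuple

# Field order and sizes follow the table; ">" (big-endian) is an assumption.
# "x" skips the single unassigned byte.
PAYLOAD_FORMAT = ">BxH48s32s32s16s"
PAYLOAD_FIELDS_SIZE = struct.calcsize(PAYLOAD_FORMAT)


class Payload(NamedTuple):
    flags: int
    ciphersuite_id: int
    master_key: bytes
    server_random: bytes
    client_random: bytes
    remote_conn_id: bytes


def parse_payload(raw: bytes) -> Payload:
    """Unpack the listed fields from the start of a decrypted payload."""
    if len(raw) < PAYLOAD_FIELDS_SIZE:
        raise ValueError("decrypted payload is shorter than the listed fields")
    return Payload(*struct.unpack(PAYLOAD_FORMAT, raw[:PAYLOAD_FIELDS_SIZE]))


# Made-up example values, only to exercise the layout.
raw = (bytes([0b101, 0]) + (0xC02F).to_bytes(2, "big")
       + b"M" * 48 + b"S" * 32 + b"C" * 32 + b"R" * 16)
payload = parse_payload(raw)
```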

Initial Request Total Size

  • All fields in both tables above have their sizes listed before base64-style encoding, and will take 33% more space on the wire.
  • Additionally, the TCP packet ends with 16 bytes of AES-GMAC (from the TLS encryption of the whole request).
  • As a result, station looks for the tag at (32+129+16+3)*8/6+4+16=256 bytes offset from the end.

Reverse Encryption

For the station to be able to decrypt the tag, the client needs to find a plaintext which, after being encrypted by the client's TLS library, will contain the target tag.

To do that, the client recovers the keystream that the TLS library will later XOR with the plaintext, and XORs the desired ciphertext with that keystream.
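
The XOR trick can be checked with a few lines of code. This is only a sketch: the keystream here is random stand-in data rather than a real TLS keystream, the helper name xor_bytes is hypothetical, and any restriction on which plaintext bytes are allowed is ignored.

```python
import os


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


desired_ciphertext = b"tag-bytes-here!!"            # what should appear on the wire
keystream = os.urandom(len(desired_ciphertext))     # stand-in for the recovered keystream

# The client picks the plaintext by XORing the desired ciphertext with the keystream.
plaintext = xor_bytes(desired_ciphertext, keystream)

# The TLS library later XORs that plaintext with the same keystream.
on_wire = xor_bytes(plaintext, keystream)
```

Because XOR is its own inverse, on_wire equals desired_ciphertext for any keystream.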

base64-style encoding

We cannot choose an arbitrary plaintext: it has to stay in the ASCII range, so we use a base64-style encoding in which the first 2 bits of every byte are unusable. As a result, only 6 of every 8 bits of plaintext can be used.
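
The 6-of-8 ratio translates directly into the size overhead on the wire. The sketch below shows only that arithmetic; the function name encoded_len is hypothetical, and any extra padding or rounding that a real implementation may add is not modelled.

```python
def encoded_len(raw_len: int) -> int:
    """Bytes needed when each wire byte carries only 6 usable bits."""
    if raw_len < 0:
        raise ValueError("length must be non-negative")
    return (raw_len * 8 + 5) // 6  # ceiling of raw_len * 8 / 6


three_bytes = encoded_len(3)     # 24 bits fit exactly into 4 wire bytes
master_key = encoded_len(48)     # a 48-byte field grows by one third
```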

Payload AES-GCM Key and IV recovery

When the station recovers the client's public key (see the Representative section above) and combines it with the station's private key, it gets a 32-byte shared secret.

The first 16 bytes are the AES-GCM key for the encrypted payload. The next 12 bytes are the AES-GCM IV for it.
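
The split described above amounts to slicing the shared secret. A minimal sketch follows; the function name split_shared_secret is hypothetical, and how the last 4 bytes of the secret are used is not stated here, so they are simply ignored.

```python
from typing import Tuple

SECRET_LEN = 32  # shared secret length stated in the text
KEY_LEN = 16     # AES-GCM key: first 16 bytes
IV_LEN = 12      # AES-GCM IV: next 12 bytes


def split_shared_secret(shared_secret: bytes) -> Tuple[bytes, bytes]:
    """Return (aes_gcm_key, aes_gcm_iv) sliced from the 32-byte shared secret."""
    if len(shared_secret) != SECRET_LEN:
        raise ValueError("shared secret must be 32 bytes long")
    return shared_secret[:KEY_LEN], shared_secret[KEY_LEN:KEY_LEN + IV_LEN]


# Made-up secret, only to show which bytes end up where.
key, iv = split_shared_secret(bytes(range(32)))
```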


TODO: expected behavior of client and station when things go right, and when things go wrong.