Protocol
The protocol used by the server is built on TCP and uses length-prefixed messages.
The server processes streams of bytes over a TCP connection in the following steps:
- Parsing bytes into tokens
- Processing tokens into commands
The TCP-based protocol is deliberately simple: it encodes tokens to bytes and decodes bytes into tokens using a length-prefixed format. Each token sent by the client must be prefixed by its length, followed by the delimiter CRLF (carriage return, line feed), or \r\n. So every token should be sent to the server in the form <token_length>\r\n<token>. Each token parsed from the bytes sent by the client is appended to a deque of tokens server-side.
For example:
- If a client wanted to send the token PING to the server, it would send 4\r\nPING.
- If it wanted to send the two tokens GET followed by KEY, it would send 3\r\nGET3\r\nKEY.
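The encoding rule above can be sketched in a few lines of Python. This is only an illustration, not the project's client code: the names encode_token and encode_tokens are hypothetical, and the sketch assumes the length prefix counts UTF-8 bytes, which this page does not state explicitly.

```python
from typing import List

CRLF = b"\r\n"


def encode_token(token: str) -> bytes:
    # Assumption: the length prefix is the number of UTF-8 bytes in the token.
    payload = token.encode("utf-8")
    # Wire form: <token_length> CRLF <token>
    return str(len(payload)).encode("ascii") + CRLF + payload


def encode_tokens(tokens: List[str]) -> bytes:
    # Tokens are simply concatenated; there is no separator between them.
    return b"".join(encode_token(t) for t in tokens)
```

With this sketch, encode_token("PING") yields the bytes 4\r\nPING and encode_tokens(["GET", "KEY"]) yields 3\r\nGET3\r\nKEY, matching the two examples above.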
If the server has not yet received all the bytes announced by a length prefix, it waits for more input from the client. For example, if the client sends 4\r\nPING and the server has only received 4\r or 4\r\nPI in its buffer, it will not process those bytes yet and will wait for more input from the client.
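The waiting behavior can be illustrated with a small incremental decoder. This is a sketch under assumptions, not the server's Netty handler: drain_tokens is a hypothetical name, and validation of the length prefix is left out to keep the focus on partial input.

```python
from typing import List

CRLF = b"\r\n"


def drain_tokens(buffer: bytearray) -> List[bytes]:
    # Removes every complete token from the front of the buffer and returns
    # them. Incomplete data is left in place until more bytes arrive.
    tokens = []
    while True:
        sep = buffer.find(CRLF)
        if sep == -1:
            break  # the length prefix itself is not complete yet, e.g. b"4\r"
        length = int(bytes(buffer[:sep]))  # prefix validation omitted here
        start = sep + len(CRLF)
        end = start + length
        if len(buffer) < end:
            break  # token body is incomplete, e.g. b"4\r\nPI"
        tokens.append(bytes(buffer[start:end]))
        del buffer[:end]  # consume the bytes that formed the token
    return tokens
```

Calling drain_tokens on a buffer holding 4\r or 4\r\nPI returns an empty list and leaves the buffer untouched; once the remaining bytes arrive, the next call returns the PING token.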
Bytes are parsed into a deque of tokens, which are then processed as commands sequentially. If there aren't enough tokens to process the command at hand, the server waits for more input from the client. A token that appears where a command name is expected but does not correspond to any command is discarded.
For example, assume the client sends the SET command over the TCP connection:
3\r\nSET4\r\nname4\r\njohn1\r\n0
This would then get processed into the deque:
{SET, name, john, 0}
The server would then set the key name to john and clear those tokens.
Assume the client sends the following sequence of bytes over the TCP connection:
3\r\nSET4\r\nname4\r\njohn1\r\n05\r\nHELLO5\r\nWORLD3\r\nSET3\r\nage2\r\n211\r\n0
This would then get processed into the deque:
{SET, name, john, 0, HELLO, WORLD, SET, age, 21, 0}
The server would first set name to john and remove the associated tokens. The deque after this operation is:
{HELLO, WORLD, SET, age, 21, 0}
The server would then discard HELLO and WORLD since they don't correspond to valid command names. The deque after this is:
{SET, age, 21, 0}
The server would now set age to 21 and remove the associated tokens. The deque after this operation is:
{}
Assume the client sends the following sequence of bytes over the TCP connection:
3\r\nSET4\r\nname
This would then get processed into the deque:
{SET, name}
Upon seeing this, the server would keep waiting for more input from the client since the SET command requires more arguments to execute.
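The three scenarios above can be reproduced with a short sketch of the command loop. It is an illustration under assumptions rather than the server's actual logic: process and COMMAND_ARITY are hypothetical names, only SET is modeled, and SET is assumed to take three arguments as in the examples (the meaning of the third argument is not described in this section, so the sketch ignores it).

```python
from collections import deque

# Assumption: SET consumes three argument tokens, as in the examples above.
COMMAND_ARITY = {"SET": 3}


def process(tokens: deque, store: dict) -> None:
    while tokens:
        name = tokens[0]
        if name not in COMMAND_ARITY:
            tokens.popleft()  # not a command name: discard and move on
            continue
        arity = COMMAND_ARITY[name]
        if len(tokens) < 1 + arity:
            return  # not enough tokens yet: wait for more input
        tokens.popleft()
        args = [tokens.popleft() for _ in range(arity)]
        if name == "SET":
            key, value, _unspecified = args  # third argument is ignored here
            store[key] = value
```

Running process on the deque {SET, name, john, 0, HELLO, WORLD, SET, age, 21, 0} sets both keys and empties the deque, while {SET, name} is left untouched until more tokens arrive.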
The server prefixes all responses to the client with the type of response it is sending. There are 3 types of prefixes:
- VALUE
- ERROR
- FATAL
The VALUE prefix precedes any regular response or value sent back by the server. For example, if the server wanted to send "PONG" back to the client, it would send the encoding: 5\r\nVALUE4\r\nPONG.
The ERROR prefix precedes any recoverable error message sent back by the server. This includes errors that are easy to recover from, such as trying to create a stash that already exists. For example, if the server wanted to send "Invalid key" back to the client, it would send the encoding: 5\r\nERROR11\r\nInvalid key.
The FATAL prefix precedes any non-recoverable error message sent back by the server. This includes errors that are hard for the server to recover from. For example, if the server wanted to send "Disconnecting..." back to the client, it would send the encoding: 5\r\nFATAL16\r\nDisconnecting...
If a value is null, the server will return the value *NULL. The full encoded response would be 5\r\nVALUE5\r\n*NULL.
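The response format can be sketched as two length-prefixed tokens: the prefix, then the message. The helper below is hypothetical (encode_response is not a name from the project) and assumes, as before, that lengths count UTF-8 bytes.

```python
from typing import Optional

CRLF = b"\r\n"
PREFIXES = ("VALUE", "ERROR", "FATAL")
NULL = "*NULL"  # how the server represents a null value


def _encode_token(token: str) -> bytes:
    payload = token.encode("utf-8")  # assumption: length is a byte count
    return str(len(payload)).encode("ascii") + CRLF + payload


def encode_response(prefix: str, message: Optional[str]) -> bytes:
    if prefix not in PREFIXES:
        raise ValueError("unknown response prefix: " + prefix)
    if message is None:
        message = NULL
    # A response is the prefix token followed by the message token.
    return _encode_token(prefix) + _encode_token(message)
```

For instance, encode_response("VALUE", "PONG") produces 5\r\nVALUE4\r\nPONG, and encode_response("VALUE", None) produces 5\r\nVALUE5\r\n*NULL.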
Commands sent to the server over the TCP protocol are case-sensitive and must be in upper case. Optional argument names must also be in upper case.
A single optional argument should be specified in the form <ARG>=<value>, where <ARG> is the name of the optional argument in upper case and <value> is the value of the optional argument.
For example, GET name 1 NAME=names would be encoded as:
3\r\nGET4\r\nname1\r\n110\r\nNAME=names
If the client sends invalid length-prefixed messages to the server, the server disconnects the client. Invalid messages include:
- Sending length prefixes that are not at least 1
- Sending invalid integers
For example, if the client sends 0\r\nPING or abc\r\nPING, it will be disconnected: 0 is not a valid length in the protocol, abc is not an integer, and in both cases the server cannot unambiguously interpret the rest of the continuous stream of bytes over the TCP connection.
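The two validation rules can be sketched as a small check on the bytes that precede the CRLF delimiter. ProtocolError and parse_length_prefix are hypothetical names; in this sketch the exception stands in for the server's decision to disconnect the client.

```python
class ProtocolError(Exception):
    """Raised where the server would disconnect the client."""


def parse_length_prefix(raw: bytes) -> int:
    # bytes.isdigit() is True only for a non-empty run of ASCII digits,
    # so inputs such as b"abc", b"-1" and b"" are rejected here.
    if not raw.isdigit():
        raise ProtocolError("length prefix is not a valid integer")
    length = int(raw)
    if length < 1:
        raise ProtocolError("length prefix must be at least 1")
    return length
```

With this sketch, parse_length_prefix(b"4") returns 4, while b"0" and b"abc" both raise ProtocolError.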
This should not be a problem if communicating with the server through the CLI, since input validation is first handled on the client-side.
The max buffer size is set to 3 times the sum of the max key length (256 bytes) and the max stash length (65536 bytes), which is 197376 bytes, or about 0.2 MB (this is different from the buffer used by the Netty server to buffer incoming bytes sent over the TCP connection). If the client's buffer exceeds this limit before its commands are processed, the client will be disconnected. In practice this shouldn't happen, since the default Netty server behavior is to process 65536 bytes at a time, which means that after at most 65536 bytes, command handling will run to process commands and clean up the buffer.
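The arithmetic behind the limit can be checked with a short sketch. The constant and function names are hypothetical, and the strict "greater than" comparison is an assumption based on the wording "exceeds this limit".

```python
MAX_KEY_LENGTH = 256      # bytes
MAX_STASH_LENGTH = 65536  # bytes

# 3 * (256 + 65536) = 197376 bytes, roughly 0.2 MB
MAX_BUFFER_SIZE = 3 * (MAX_KEY_LENGTH + MAX_STASH_LENGTH)


def exceeds_limit(buffered_bytes: int) -> bool:
    # Assumption: the client is disconnected only when the buffer grows
    # strictly beyond the limit.
    return buffered_bytes > MAX_BUFFER_SIZE
```

Since a single 65536-byte read is well under the 197376-byte limit, a buffer that is cleaned up after each read should not normally reach it.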