diff --git a/01-messaging.md b/01-messaging.md
index 305d67fec..04cb373da 100644
--- a/01-messaging.md
+++ b/01-messaging.md
@@ -140,7 +140,8 @@ according to the message-specific format determined by `type`.
 ### Requirements
 
 The sending node:
- - MUST order `tlv_record`s in a `tlv_stream` by monotonically-increasing `type`.
+ - MUST order `tlv_record`s in a `tlv_stream` by strictly-increasing `type`,
+   i.e. MUST NOT produce more than a single TLV record with the same `type`.
  - MUST minimally encode `type` and `length`.
  - When defining custom record `type` identifiers:
    - SHOULD pick random `type` identifiers to avoid collision with other
@@ -156,7 +157,8 @@ The receiving node:
    - MUST stop parsing the `tlv_stream`.
  - if a `type` or `length` is not minimally encoded:
    - MUST fail to parse the `tlv_stream`.
- - if decoded `type`s are not monotonically-increasing:
+ - if decoded `type`s are not strictly-increasing (including when two or more
+   occurrences of the same `type` are found):
    - MUST fail to parse the `tlv_stream`.
  - if `length` exceeds the number of bytes remaining in the message:
    - MUST fail to parse the `tlv_stream`.
@@ -180,8 +182,8 @@ encoded element. Without TLV, even if a node does not wish to use a particular
 field, the node is forced to add parsing logic for that field in order to
 determine the offset of any fields that follow.
 
-The monotonicity constraint ensures that all `type`s are unique and can appear
-at most once. Fields that map to complex objects, e.g. vectors, maps, or
+The strict monotonicity constraint ensures that all `type`s are unique and can
+appear at most once. Fields that map to complex objects, e.g. vectors, maps, or
 structs, should do so by defining the encoding such that the object is
 serialized within a single `tlv_record`. The uniqueness constraint, among other
 things, enables the following optimizations:
@@ -192,6 +194,21 @@ things, enables the following optimizations:
 - variable-size fields can reserve their expected size up front, rather than
   appending elements sequentially and incurring double-and-copy overhead.
 
+The main reason for requiring that each TLV record `type` appear at most once
+relates back to the fact that if a decoder doesn't recognize a field, it can't
+be certain whether only one is allowed or whether multiple are permitted.
+Asserting this would require some knowledge of what the field encodes, or a
+scheme from which it can be inferred from the `type` alone, e.g. (a contrived
+example) only prime `type`s can have multiple records. The first defeats the
+purpose of the proposal, and the latter is cumbersome in practice and would
+further fragment the type space beyond the existing even/odd rule.
+
+A minor reason for strictly-unique types is that permitting multiple records of
+the same type is less efficient on the wire, due to the overhead of redundant
+type-length pairs. It is also less efficient when allocating memory for the
+decoded objects, e.g. one can't allocate the exact size of a vector up front
+and will incur more copying overhead when appending each item individually.
+
 The use of a varint for `type` and `length` permits a space savings for small
 `type`s or short `value`s. This potentially leaves more space for application
 data over the wire or in an onion payload.
@@ -475,7 +492,7 @@ decoded with BigSize should be checked to ensure they are minimally encoded.
 The following is an example of how to execute the BigSize decoding tests.
 ```golang
 func testReadVarInt(t *testing.T, test varIntTest) {
-	var buf [8]byte
+	var buf [8]byte
 	r := bytes.NewReader(test.Bytes)
 	val, err := tlv.ReadVarInt(r, &buf)
 	if err != nil && err.Error() != test.ExpErr {
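
For illustration only (this is not part of the patch above), the following is a
minimal Go sketch of a `tlv_stream` decoder that enforces the
strictly-increasing `type` rule described in the requirements. The helper names
`readBigSize` and `parseTLVStream` are hypothetical, and the BigSize handling
is simplified relative to a production decoder; the single `typ <= lastType`
comparison is what rejects both out-of-order and duplicate types.

```golang
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// readBigSize decodes one BigSize integer from the front of buf and returns
// the value plus the number of bytes consumed, enforcing the minimal-encoding
// ("canonical") rule from BOLT #1.
func readBigSize(buf []byte) (uint64, int, error) {
	if len(buf) == 0 {
		return 0, 0, errors.New("unexpected EOF")
	}
	switch d := buf[0]; {
	case d < 0xfd:
		return uint64(d), 1, nil
	case d == 0xfd:
		if len(buf) < 3 {
			return 0, 0, errors.New("unexpected EOF")
		}
		if v := uint64(binary.BigEndian.Uint16(buf[1:3])); v >= 0xfd {
			return v, 3, nil
		}
		return 0, 0, errors.New("decoded bigsize is not canonical")
	case d == 0xfe:
		if len(buf) < 5 {
			return 0, 0, errors.New("unexpected EOF")
		}
		if v := uint64(binary.BigEndian.Uint32(buf[1:5])); v >= 0x10000 {
			return v, 5, nil
		}
		return 0, 0, errors.New("decoded bigsize is not canonical")
	default: // 0xff discriminant
		if len(buf) < 9 {
			return 0, 0, errors.New("unexpected EOF")
		}
		if v := binary.BigEndian.Uint64(buf[1:9]); v >= 0x100000000 {
			return v, 9, nil
		}
		return 0, 0, errors.New("decoded bigsize is not canonical")
	}
}

// parseTLVStream walks a tlv_stream and fails if the decoded types are not
// strictly increasing (which also rejects duplicate types) or if a record's
// length overruns the remaining bytes.
func parseTLVStream(stream []byte) (map[uint64][]byte, error) {
	records := make(map[uint64][]byte)
	var lastType uint64
	first := true
	for off := 0; off < len(stream); {
		typ, n, err := readBigSize(stream[off:])
		if err != nil {
			return nil, err
		}
		off += n

		// Strictly-increasing check: an equal type is a duplicate and is
		// rejected along with any out-of-order type.
		if !first && typ <= lastType {
			return nil, fmt.Errorf("type %d is not strictly increasing", typ)
		}
		lastType, first = typ, false

		length, n, err := readBigSize(stream[off:])
		if err != nil {
			return nil, err
		}
		off += n

		if length > uint64(len(stream)-off) {
			return nil, fmt.Errorf("length %d exceeds remaining bytes", length)
		}
		records[typ] = stream[off : off+int(length)]
		off += int(length)
	}
	return records, nil
}

func main() {
	// Two records with the same type (1) -> must fail to parse.
	dup := []byte{0x01, 0x01, 0xaa, 0x01, 0x01, 0xbb}
	if _, err := parseTLVStream(dup); err != nil {
		fmt.Println("rejected:", err)
	}
}
```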