Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent (de)serialization behavior #57

Closed
siennathesane opened this issue Aug 19, 2022 · 2 comments
Closed

Inconsistent (de)serialization behavior #57

siennathesane opened this issue Aug 19, 2022 · 2 comments

Comments

@siennathesane
Copy link

I am building an application protocol with protobufs, and I'm using vtprotobuf exclusively to marshal and unmarshal the messages. Currently, I'm experiencing strange behavior I'm not understanding that I think is related to vtprotobuf.

Here are my message definitions:

message Header {
  fixed32 Size = 1; // Size of the next message
  fixed32 Checksum = 2; // Checksum of the serialized message
}

message RaftControlPayload {
  oneof Types {
    GetLeaderIDRequest GetLeaderIdRequest = 1;
    GetLeaderIDResponse GetLeaderIdResponse = 2;
    IdRequest IdRequest = 3;
    IdResponse IdResponse = 4;
    IndexState IndexState = 5;
    ModifyNodeRequest ModifyNodeRequest = 6;
    ReadIndexRequest ReadIndexRequest = 7;
    ReadLocalNodeRequest ReadLocalNodeRequest = 8;
    RequestLeaderTransferResponse RequestLeaderTransferResponse = 9;
    RequestSnapshotRequest RequestSnapshotRequest = 10;
    SnapshotOption SnapshotOption = 12;
    StopNodeResponse StopNodeResponse = 13;
    StopRequest StopRequest = 14;
    StopResponse StopResponse = 15;
    SysOpState SysOpState = 16;
    DBError Error = 17;
  }
  enum MethodName {
      ADD_NODE = 0;
      ADD_OBSERVER = 1;
      ADD_WITNESS = 2;
      GET_ID = 3;
      GET_LEADER_ID = 4;
      READ_INDEX = 5;
      READ_LOCAL_NODE = 6;
      REQUEST_COMPACTION = 7;
      REQUEST_DELETE_NODE = 8;
      REQUEST_LEADER_TRANSFER = 9;
      REQUEST_SNAPSHOT = 10;
      STOP = 11;
      STOP_NODE = 12;
  }
  MethodName Method = 18;
}

This message serializes to 10 bytes, which I send across a network stream as a header for whatever unknown message payload is coming next. This allows me to simply pass raw protobuf messages across a network stream without having to leverage gRPC or other RPC frameworks.

Sending a message across the network stream is pretty straightforward. I prepare a message, serialize the message, create a header with all of the appropriate values, serialize the header, send the header, then send the message.

idReqPayload := &database.RaftControlPayload{
	Method: database.RaftControlPayload_GET_ID,
	Types: &database.RaftControlPayload_IdRequest{
		IdRequest: &database.IdRequest{},
	},
}
payloadBuf, _ := idReqPayload.MarshalVT()

initialHeader := &transportv1.Header{
	Size: uint32(len(payloadBuf)),
	Checksum: crc32.ChecksumIEEE(payloadBuf),
}
headerBuf, _ := initialHeader.MarshalVT()

stream.Write(headerBuf)
stream.Write(payloadBuf)

Receiving a message on the network stream is also pretty straightforward. I read the header into a buffer, deserialize it, read the next N bytes from the stream based off the Size field in the header message, and verify some checksums, then serialize the byte array into the equivalent messages.

headerBuf := make([]byte, 10)
if _, err := io.ReadFull(stream, headerBuf); err != nil {
	logger.Error().Err(err).Msg("cannot readAndHandle raft control header")
	continue
}

// marshall the header
header := &transportv1.Header{}
if err := header.UnmarshalVT(headerBuf); err != nil {
	logger.Error().Err(err).Msg("cannot unmarshal header")
	return
}

// prep the message buffer
msgBuf := make([]byte, header.Size)
if _, err := io.ReadFull(stream, msgBuf); err != nil {
	logger.Error().Err(err).Msg("cannot read message payload")
	return
}

// verify the message is intact
checked := crc32.ChecksumIEEE(msgBuf)
if checked != header.GetChecksum() {
	logger.Error().Msg("checksums do not match")
}

// unmarshal the payload
msg := &database.RaftControlPayload{}
if err := msg.UnmarshalVT(msgBuf); err != nil {
	logger.Error().Err(err).Msg("cannot unmarshal payload")
}

Here's where things start to get confusing. When I serialize idReqPayload via MarshalVT() and run a checksum against it, I'll get uint32(1298345897); when I send the header as you see here, the Size field is uint32(5) and Checksum is uint32(1298345897). When the header message gets deserialized on the receiving end of a localhost connection, it looks very different.

The header message gets deserialized with the Size field being uint32(5) and the Checksum field being uint(1). That's the first strange thing.

When I run a checksum against the next 5 bytes of the serialized idReqPayload payload which followed, it checksums to uint32(737000948) even though there was no change to the byte array from the time it was serialized to the time it was received. That's the second strange thing.

When I run an equality check against the value of the deserialised header Checksum field against a local checksum of the serialized idReqPayload payload with checked := crc32.ChecksumIEEE(msgBuf); if checked != header.GetChecksum() { // ... }, it passes an equality check - the deserialized header Checksum field's value is uint(1) whereas the calculated checksum of the received message is uint32(737000948). That's the third strange thing.

When I deserialize the serialized idReqPayload byte array, it deserializes without an error. However, the message information is incorrectly serialized. When I serialize protobuf with this configuration:

idReqPayload := &database.RaftControlPayload{
	Method: database.RaftControlPayload_GET_ID,
	Types: &database.RaftControlPayload_IdRequest{
		IdRequest: &database.IdRequest{},
	},
}

It deserializes into this equivalent:

msg := &database.RaftControlPayload{
	Method: database.RaftControlPayload_ADD_NODE,
	Types: nil,
}

The Method field is reset so the enum is defaulted to 0, and the Types field is nil.

I'm fairly positive this could partially be related to #51, but I updated my local protoc-gen-go-vtproto binary to 0ae748f and the problem still persists. I've also eliminated the network stream as it's a localhost network stream, so nothing is intercepting it or modifying it in transit.

Am I doing something wrong or is this a bug of some kind?

@vmg
Copy link
Member

vmg commented Aug 19, 2022

This does seem like strange behavior, but I'm not sure it's actually related to vtprotobuf. The first thing I would try is disabling the vt serialization and just using the original Protocol Buffers golang marshalling code to see if the weird behavior still persists -- if that's the case we can look into it further.

@siennathesane
Copy link
Author

The issue isn't related to vt, but to how protobufs are encoded. A fixed32 field will not encode to the full 4 bytes if the value is zero, causing the size of the protobuf to be 5 bytes instead of 10 bytes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants