Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A new take on protobuf optional fields... #252

Closed
gmaclennan opened this issue Sep 9, 2024 · 2 comments · Fixed by #254
Closed

A new take on protobuf optional fields... #252

gmaclennan opened this issue Sep 9, 2024 · 2 comments · Fixed by #254
Assignees
Labels

Comments

@gmaclennan
Copy link
Member

gmaclennan commented Sep 9, 2024

I was reviewing some code here today and remembered why we originally had some required fields as optional on the proto definitions.

TL;DR I think any required field that is a boolean or number type should be optional in the proto definition, but we could choose to just ignore this issue.

Why? Because all fields are optional in proto3, and a missing field is decoded as the default value (0 for a number type, false for a boolean). Because the default values of number and boolean types has a specific meaning, there is no way to distinguish when decoding between a "missing" value (e.g. the protobuf was encoded without the value) and an actual value.

e.g. given this protobuf definition:

message Test {
  bool field1 = 1;
}

If I decode an empty arraybuffer Test.decode(new Uint8Array()) then this will decode to { field1: false }.

However if I use this protobuf definition:

message Test {
  optional bool field1 = 1;
}

Then decoding an empty arraybuffer will result in { field1: undefined }

If we are treating some fields as "required", e.g. they should always be set by the client, then if they are not marked as optional, they could actually be missing, and we could unintentionally get the default value back.

This is ok for bytes and string field types, because we can check for an empty Buffer or empty string if they are actually required.

This is an issue though for the number and boolean types. Let's say we have required field for longitude. If we have a protobuf definition:

messate Test {
  optional double longitude = 1;
}

then if we decode a buffer that is missing this field, we will get back an object with longitude: 0, and would be unable to distinguish it from a message where longitude is intentionally 0.

So how do I actually get required fields?

There is no reliable way to "enforce" required fields within the decoder. The default behaviour in standard protobuf decoding to use the default value is probably not what we want for most required fields use-cases. Instead we need to have explicit actions in our own decode code to detect when they are missing (which confusingly requires us to mark them as optional on the protobuf), and then take the appropriate action, which may be to throw an error (e.g. just not decode / ignore the message), or could be to set a default value, or could be to remove a property that has no meaning without the required values.

Breaking change?

Adding optional to a number or boolean field will not break decoding (as in, messages will be decoded without error), but the decoded value will change. In the example above, if we start with the non-optional boolean field1 and we have a message that was encoded from { field1: false }, then if we change the proto definition to set that field as optional, then these existing messages will decode as { field1: undefined }.

Do we even need to worry about this?

Our types enforce fields being required at the API level, so there should not be any cases where a required field is missing unintentionally. Our encode() functions would throw an error if we actually did not pass a required field to encode(). The only case where it is an issue is if buggy code had encoded a message, or we have an old message type without the field.

So... We probably don't need to worry about this, but if we do actually want to be strict about "required" fields, then we should mark them as optional, check for them in our own decode / convert functions, and throw if they are missing.

@tomasciccola
Copy link
Contributor

Thanks gregor for documenting this. I think this was the original rational for the decision regarding optional fields. I think this back and forth spawned for the fact that we didn't document this anywhere. Sometimes is hard to know which decisions we need to thoroughly document...

@EvanHahn
Copy link
Contributor

If we were only concerned with protobufs, we couldn't make this change in a perfectly backwards-compatible way. However, we could safely make this change if we don't have a bug where we accidentally serialize something invalid...I think.

Because this involves long-term data storage, I'd feel much better if we fixed this before MVP. I don't think it's too nasty to fix, so I'll give it a look tomorrow.

EvanHahn added a commit that referenced this issue Sep 11, 2024
A required field, confusingly, should probably be marked *optional* in
the protobuf. That way, we can throw errors when decoding. See [#252]
for more details.

This change:

- Updates `.proto` files to mark all required fields with our custom
  `required` tag
- Updates `.proto` files to mark most required fields as `optional`, so
  we can throw errors when decoding
- Throw errors when decoding if a field is missing

I tested this end-to-end.

[#252]: https://github.com/digidem/mapeo-schema/issues/252
@EvanHahn EvanHahn self-assigned this Sep 12, 2024
@EvanHahn EvanHahn added the mvp label Sep 12, 2024
EvanHahn added a commit that referenced this issue Sep 16, 2024
A required field, confusingly, should probably be marked *optional* in
the protobuf. That way, we can throw errors when decoding. See [#252]
for more details.

This change updates how we handle required fields across the schema.

I tested this end-to-end, going "all the way" to comapeo-mobile.

[#252]: https://github.com/digidem/mapeo-schema/issues/252

Co-authored-by: Gregor MacLennan <gmaclennan@digital-democracy.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants