feat: versioned protobufs #32
Conversation
It looks like the tests are failing because schemasPrefix.js needs to be updated to use the …
ooops, yeah, I forgot to commit that change...
This is so that schemas that don't have schemaVersion can be generated too.
```js
]
  .encode(record)
  .finish()
const partial = ProtobufSchemas[key].fromPartial(record)
```
Using fromPartial to initialize lists to an empty list (since optional repeated doesn't exist) means initializing every field that isn't passed to its default value, which means required fields that are missing don't get caught...
I think we could catch this before it gets to the encode step. Is this only applicable to required arrays?
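To make the failure mode concrete, here's a minimal sketch — the field names and the `fromPartial` body are illustrative stand-ins for the generated code, not the real API:

```js
// A ts-proto-style fromPartial fills unset fields with their defaults
// (illustrative stand-in for the generated code):
function fromPartial(record) {
  return { lat: record.lat ?? 0, lon: record.lon ?? 0, tags: record.tags ?? [] }
}

// So a record missing a required field gets a default instead of an error:
const record = { lat: 12.5 } // suppose `lon` is required by the JSON schema
console.log(fromPartial(record)) // { lat: 12.5, lon: 0, tags: [] } — no error

// One possible guard before the encode step, as suggested above:
const required = ['lat', 'lon']
const missing = required.filter((f) => !(f in record))
if (missing.length > 0) {
  throw new Error(`missing required fields: ${missing.join(', ')}`)
}
```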
* added optional fields to filter and preset protobufs
* added field protobuf and schema
* type is optional on Preset_1, so it needs to be made optional on the base type (and deleted before validation)
* schemaVersion is optional on Field_1, so it needs to be made optional on the base type (and deleted before validation)
* this means utils.js/formatSchema{Key,Type} now take optional values (see the sketch below)
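A hedged sketch of what that optional-version handling might look like — the real formatSchemaKey in utils.js may differ in signature and output:

```js
const capitalize = (s) => s[0].toUpperCase() + s.slice(1)

// Sketch only: build the generated key, tolerating a missing schemaVersion
function formatSchemaKey(type, schemaVersion) {
  return schemaVersion === undefined
    ? capitalize(type)
    : `${capitalize(type)}_${schemaVersion}`
}

console.log(formatSchemaKey('observation', 5)) // 'Observation_5'
console.log(formatSchemaKey('field')) // 'Field'
```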
```json
}
  }]
},
"type": {
```
In Field_1, type doesn't mean "which type of document am I?" but an enum of possible fields :|. This means Field_1 can't be validated yet (thinking about ways to avoid this issue...)
Let's make an issue for this and come back to it. In the short term I think we might not need to validate Field, but likely will eventually.

The only solution to this I can think of so far is renaming our `type` field for validation to something like `schemaType`, so that it matches `schemaVersion` and is less likely to conflict with `type` used as a property name.
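A minimal sketch of that rename, assuming a simple pre-validation mapping step (the property names are illustrative):

```js
// Move the document-type field to `schemaType` before validation so it can't
// collide with a schema's own `type` property, mirroring `schemaVersion`:
function prepareForValidation(doc) {
  const { type, ...rest } = doc
  return { ...rest, schemaType: type }
}

console.log(prepareForValidation({ type: 'observation', lat: 1, lon: 2 }))
// { lat: 1, lon: 2, schemaType: 'observation' }
```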
Yeah, I think we can make that case for new schemas, but for old schemas the `type` still mostly refers to `schemaType`, so we will still need to handle that manually.
```js
Observation: '71f85084cb94',
const schemasPrefix = {
  Observation: { dataTypeId: '71f85084cb94', schemaVersions: [5] },
  observation: { dataTypeId: '70f85084cb94', schemaVersions: [4] },
```
Casing on a doc type shouldn't be handled specifically by the code. Nevertheless, some schemas enforce a specific value for type (type must be e.g. 'observation', which is different from 'Observation'). This means that type 'observation' and 'Observation' should have different dataTypeIds. That way we know what `type` to fill in when returning the decoded object to be validated...
If we want the same `dataTypeId` for 'observation' and 'Observation' (since we know we're actually talking about roughly the same type of 'document'), we could format the schemasPrefix object differently, like `Observation: {dataTypeId: '924892', versions: {'observation': 4, 'Observation': 5}}`, or something less worse...
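A sketch of how that alternative shape could be looked up when decoding — the `dataTypeId` is the placeholder from the comment above, and `typeForVersion` is a hypothetical helper:

```js
// One entry per logical document type, mapping each historical `type`
// string to its schema version:
const schemasPrefix = {
  Observation: {
    dataTypeId: '924892',
    versions: { observation: 4, Observation: 5 },
  },
}

// Recover which `type` string to fill in from the decoded schemaVersion:
function typeForVersion(entry, schemaVersion) {
  const match = Object.entries(entry.versions).find(
    ([, version]) => version === schemaVersion
  )
  return match && match[0]
}

console.log(typeForVersion(schemasPrefix.Observation, 4)) // 'observation'
console.log(typeForVersion(schemasPrefix.Observation, 5)) // 'Observation'
```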
@sethvincent @gmaclennan
Ah, yeah that's kinda tricky to have to check for both.
I think ideally it would stay lowercase but maybe that's awkward for code generation? I'm not quite familiar enough with the code generation to know if it could be a matter of capitalizing the first letter somewhere in the script.
I mean, from the point of view of code generation, that would mean changing the schemas and how they are compiled, basically setting all 'type' fields to a capitalized or lowercase version. There are some schemas that don't have a type field, or where the type field is used for something completely different, and we would need to handle those manually.
I don't think that would be a good approach.
In terms of complexity, the current way is simpler IMO.
I don't think there's a non-weird approach though...
I guess I'm not sure why `observation` changed to `Observation`. If that's a needed change we should make sure other schemas follow the same pattern: `coreOwnership` would be `CoreOwnership`, `filter` to `Filter`, for example.

What if we had something like this and handled the little o to big O within this module? (For example, transform lowercase o to capital and match based on the schema version.)

`Observation: { dataTypeId: '71f85084cb94', schemaVersions: [4, 5] },`
I think keeping the same dataTypeId for the two schemas makes sense (in part because v5 is just an improvement on v4).
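A sketch of what handling the casing inside this module could look like, assuming (as suggested above) that v4 and earlier used the lowercase type string — the helpers here are hypothetical:

```js
const schemasPrefix = {
  Observation: { dataTypeId: '71f85084cb94', schemaVersions: [4, 5] },
}

const capitalizeFirst = (s) => s[0].toUpperCase() + s.slice(1)
const lowerFirst = (s) => s[0].toLowerCase() + s.slice(1)

// Find a prefix entry from a decoded record's type + schemaVersion:
function lookup(type, schemaVersion) {
  const entry = schemasPrefix[capitalizeFirst(type)]
  if (entry && entry.schemaVersions.includes(schemaVersion)) return entry
}

// The `type` string to write back then depends on the version
// (assumed cutoff: v4 and earlier were lowercase):
function typeForVersion(key, schemaVersion) {
  return schemaVersion <= 4 ? lowerFirst(key) : key
}

console.log(lookup('observation', 4)) // the Observation entry
console.log(typeForVersion('Observation', 4)) // 'observation'
```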
This is looking good! There's a set of potential problems around type naming and exporting that I've been thinking about today. One issue I'm seeing is in cases where someone imports both the protobuf type and the jsonschema type in one file: there might be naming conflicts. It might be ideal to have a naming convention like:

Kinda related: it looks like the protobuf comments export names like …

I have mixed feelings on whether the public types should include the version in the name. The main place that might matter is in migration code, where it's necessary to be explicit about versions and use multiple versions in the same code. Ideally we could import the latest version as just the name:

```js
import { ObservationJson, ObservationProtobuf } from 'mapeo-schema/types'
```

As well as being more specific about the version needed:

```js
import { ObservationJsonv4, ObservationJson5 } from 'mapeo-schema/types'
```

A type without a version would be the latest version. I can imagine there's a good argument for always including the version number and not having the convenience of importing the latest version by just the schema name.

Those examples also simplify importing the types from … This would be nice especially for our jsdoc typing, where long paths get messy looking:

versus:
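The examples didn't survive the export; a hypothetical illustration of the comparison — the deep path and export names are assumptions, not the actual module layout:

```js
// Long form, importing the versioned type by its full (assumed) path:
/** @type {import('mapeo-schema/types/proto/observation/v5').Observation_5} */
let viaLongPath

// versus the shorter form from a single types entry point:
/** @type {import('mapeo-schema/types').ObservationProtobuf} */
let viaShortPath
```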
Protobufs are all part of the same package (mapeo), and it doesn't allow two Message types with the same name; that's why the version is encoded in the type name. But we could do something with code generation where we rename the export (maybe with a union of every version, defaulting to the last one?).
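A hedged sketch of what that generated re-export could look like with jsdoc typedefs — the file layout and names are assumed:

```js
// Hypothetical generated index: re-export each versioned protobuf type and
// alias the bare name to a union of versions, or to the latest one.
/**
 * @typedef {import('./proto/observation/v4').Observation_4} ObservationProtobuf_4
 * @typedef {import('./proto/observation/v5').Observation_5} ObservationProtobuf_5
 * @typedef {ObservationProtobuf_4 | ObservationProtobuf_5} ObservationProtobufAny
 */
// The bare name defaulting to the latest version:
/** @typedef {ObservationProtobuf_5} ObservationProtobuf */
```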
@sethvincent do you want me to merge this already? I'll create additional issues to handle pending things as per your comments.
@tomasciccola sounds good!