Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use more specific data type when some pgv-rules are present #38

Closed
gontard opened this issue Nov 13, 2020 · 39 comments
Closed

Use more specific data type when some pgv-rules are present #38

gontard opened this issue Nov 13, 2020 · 39 comments

Comments

@gontard
Copy link
Contributor

gontard commented Nov 13, 2020

We are using scalapb, scalapb-validate and cats, we facilitate the manipulation of the generated classes, we would like to take advantage of both

For example in this case, we would like to expose a the items as a cats.NonEmptyList

  repeated string items  = 1 [(validate.rules).repeated.min_items = 1];

and here a cats.NonEmptySet

  repeated string items  = 1 [(validate.rules).repeated = { min_items: 1, unique: true }];

Since the validation is made on an instantiated object, the use of scalapb customization (scalapb.field).collection_type seems unadapted.
May be we could generate somehow a method that return the appropriate specific data type:

case class Message(items: Seq[String]) {
    def itemsNel: NonEmptyList[String] = NonEmptyList.fromListUnsafe(items.toList)
}

Is there a way to do that?

@thesamet
Copy link
Contributor

thesamet commented Nov 13, 2020

It looks like we're looking at two options:

  1. Have the constructor type automatically change to NEL whenever min_items is 1 and fail parsing if the rule is violated.
  2. Automatically generate xNel methods for each repeated field x that has min_items set to 1.

(2) is definitely easier - since it doesn't interfere with the parsing logic. Do both options work equally well for you?

@gontard
Copy link
Contributor Author

gontard commented Nov 16, 2020

@thesamet Thanks for summing that up.
This idea is quite new and i don't have yet enough perspective but right now the first solution is my favourite as it enforces these constraints at build time.

Since i opened the issue, i have encountered the same pattern not only for cats.NonEmptyList or cats.NonEmptySet but also for some my own datatype (for example Email).

@gontard
Copy link
Contributor Author

gontard commented Nov 17, 2020

@thesamet what would be nice is to somehow be able to map a set of PGV rules to a scala type:

def typeForRule(rules: io.envoyproxy.pgv.validate.FieldRules): Any {
    rules match {
        ....  => NonEmptySet
        ....  => Email
    }
}

@thesamet
Copy link
Contributor

This proposal makes sense, and I think this can make a nice addition to this library. Currently, there is no way for one plugin to change the types emitted by ScalaPB. We would have to come up with a new capability in core ScalaPB that would enable this. This appears to be high effort, and only applicable to a specific user group (scalapb-validate && Cats). It would be great if this work can be sponsored by one of the companies who use it, or someone putting together the necessary design and PRs for discussion.

@bjaglin
Copy link
Contributor

bjaglin commented Dec 16, 2020

For the record, I am happy to announce that Teads will be sponsoring this work!

@bjaglin
Copy link
Contributor

bjaglin commented Dec 16, 2020

Most of the PGV rules can "manually" be encoded/enforced by types by annotating fields with a scalapb option (provided that scalapb-validate is disabled, as it would fail on these custom types as @gontard noted above).

  1. Collections - (validate.rules).repeated & (validate.rules).map
    • (scalapb.field).collection_type
    • (scalapb.field).map_type
  2. Scalar - (validate.rules).<scalar>, (validate.rules).enum, (validate.rules).map.keys & (validate.rules).map.values
    • (scalapb.field).type
    • (scalapb.field).key_type
    • (scalapb.field). value_type
  3. Required - (validate.rules).message.required
    • ? (no way to skip the Option boxing)
  4. One of - (validate.required)
    • ?

This issue was originally targeted at (1), but (2) was introduced in the discussion.

In our own use-case:

  • (1) is the most interesting
  • (2) could be useful
  • (3) is currently irrelevant as we use proto2 (but having that feature would make it possible for us to rather easily switch to proto3 as the ability to declare fields required and avoid option unwrapping ceremony is what is holding us back)
  • (4) could be nice but seems quite complicated to implement, so probably out of scope

@bjaglin
Copy link
Contributor

bjaglin commented Dec 16, 2020

scalapb-validate currently produces separate objects with a validate() method that the client must call explicitly. If a new scalapb-validate-cats module is provided with NEL/NES presets (for example), ScalaPB stubs themselves would be impacted, raising exceptions at object creation/unmarshalling time without any client action.

However, the types might not capture all PGV constraints, so scalapb-validate validator objects should still be supported with these custom stubs I suppose. Does it make sense in that case to automatically call the validators somehow to avoid partial validation?

@thesamet
Copy link
Contributor

thesamet commented Dec 16, 2020

However, the types might not capture all PGV constraints, so scalapb-validate validator objects should still be supported with these custom stubs I suppose. Does it make sense in that case to automatically call the validators somehow to avoid partial validation?

If I understand correctly the preference is to have the case class constructor call pgv-validate so it is impossible to instantiate directly or through parsing partially invalid objects? This can be done as part of this change. This would be pretty substantial change for other users of scalapb-validate since they'll lose control of when the validation is done.

@bjaglin
Copy link
Contributor

bjaglin commented Dec 16, 2020

If I understand correctly the preference is to have the case class constructor call pgv-validate so it is impossible to instantiate directly or through parsing partially invalid objects?

That's correct. As a matter of fact, in our akka-grpc context that's what we do to enforce the rules on incoming request payloads, with an internal (fragile) subclass of scalapb.compiler.ProtobufGenerator adding a validator field and mixing-in

trait ScalaPBValidator[ScalaPB] {
  self: ScalaPB =>

  def validator: Validator[ScalaPB]

  validator.validate(self) match {
    case Success => ()
    case Failure(violations) =>
      val description = violations.map(_.getMessage).mkString("\n")
      throw new GrpcServiceException(INVALID_ARGUMENT.withDescription(description))
  }
}

I guess it will depend on the implementation, but what we could loose if we encode a repeated field with NEL would be a clear exception with the violated validations (if the NEL constructor is called first). That's why I brought up the subject of partial validation, but I haven't given it much thoughts yet.

@thesamet
Copy link
Contributor

Do we always want to have Set/NonEmptySet when unique: true is given? When converting to Set we lose the original order.

@bjaglin
Copy link
Contributor

bjaglin commented Dec 23, 2020

Do we always want to have Set/NonEmptySet when unique: true is given? When converting to Set we lose the original order.

Good point. At the moment we don't have a case where we need "ordered list without duplicates" semantics, but that's what it really is.

Which brings the question of opting-in for Set usage i guess. This might be out of scope for this, and I don't know how scalapb-validate-cats could declare options (under it's own registered extension?) f513bce, but controlling the behavior via an ordered boolean option would be useful. It doesn't change validation so it doesn't belong in PGV, but it does document the contract for non-scalapb users (something important for us in our gRPC use case), so it's not just a scalapb-specific implementation detail, but also a hint.

@gontard
Copy link
Contributor Author

gontard commented Dec 23, 2020

Do we always want to have Set/NonEmptySet when unique: true is given? When converting to Set we lose the original order.

May be we could also rely on a LinkedHashSet to preserve the order.

thesamet added a commit to scalapb/ScalaPB that referenced this issue Dec 24, 2020
MessageBuilder is a mutable class providing the ability to merge a
message from a CodedInputStream into the mutable state, and build an
instance of a message.

MessageCompanion.newBuilder returns a new Builder initialized with empty
values.

MessageCompanion.newBuilder(msg) returns a new Builder initialized
with values from a given instance of a message.

This new functionality allows us to create builders with partially
initialized objects that can not be represented by the actual message.

See #1013 and scalapb/scalapb-validate#38
thesamet added a commit to scalapb/ScalaPB that referenced this issue Dec 24, 2020
In order to support Collections that are quite different than the ones
in the standard library, we add a new `collection` option for repeated
fields. The collection can reference an `Adapter` type available at
runtime that will forward calls for an underlying collection.

This helps support types such as cats.data.NonEmptyList, NonEmptySet
and NonEmptyMap. For example, those collections don't define `foreach`,
and `++` where the rhs is an iterable. For `NonEmptySet` there is even
no `size` method.

See #1013 and scalapb/scalapb-validate#38
thesamet added a commit to scalapb/ScalaPB that referenced this issue Dec 24, 2020
MessageBuilder is a mutable class providing the ability to merge a
message from a CodedInputStream into the mutable state, and build an
instance of a message.

MessageCompanion.newBuilder returns a new Builder initialized with empty
values.

MessageCompanion.newBuilder(msg) returns a new Builder initialized
with values from a given instance of a message.

This new functionality allows us to create builders with partially
initialized objects that can not be represented by the actual message.

See #1013 and scalapb/scalapb-validate#38
thesamet added a commit to scalapb/ScalaPB that referenced this issue Dec 24, 2020
In order to support Collections that are quite different than the ones
in the standard library, we add a new `collection` option for repeated
fields. The collection can reference an `Adapter` type available at
runtime that will forward calls for an underlying collection.

This helps support types such as cats.data.NonEmptyList, NonEmptySet
and NonEmptyMap. For example, those collections don't define `foreach`,
and `++` where the rhs is an iterable. For `NonEmptySet` there is even
no `size` method.

See #1013 and scalapb/scalapb-validate#38
thesamet added a commit to scalapb/ScalaPB that referenced this issue Dec 24, 2020
MessageBuilder is a mutable class providing the ability to merge a
message from a CodedInputStream into the mutable state, and build an
instance of a message.

MessageCompanion.newBuilder returns a new Builder initialized with empty
values.

MessageCompanion.newBuilder(msg) returns a new Builder initialized
with values from a given instance of a message.

This new functionality allows us to create builders with partially
initialized objects that can not be represented by the actual message.

See #1013 and scalapb/scalapb-validate#38
thesamet added a commit to scalapb/ScalaPB that referenced this issue Dec 24, 2020
In order to support Collections that are quite different than the ones
in the standard library, we add a new `collection` option for repeated
fields. The collection can reference an `Adapter` type available at
runtime that will forward calls for an underlying collection.

This helps support types such as cats.data.NonEmptyList, NonEmptySet
and NonEmptyMap. For example, those collections don't define `foreach`,
and `++` where the rhs is an iterable. For `NonEmptySet` there is even
no `size` method.

See #1013 and scalapb/scalapb-validate#38
thesamet added a commit to scalapb/ScalaPB that referenced this issue Dec 27, 2020
MessageBuilder is a mutable class providing the ability to merge a
message from a CodedInputStream into the mutable state, and build an
instance of a message.

MessageCompanion.newBuilder returns a new Builder initialized with empty
values.

MessageCompanion.newBuilder(msg) returns a new Builder initialized
with values from a given instance of a message.

This new functionality allows us to create builders with partially
initialized objects that can not be represented by the actual message.

See #1013 and scalapb/scalapb-validate#38
thesamet added a commit to scalapb/ScalaPB that referenced this issue Dec 27, 2020
In order to support Collections that are quite different than the ones
in the standard library, we add a new `collection` option for repeated
fields. The collection can reference an `Adapter` type available at
runtime that will forward calls for an underlying collection.

This helps support types such as cats.data.NonEmptyList, NonEmptySet
and NonEmptyMap. For example, those collections don't define `foreach`,
and `++` where the rhs is an iterable. For `NonEmptySet` there is even
no `size` method.

See #1013 and scalapb/scalapb-validate#38
thesamet added a commit to scalapb/ScalaPB that referenced this issue Dec 27, 2020
MessageBuilder is a mutable class providing the ability to merge a
message from a CodedInputStream into the mutable state, and build an
instance of a message.

MessageCompanion.newBuilder returns a new Builder initialized with empty
values.

MessageCompanion.newBuilder(msg) returns a new Builder initialized
with values from a given instance of a message.

This new functionality allows us to create builders with partially
initialized objects that can not be represented by the actual message.

See #1013 and scalapb/scalapb-validate#38
thesamet added a commit to scalapb/ScalaPB that referenced this issue Dec 27, 2020
In order to support Collections that are quite different than the ones
in the standard library, we add a new `collection` option for repeated
fields. The collection can reference an `Adapter` type available at
runtime that will forward calls for an underlying collection.

This helps support types such as cats.data.NonEmptyList, NonEmptySet
and NonEmptyMap. For example, those collections don't define `foreach`,
and `++` where the rhs is an iterable. For `NonEmptySet` there is even
no `size` method.

See #1013 and scalapb/scalapb-validate#38
thesamet added a commit to scalapb/ScalaPB that referenced this issue Dec 27, 2020
MessageBuilder is a mutable class providing the ability to merge a
message from a CodedInputStream into the mutable state, and build an
instance of a message.

MessageCompanion.newBuilder returns a new Builder initialized with empty
values.

MessageCompanion.newBuilder(msg) returns a new Builder initialized
with values from a given instance of a message.

This new functionality allows us to create builders with partially
initialized objects that can not be represented by the actual message.

See #1013 and scalapb/scalapb-validate#38
@bjaglin
Copy link
Contributor

bjaglin commented Jan 17, 2021

I believe it's not a requirement as it could operate on unknown ids?

Unknown fields don't have sufficient information about the structure for some of the things we do (for example, names in path references would get lost).

My bad, indeed I realized after writing this that we cannot operate on unknown ids if we want to be able capture arbitrary option values with a human-readable path syntax $().

Since scalapb/validate.proto imports validate/validate.proto, then this wouldn't have happened if the search was recursive

I just tried your recursive fix (with a local build of scalapb-validate with Scala PB -preview10), and unfortunately it's not enough as the same problem happens for regular (non-package.proto) files which include neither scalapb/validate.proto nor validate/validate.proto. I believe updating https://github.com/scalapb/ScalaPB/blob/76642ff0306c5c6ecfe944ba749c366846b8f2a8/compiler-plugin/src/main/scala/scalapb/compiler/FileOptionsCache.scala#L140 to accumulate extensions as we merge field transformations should be enough in that particular case.

@thesamet
Copy link
Contributor

In preview11, unique_to_set and cats_transforms have been moved to FileValidationOptions, which is an extension of ScalaPB validation option. Previously, they were in their own file-level options. The preprocessor injects the field transformations when encountering those settings. The field transformations can then be propagated to the rest the package when scope: PACKAGE is set. Since extensions in field transformations are resolved before propagation to other files, the situation described in the previous comment is no longer possible. Before preview11, field transformations were injected directly to each file under a package that had (scalapb.validate.package).cats_transforms which led to failure in extension resolution in files that didn't have the import.

@bjaglin
Copy link
Contributor

bjaglin commented Jan 19, 2021

Happy to report that with a local build of ScalaPB/scalapb-validate (preview11 + my last fix), we are utilizing all the features made available for this in our main project:

  • built-in Cats transforms
  • static, EXACT, adhoc transform based on PGV options
  • dynamic, PRESENCE, adhoc transform based on custom options extracting values with path expansion
  • package-level transformations opt-out (using package hierarchy)
  • file-level transformations opt-out

From our side, it looks like the preview branch is ready to be merged!

thesamet added a commit to scalapb/ScalaPB that referenced this issue Jan 19, 2021
MessageBuilder is a mutable class providing the ability to merge a
message from a CodedInputStream into the mutable state, and build an
instance of a message.

MessageCompanion.newBuilder returns a new Builder initialized with empty
values.

MessageCompanion.newBuilder(msg) returns a new Builder initialized
with values from a given instance of a message.

This new functionality allows us to create builders with partially
initialized objects that can not be represented by the actual message.

See #1013 and scalapb/scalapb-validate#38
thesamet added a commit to scalapb/ScalaPB that referenced this issue Jan 19, 2021
In order to support Collections that are quite different than the ones
in the standard library, we add a new `collection` option for repeated
fields. The collection can reference an `Adapter` type available at
runtime that will forward calls for an underlying collection.

This helps support types such as cats.data.NonEmptyList, NonEmptySet
and NonEmptyMap. For example, those collections don't define `foreach`,
and `++` where the rhs is an iterable. For `NonEmptySet` there is even
no `size` method.

See #1013 and scalapb/scalapb-validate#38
thesamet added a commit to scalapb/ScalaPB that referenced this issue Jan 19, 2021
This changes introduces the ability to write "-name" as a preprocessor
name to have it excluded at the file-level in the case where it is
inherited from the package level.

For scalapb/scalapb-validate#38
thesamet added a commit to scalapb/ScalaPB that referenced this issue Jan 20, 2021
thesamet added a commit to scalapb/ScalaPB that referenced this issue Jan 20, 2021
MessageBuilder is a mutable class providing the ability to merge a
message from a CodedInputStream into the mutable state, and build an
instance of a message.

MessageCompanion.newBuilder returns a new Builder initialized with empty
values.

MessageCompanion.newBuilder(msg) returns a new Builder initialized
with values from a given instance of a message.

This new functionality allows us to create builders with partially
initialized objects that can not be represented by the actual message.

See #1013 and scalapb/scalapb-validate#38
thesamet added a commit to scalapb/ScalaPB that referenced this issue Jan 20, 2021
In order to support Collections that are quite different than the ones
in the standard library, we add a new `collection` option for repeated
fields. The collection can reference an `Adapter` type available at
runtime that will forward calls for an underlying collection.

This helps support types such as cats.data.NonEmptyList, NonEmptySet
and NonEmptyMap. For example, those collections don't define `foreach`,
and `++` where the rhs is an iterable. For `NonEmptySet` there is even
no `size` method.

See #1013 and scalapb/scalapb-validate#38
thesamet added a commit to scalapb/ScalaPB that referenced this issue Jan 20, 2021
This changes introduces the ability to write "-name" as a preprocessor
name to have it excluded at the file-level in the case where it is
inherited from the package level.

For scalapb/scalapb-validate#38
thesamet added a commit that referenced this issue Jan 20, 2021
@thesamet
Copy link
Contributor

thesamet commented Jan 21, 2021

Released preview14. This release generalizes again the DSL syntax. The changes are:

  • FieldTransformation.when matches on google.protobuf.FieldDescriptorProto (previously FieldOptions), so now it can match on things such as type: TYPE_INT32 and label: LABEL_REPEATED
  • FieldTransformation.set now takes google.protobuf.FieldOptions (used to be scalapb.FieldOptions. Currently the only field you are allowed to set in this release is [scalapb.field], though arbitrary options would be allowed in the future.
  • Path value interpolation ($(p)) is now relative to the FieldDescriptorProto of the matching field, use like this: $(options.[validate.rules].int32.gt)
  • Improvement in error messages.
  • new syntax is updated in docs via scalapb/ScalaPB@a61d421
  • scalapb-validate-demo bumped up to preview14 and the new DSL syntax.

@bjaglin
Copy link
Contributor

bjaglin commented Jan 21, 2021

We have successfully integrated preview14 and started leveraging the more generic syntax to expand the proto type (on top of field options).

thesamet added a commit to scalapb/ScalaPB that referenced this issue Jan 21, 2021
thesamet added a commit to scalapb/ScalaPB that referenced this issue Jan 21, 2021
@thesamet
Copy link
Contributor

We have successfully integrated preview14 and started leveraging the more generic syntax to expanding the proto type on top of field options.

Fantastic! For ScalaPB, I merged the 0.10.x-preview into 0.10.x and released version 0.10.10. The same changes have been forward-ported into the 0.11.x-preview branch which got merged today into the master branch. Similarly, in ScalaPB-validate 0.2.x-preview branch has been merged into master, and I released version 0.2.0. The instructions in the docs have been updated accordingly.

@gontard
Copy link
Contributor Author

gontard commented Jan 25, 2021

I am really happy to announce that we have just deployed our app with the official release version 🚀
Thanks a lot for the hard work @thesamet! It has been a very fruitful partnership!

@gontard gontard closed this as completed Jan 25, 2021
thesamet added a commit to scalapb/ScalaPB that referenced this issue Mar 17, 2024
MessageBuilder is a mutable class providing the ability to merge a
message from a CodedInputStream into the mutable state, and build an
instance of a message.

MessageCompanion.newBuilder returns a new Builder initialized with empty
values.

MessageCompanion.newBuilder(msg) returns a new Builder initialized
with values from a given instance of a message.

This new functionality allows us to create builders with partially
initialized objects that can not be represented by the actual message.

See #1013 and scalapb/scalapb-validate#38
thesamet added a commit to scalapb/ScalaPB that referenced this issue Mar 17, 2024
In order to support Collections that are quite different than the ones
in the standard library, we add a new `collection` option for repeated
fields. The collection can reference an `Adapter` type available at
runtime that will forward calls for an underlying collection.

This helps support types such as cats.data.NonEmptyList, NonEmptySet
and NonEmptyMap. For example, those collections don't define `foreach`,
and `++` where the rhs is an iterable. For `NonEmptySet` there is even
no `size` method.

See #1013 and scalapb/scalapb-validate#38
thesamet added a commit to scalapb/ScalaPB that referenced this issue Mar 17, 2024
MessageBuilder is a mutable class providing the ability to merge a
message from a CodedInputStream into the mutable state, and build an
instance of a message.

MessageCompanion.newBuilder returns a new Builder initialized with empty
values.

MessageCompanion.newBuilder(msg) returns a new Builder initialized
with values from a given instance of a message.

This new functionality allows us to create builders with partially
initialized objects that can not be represented by the actual message.

See #1013 and scalapb/scalapb-validate#38
thesamet added a commit to scalapb/ScalaPB that referenced this issue Mar 17, 2024
In order to support Collections that are quite different than the ones
in the standard library, we add a new `collection` option for repeated
fields. The collection can reference an `Adapter` type available at
runtime that will forward calls for an underlying collection.

This helps support types such as cats.data.NonEmptyList, NonEmptySet
and NonEmptyMap. For example, those collections don't define `foreach`,
and `++` where the rhs is an iterable. For `NonEmptySet` there is even
no `size` method.

See #1013 and scalapb/scalapb-validate#38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants