-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid double validation of payloads #113
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
bdfc679
to
48130ba
Compare
yndajas
approved these changes
May 22, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not an expert on this, but it generally looks sensible to me. A few minor comments, including about the change in error messaging
The call to validate a schema is quite expensive: ``` [3] pry(#<GovukSchemas::RandomExample>)> puts Benchmark.measure { 100.times { JSON::Validator.fully_validate(@Schema, item, errors_as_objects: true) } } 7.614167 1.532485 9.146652 ( 10.468765) ``` This updates the code to avoid making two calls to the validator when a payload is generated with a block if the customised version is valid. Before: ``` irb(main):010:0> puts Benchmark.measure { 100.times { GovukSchemas::RandomExample.for_schema(frontend_schema: "news_artic le") { |schema| schema["title"] = "Custom"; schema } } } 13.308641 3.087679 16.396320 ( 18.977005) ``` After: ``` irb(main):002:0> puts Benchmark.measure { 100.times { GovukSchemas::RandomExample.for_schema(frontend_schema: "news_artic le") { |schema| schema["title"] = "Custom"; schema } } } 7.654303 1.561620 9.215923 ( 10.604802) ``` The motivation for this was that the GOV.UK Chat test suite makes quite heavy use of random schemas with customisations for checking supported schemas. On a MBP with GOV.UK Docker this reduces the time of the running the test suite from around ~55s to around ~35s.
48130ba
to
3a7e16b
Compare
Thanks for picking this up @yndajas 👍 |
kevindew
added a commit
that referenced
this pull request
May 23, 2024
The change in #113 didn't handle a common usage scenario where a user modifies the payload passed into the block (example usage [1]). In order to handle this we need to make a copy of the original payload before it is modified. As this project doesn't have an ActiveSupport dependency I wrote the equivalent of a deep_dup method. using `Marshal.load(Marshal.dump(object))`, with an expectation that given the data generated is JSON in a Ruby Hash/Array structure there'd be no risk of objects not having Marshal support. An earlier approach was to iterate ourselves, but that was a bit more verbose: ``` def deep_dup_payload(item) case item when Array item.map { |i| deep_dup_payload(i) } when Hash item.dup.transform_values { |v| deep_dup_payload(v) } else item.dup end end ``` Anecdotally it does seem that previously this project considered modifying the payload as unsupported [2], but this wasn't enforced by a frozen object and would now be a breaking change to require this. [1]: https://github.com/alphagov/content-data-api/blob/cddc744baa0a073868a527b5513a7943fa53ddc6/spec/factories/messages.rb#L19-L26 [2]: https://github.com/alphagov/govuk_schemas/blob/e1da17587b10df8ae3b29d7583c3a4881a377f97/spec/lib/random_example_spec.rb#L45-L53
kevindew
added a commit
that referenced
this pull request
May 23, 2024
The change in #113 didn't handle a common usage scenario where a user modifies the payload passed into the block (example usage [1]). In order to handle this we need to make a copy of the original payload before it is modified. As this project doesn't have an ActiveSupport dependency I wrote the equivalent of a deep_dup method. using `Marshal.load(Marshal.dump(object))`, with an expectation that given the data generated is JSON in a Ruby Hash/Array structure there'd be no risk of objects not having Marshal support. Marshal is a Ruby module that can serialise Ruby objects into a byte stream and can take a byte stream and re-create the objects. By doing this for a payload it creates new objects for the entire payload hash. This means that any modifications to the payload are not represented in the original_payload object. An earlier approach was to iterate ourselves, but that was a bit more verbose: ``` def deep_dup_payload(item) case item when Array item.map { |i| deep_dup_payload(i) } when Hash item.dup.transform_values { |v| deep_dup_payload(v) } else item.dup end end ``` Anecdotally it does seem that previously this project considered modifying the payload as unsupported [2], but this wasn't enforced by a frozen object and would now be a breaking change to require this. [1]: https://github.com/alphagov/content-data-api/blob/cddc744baa0a073868a527b5513a7943fa53ddc6/spec/factories/messages.rb#L19-L26 [2]: https://github.com/alphagov/govuk_schemas/blob/e1da17587b10df8ae3b29d7583c3a4881a377f97/spec/lib/random_example_spec.rb#L45-L53
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
The call to validate a schema is quite expensive:
This updates the code to avoid making two calls to the validator when a
payload is generated with a block if the customised version is valid.
Before:
After:
The motivation for this was that the GOV.UK Chat test suite makes quite
heavy use of random schemas with customisations for checking supported
schemas. On a MBP with GOV.UK Docker this reduces the time of the
running the test suite from around ~55s to around ~35s.