Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON Schema Validation #2238

Merged
merged 1 commit into from
May 26, 2021
Merged

JSON Schema Validation #2238

merged 1 commit into from
May 26, 2021

Conversation

alexandre-abrioux
Copy link
Contributor

@alexandre-abrioux alexandre-abrioux commented Oct 31, 2020

Q A
Type feature
BC Break no
Fixed issues #2233

Summary

This pull request introduces a new @Validation class annotation containing three options:

  • jsonSchema
  • action
  • level

; corresponding to the three options of MongoDB collections:

The JSON Extension

The MongoDB Driver requires an array for the validator option of the createCollection method and collMod command.

We therefore have to decode the user-defined JSON schema, thus the need to add the ext-json to the require-dev section of the composer.json.

The ext-json is still not required for users to install the package, it is only added to the suggestion list.

Adding a new Validation annotation vs. adding new options to the Document annotation

When I started working on this feature I chose to add three new properties to the @Document annotation:

  • validationJsonSchema
  • validationAction
  • validationLevel

But I found that it would be convenient for the validationAction and validationLevel options to be enumerations as it would help the user to realize directly during the mapping process that they made a mistake in filling those.

A wrong value would trigger doctrine/annotations's inner logic and raise an AnnotationException.

When used with Symfony for instance users wouldn't have to wait until execution of odm:schema:update command to realize they made a mistake, they would know right away on cache clear which happens on every request by default on dev environment.

Using enumerations was only possible by creating a whole new annotation class as the @Document annotation has a constructor and therefore its properties' annotations are not parsed, see https://github.com/doctrine/annotations/blob/e2623fd97c136cc7a18b467da8d8331c01de051c/lib/Doctrine/Common/Annotations/DocParser.php#L548

To sum up adding @Enum on the annotation properties was only possible by creating a new class, thus the choice of a new @Validation annotation. It makes the annotation-driven configuration similar to the XML-driven configuration which also contains enumerations.

The validationAction / validationLevel default values issue

MongoDB version ≥ 4.4.0 allows to reset the validationAction and validationLevel options of the collections to their default values by passing empty strings to the respective arguments of the collMod command.

However I found that versions 4.2.0 and bellow do not care about empty strings in the validationAction / validationLevel arguments of the collMod command. The command options will just get ignored and the collection options will not be modified.

In order to reset those options during odm:schema:update, in case for instance the user removed the action / level options of the @Validation annotation, we need to have default values defined for those options in Doctrine.

Those default values are documented and are equal the default values defined by MongoDB.

However if the MongoDB team decides to change those default values in the future it could cause confusion for our users, but it's the only solution I found to be able to support MongoDB versions < 4.4.0.

Documentation

I updated the annotations-reference and added a new section on the validation-of-documents cookbook, I hope that's OK. Feel free to proofread it and recommend some adjustments as English is not my mother tong.

TODO

  • Annotation
  • AnnotationDriver
  • XmlDriver
  • ClassMetadata
  • SchemaManager
  • UpdateCommand
  • Tests: Annotation
  • Tests: AnnotationDriver
  • Tests: XmlDriver
  • Tests: ClassMetadata
  • Tests: SchemaManager
  • Tests : Functionnal
  • Tests : UpdateCommand
  • Documentation
  • Clean Commit History

@alexandre-abrioux alexandre-abrioux marked this pull request as draft October 31, 2020 02:37
@alexandre-abrioux alexandre-abrioux changed the title WIP - Collection validator WIP - Collection Validation Nov 2, 2020
@alexandre-abrioux alexandre-abrioux changed the title WIP - Collection Validation WIP - JSON Schema Validation Nov 2, 2020
@malarzm
Copy link
Member

malarzm commented Nov 4, 2020

Just wanted to say thanks for a really well-thought PR! It's great you have already included documentation. Looking forward to leaving the "draft" state :)

@malarzm
Copy link
Member

malarzm commented Nov 4, 2020

I let myself go through the review initially, hope you don't mind :)

@alexandre-abrioux
Copy link
Contributor Author

@malarzm Thanks ! I just wanted to increase the coverage before leaving the draft stage, but overall it was almost ready. I don't have much time this week but I will look into your review very soon. Thank you again :)

@alcaeus alcaeus self-requested a review November 9, 2020 15:54
@alexandre-abrioux alexandre-abrioux marked this pull request as ready for review November 15, 2020 16:59
@alexandre-abrioux alexandre-abrioux changed the title WIP - JSON Schema Validation JSON Schema Validation Nov 15, 2020
@alexandre-abrioux
Copy link
Contributor Author

Hi, thanks again @malarzm for the draft review.
I took your advice into consideration, made the according changes and edited the documentation to reflect those modifications.
Please take a look again at the PR description because it was updated to explain some of the choices I made.
Some more tests were added to increase coverage.
It should now be ready to be reviewed :)
Thank you.

Copy link
Member

@alcaeus alcaeus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all, thank you for taking the time to work on such a feature, I really appreciate the thought you've put into this.

I've left a few comments below from a first reading I did a little while back, but wanted to also discuss how we can make use of this feature. I see two general ways on how users might want their schema definitions:

  • Set manually as implemented in this PR
  • Automatically determined based upon mapping

The second is definitely more complex and IMO way outside the scope of the implementation. However, I believe that we should keep it in mind when defining how to map the feature. The current mapping might work with an automatic JSON schema generation, for example by omitting the jsonSchema property of the annotation, or by omitting the validation-json-schema tag in XML mappings. I don't think there's anything to change, but I wanted to bring it up if only to solicit feedback.

Speaking of annotations, I'm a little bit worried about using a JSON string in annotations. I know the idea is to be able to copy/paste the schema from somewhere else, but this point is moot if people need to quote it. Looking at it, JSON schema is (almost) compatible with the annotations, so that you could do the following:

/**
 * @ODM\Document
 * @ODM\Validation(
 *     jsonSchema={
 *          "required": {"name"},
 *          "properties": {
 *              "name": {
 *                  "bsonType": "string",
 *                  "description": "must be a string and is required"
 *              }
 *          }
 *      },
 *     action=ClassMetadata::SCHEMA_VALIDATION_ACTION_WARN,
 *     level=ClassMetadata::SCHEMA_VALIDATION_LEVEL_MODERATE,
 * )
 */
class JsonSchemaValidated

The only change needed to make the JSON schema valid for an annotation is to change square brackets for arrays to curly brackets, as doctrine/annotations (currently) does not support using square brackets for arrays (this was suggested in doctrine/annotations#122 but closed as 2.0 was thought to make this easier but it never materialised).

However, the advantage of pasting the schema in annotations is moot if we look at PHP 8 attributes (which will eventually replace annotations):

#[ODM\Document()]
#[ODM\Validation(
    jsonSchema=[
        "required" => ["name"],
        "properties" => [
            "name" => [
                "bsonType" => "string",
                "description" => "must be a string and is required"
            ]
        ]
    ],
    action=ClassMetadata::SCHEMA_VALIDATION_ACTION_WARN,
    level=ClassMetadata::SCHEMA_VALIDATION_LEVEL_MODERATE,
])
class JsonSchemaValidated

In that case, pasting a string may be easier since we can single-quote it (or maybe even use HEREDOC?). Just wanted to check if you've given this some thought to make it a bit more user-friendly.

One concern I did have was that any validation errors may lead to a broken UnitOfWork that could then cause issues. This would be similar to an issue we recently found if we were to implement transaction support, but on second thought this doesn't seem to be an issue any more than it already is: a validation error would be comparable to a duplicate key exception, leading to a situation we already know and are aware of. I'll do another review of this PR and focus on test cases to make sure everything is covered.

lib/Doctrine/ODM/MongoDB/Mapping/ClassMetadata.php Outdated Show resolved Hide resolved
lib/Doctrine/ODM/MongoDB/Mapping/Driver/XmlDriver.php Outdated Show resolved Hide resolved
lib/Doctrine/ODM/MongoDB/SchemaManager.php Outdated Show resolved Hide resolved
@alexandre-abrioux
Copy link
Contributor Author

alexandre-abrioux commented Nov 30, 2020

@alcaeus Thank you very much for that detailed PR review. I will work on your suggestions as soon as I get some time off during the following days.

I agree, we should definitely keep in mind the fact that we could generate that schema from metadata. That's what I was resolved to try and do when I opened #2233. But it was for sure a good suggestion by @malarzm to implement it manually first.

About the annotation's specs, this is a tough choice.

I actually started to implement this feature the "array way" first (without strings). I chose to fallback to the "string way" in the process of implementing it as I found some drawbacks. I'll try to summarize my thought process with a table (more details below):

spec advantages inconveniences
array - No need for json_decode in AnnotationDriver. - XMLDriver cannot be implemented in a similar way: ext-json is still needed in the end.
- With PHP8 we will need to transform the JSON schema to PHP arrays. It is not as easy as a simple search/replace but there are online tools that can take care of it (exemple).
- With PHP7 and below we need to replace square brackets to curly braces for arrays, complexity increases with the one of the JSON schema.
- If we extend, in the future, the validation implementation to the whole MongoDB validator, not just the $jsonSchema part of it, we will need more types than just the JSON valid data types (ex. regex) and thus it won't be possible to implement it with PHP arrays. But this will probably be done in another annotation anyway.
string - Implementation is similar in AnnotationDriver and XMLDriver.
- With PHP8 Attributes and HEREDOC we will probably not need to escape characters anymore.
- Need for ext-json.
- With PHP7 and below we need to escape double quotes, but it is as simple as search/replace all double quotes in the JSON schema.

The "array way"

It would be great to implement it this way. As you said:

JSON schema is (almost) compatible with the annotations

; and we wouldn't need to json_decode the schema to PHP arrays. We could get rid of the dependency between this feature and ext-json.

However I found that this dependency was still needed for the XMLDriver as it would be terrible to ask users to write their JSON schema directly as an XML tree. Plus we would need to use some kind of <xs:any> hack in the doctrine-mongo-mapping.xsd which I didn't feel comfortable with.

In the end the XMLDriver implementation and the AnnotationDriver implementation would have needed to be different which I found kind of confusing for the user.

Moreover, replacing square brackets with curly braces for doctrine/annotations (PHP7 and below) is not an easy task that could be achieved with a simple search/replace as square brackets in JSON string values shouldn't be replaced, for instance:

{
    "regex": "[0-9]+"
}

should not be transformed to:

{
    "regex": "{0-9}+"
}

The "string way"

It has a major flaw: we need to escape double quotes. However it is as simple as a search/replace so it is acceptable in my opinion.

Plus, as you mentioned, we probably won't need to escape in PHP8 anymore, which is great. I haven't got to test the new PHP8 Attributes still tho so I'm not 100% sure that they accept HEREDOC, but I guess they should.

$jsonSchema vs. whole validator

Furthermore, but this is probably far-fetched, I was trying at that time to interface the whole validator component of MongoDB with doctrine, not just the $jsonSchema part of it. Citing MongoDB docs:

In addition to JSON Schema validation that uses the $jsonSchema query operator, MongoDB supports validation with other query operators...

For instance:

{
    $jsonSchema: { ... },
    $or: [
         { phone: { $type: "string" } },
         { email: { $regex: /@mongodb\.com$/ } },
         { status: { $in: [ "Unknown", "Incomplete" ] } }
    ]
}

This could be done in doctrine if we decide to extend the feature it in the future. However those validators are not JSON typed. As I understand it they could be composed of any Javascript type (see the $regex operator in this example).

The validation would have to be passed as a string in the annotation (no other choice here). This is when I fell back to the "string way".

But the validation content would also need to be converted to PHP with a Javascript object parser of some sort... That's when I understood it wasn't an easy task and I should focus on $jsonSchema first. But I didn't revert my "string way" choice since then.

Anyways, that's how I decided to use strings in the end. Sorry for the long explanation, just wanted to expose my whole thought process.

But it could still be reverted, you tell me :)

Validation errors

I actually didn't think too much about it as I thought the exception raised by the mongodb driver was enough, like a unique index exception. Feel free to guide me if there is some test I could add to make sure that this is enough.

Thank you again for taking the time to review this PR!

@alexandre-abrioux
Copy link
Contributor Author

branch rebased on master

@alexandre-abrioux
Copy link
Contributor Author

rebased (fixed conflicts)

@alcaeus
Copy link
Member

alcaeus commented Jan 21, 2021

@alexandre-abrioux sorry for the delay here. I'll revisit this PR in the coming days.

@alcaeus alcaeus self-requested a review January 21, 2021 11:26
@alexandre-abrioux
Copy link
Contributor Author

@alcaeus Thanks! Just finished rebasing to upstream, I got an error on the sharded_cluster pipeline and I'm having a hard time figuring out what could be the cause. I understand it's a new pipeline you added this week in #2272. Could you check it out? I'm sure you can guide me into the right direction to fix it! Thanks again :)

@alcaeus
Copy link
Member

alcaeus commented Jan 21, 2021

I got an error on the sharded_cluster pipeline and I'm having a hard time figuring out what could be the cause.

Taking a short look I wasn't able to reproduce this locally, but I'll take another look when I return to review this.

Base automatically changed from master to 2.3.x February 2, 2021 21:41
@alexandre-abrioux
Copy link
Contributor Author

@alcaeus rebased the branch off of 2.3.x, and edited the doc:

This feature has been introduced in version 2.2.0

to

This feature has been introduced in version 2.3.0

Tests are green again :)

Copy link
Member

@alcaeus alcaeus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all, I'm really sorry for the time it has taken me to review this. Second of all, thank you so much for sticking with this and not giving up on me. Third of all, thank you for taking the time to add tests for the update command which was previously untested.

From a functional standpoint this PR is good to go, but I still have some reservations about the mapping. I'm not convinced that the annotation mapping is the best we can do, and I also think that we can make some minor improvements to the XML mapping to make it a little more contained. I'd appreciate your opinion on the suggestions, even if it seems like we're running in circles.

One more note, while looking through the changes I realised that this PR requires people to incorporate the schema for embedded documents into the documents that embed them. While this is not entirely optimal, I believe this is a sane choice for this feature, as it exposes low-level collection options. As a follow-up feature, we could think about a generator that is able to generate validation rules from other metadata (e.g. constraints from symfony/validator) and import validation rules for embedded documents.

docs/en/cookbook/validation-of-documents.rst Outdated Show resolved Hide resolved
docs/en/cookbook/validation-of-documents.rst Outdated Show resolved Hide resolved
docs/en/cookbook/validation-of-documents.rst Outdated Show resolved Hide resolved
docs/en/cookbook/validation-of-documents.rst Outdated Show resolved Hide resolved
docs/en/reference/annotations-reference.rst Outdated Show resolved Hide resolved
docs/en/reference/annotations-reference.rst Outdated Show resolved Hide resolved
docs/en/cookbook/validation-of-documents.rst Outdated Show resolved Hide resolved
lib/Doctrine/ODM/MongoDB/SchemaManager.php Outdated Show resolved Hide resolved
@alexandre-abrioux
Copy link
Contributor Author

alexandre-abrioux commented Mar 3, 2021

Hi @alcaeus,

Thank you for taking the time to review this, and for the very thorough and detailed response.

I am going to implement the changes, but before I start digging, I have a suggestion regarding your point about the annotations.
You're right: it doesn't make sense to place those long schemas inside them.

But I think we overlooked something. After re-reading all your comments, I thought later on: why not do it like this?

/**
 * @ODM\Document
 * @ODM\Validation(
 *     jsonSchema=JsonSchemaValidated::JSON_SCHEMA,
 *     action=ClassMetadata::SCHEMA_VALIDATION_ACTION_WARN,
 *     level=ClassMetadata::SCHEMA_VALIDATION_LEVEL_MODERATE,
 * )
 */
class JsonSchemaValidated
{

    public const JSON_SCHEMA = <<<'EOT'
{
    "required": ["name"],
    "properties": {
        "name": {
            "bsonType": "string",
            "description": "must be a string and is required"
        }
    }
}
EOT;

    /** @ODM\Id */
    private $id;

    /** @ODM\Field(type="string") */
    private $name;
}

I'm really convinced it's the best of both worlds. We would get the same experience with both the XML driver and the Annotation driver: just copy-paste your schema, and you're done!

I just quickly tested it in the test suite and that works. What do you think about this?

@alcaeus
Copy link
Member

alcaeus commented Mar 4, 2021

You're right: it doesn't make sense to place those long schemas inside them.

But I think we overlooked something. After re-reading all your comments, I thought later on: why not do it like this?
[...]

I'm really convinced it's the best of both worlds. We would get the same experience with both the XML driver and the Annotation driver: just copy-paste your schema, and you're done!

I just quickly tested it in the test suite and that works. What do you think about this?

I like it - this could also be modified to include both worlds: if the schema given to the annotation is a string, assume it's JSON and try to decode it. Afterwards, we're always expecting a valid schema. This would allow people to either copy-paste the schema from an external source without any modification at all using HEREDOC, or if they want to create the schema by writing up a PHP array they can do that as well. Feel free to implement only one of these methods, I'd be happy to expanding support in a later PR.

@alexandre-abrioux
Copy link
Contributor Author

Hi @alcaeus

I'm very sorry for the time it took me to take action after our last conversation, I got caught in other projects.

I finally implemented the requested changes:

  • added your edits to the documentation ;
  • modified the XML way of providing validation constraints, like so:
    <document name="TestDocuments\JsonSchemaValidatedDocument">
        <json-schema-validation action="warn" level="moderate">
            {
                "required": ["name"],
                "properties": {
                    "name": {
                        "bsonType": "string",
                        "description": "must be a string and is required"
                    }
                }
            }
        </json-schema-validation>
    </document>
  • removed the test of the database version before updating the collection ;
  • used HEREDOC in class constants for the JSON schema, like so:
/**
 * @ODM\Document
 * @ODM\Validation(
 *     jsonSchema=JsonSchemaValidated::JSON_SCHEMA,
 *     action=ClassMetadata::SCHEMA_VALIDATION_ACTION_WARN,
 *     level=ClassMetadata::SCHEMA_VALIDATION_LEVEL_MODERATE,
 * )
 */
class JsonSchemaValidated
{

    public const JSON_SCHEMA = <<<'EOT'
{
    "required": ["name"],
    "properties": {
        "name": {
            "bsonType": "string",
            "description": "must be a string and is required"
        }
    }
}
EOT;

    /** @ODM\Id */
    private $id;

    /** @ODM\Field(type="string") */
    private $name;
}
  • and rebased the PR to the latest 2.3.x branch commit.

I only implemented the JSON schema provided by the the user as a HEREDOC string. As you stated before, we can expand the feature to a schema provided as a PHP array in a later PR.

Feel free to review the code one last time if needed! Thank you again for the time spent on this :)

Copy link
Member

@alcaeus alcaeus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work @alexandre-abrioux, and thank you for your patience with me while I struggled to find time to review this. I'll release this in the coming days. I really appreciate the time you took to build this great feature!

@alcaeus alcaeus added this to the 2.3.0 milestone May 19, 2021
@alcaeus alcaeus self-assigned this May 19, 2021
@jmikola
Copy link
Member

jmikola commented May 20, 2021

@alcaeus pinged me to take a look at this the other day and I just got through reading the PR from top to bottom.

The first thing that stood out to me was that this was focused only on $jsonSchema and should probably more generically support defining the validator document in its entirety. $jsonSchema is just a single query operator and from a technical standpoint I don't see any reason the feature can't support a full query (e.g. mixing $jsonSchema with $or) out of the gate.

I see this was previously alluded to in #2238 (comment):

Furthermore, but this is probably far-fetched, I was trying at that time to interface the whole validator component of MongoDB with doctrine, not just the $jsonSchema part of it. Citing MongoDB docs:

In addition to JSON Schema validation that uses the $jsonSchema query operator, MongoDB supports validation with other query operators...

For instance:

{
   $jsonSchema: { ... },
   $or: [
        { phone: { $type: "string" } },
        { email: { $regex: /@mongodb\.com$/ } },
        { status: { $in: [ "Unknown", "Incomplete" ] } }
   ]
}

This could be done in doctrine if we decide to extend the feature it in the future. However those validators are not JSON typed. As I understand it they could be composed of any Javascript type (see the $regex operator in this example).

The Javascript types you're seeing there (e.g. regex) are really just due to documentation using the MongoDB shell for its example. If this page showed examples in multiple languages/drivers (as some others do), we might expect to see native language types used (or classes like MongoDB\BSON\Regex.

As a general rule, I think it makes sense to use Extended JSON whenever we're asking users to provide JSON that needs to be converted to MongoDB data. That syntax is both valid JSON and capable of expressing all special BSON types.

With respect to this PR, I think it'd make sense to rename "json-schema-validation" to "schema-validation" (consistent with the MongoDB feature name) and then solicit the following options:

  • validator as a document
  • validationAction (or just action) as a string
  • validationLevel (or just level) as a string

This would require anyone using a JSON schema to provide the top-level $jsonSchema operator for validator on their own, but I think that's a small ask (easily communicated in documentation) for having this feature support all aspects of MongoDB's schema validation.

@alcaeus
Copy link
Member

alcaeus commented May 21, 2021

Ah, that's a good point. @alexandre-abrioux I can make those changes if you want so you don't have to do another pass. Using extended JSON for this makes sense, as it saves us from re-declaring a good amount of syntax in the XSD (like I did for partial index filters).

@alcaeus alcaeus mentioned this pull request May 21, 2021
@alexandre-abrioux
Copy link
Contributor Author

alexandre-abrioux commented May 21, 2021

@jmikola Awesome! That is exactly what I had in mind when I first though about this feature, but I had no knowledge of the extended JSON specification, so couldn't wrap my head around how to solve this easily!

@alcaeus I'll take a shot at it right away! I'd love contribute and finish this feature if I can.

Just to be clear, this is what I'm headed to, using MongoDB's documentation example and using the extended JSON specification for the regex part. Please confirm me that's what you had in mind (also consider the annotation name and properties) and I'll make the changes.

For the XML part:

    <document name="TestDocuments\SchemaValidatedDocument">
        <schema-validation action="warn" level="moderate">
          {
            "$jsonSchema": { ... }
            "$or": [
              { "phone": { "$type": "string" } },
              { "email": { "$regex": { "$regularExpression" : { "pattern": "@mongodb\.com$", "options": "" } } } },
              { "status": { "$in": [ "Unknown", "Incomplete" ] } }
            ]
          }
        </schema-validation>
    </document>

For the annotation part:

/**
 * @ODM\Document
 * @ODM\Validation(
 *     validator=SchemaValidated::VALIDATOR,
 *     action=ClassMetadata::SCHEMA_VALIDATION_ACTION_WARN,
 *     level=ClassMetadata::SCHEMA_VALIDATION_LEVEL_MODERATE,
 * )
 */
class SchemaValidated
{

    public const VALIDATOR = <<<'EOT'
{
  "$jsonSchema": { ... }
  "$or": [
    { "phone": { "$type": "string" } },
    { "email": { "$regex": { "$regularExpression" : { "pattern": "@mongodb\\.com$", "options": "" } } } },
    { "status": { "$in": [ "Unknown", "Incomplete" ] } }
  ]
}
EOT;

   // rest of the class code...
}

@alcaeus
Copy link
Member

alcaeus commented May 21, 2021

Yep, that looks lovely!

@jmikola if I'm not mistaken we should use BSON\toPHP(BSON\fromJson($validator)) ([see docs])(https://www.php.net/manual/en/ref.bson.functions.php) to avoid reading plain old JSON, right? BSON types are safe to serialise in Metadata, and this would give the driver the correct representation for whatever the user gives us. We need to make sure to link the extended JSON documentation in the docs here as well.

@alexandre-abrioux
Copy link
Contributor Author

Implemented the changes, looks good to me! Notes:

  • The schema is parsed with BSON\fromJSON in the driver, so that the user could directly realize during the mapping process if they made a mistake creating their class' schema.
  • It is therefore a simple string that's being saved to the ClassMetaData::validator property: a serialized BSON document represented as a binary string.
  • The schema is then converted to its PHP representation with BSON\toPhp in the SchemaManager.

Don't hesitate to make edits the documentation if needed. Thanks!

@alcaeus alcaeus self-requested a review May 22, 2021 07:24
@jmikola
Copy link
Member

jmikola commented May 25, 2021

if I'm not mistaken we should use BSON\toPHP(BSON\fromJson($validator)) ([see docs])(https://www.php.net/manual/en/ref.bson.functions.php) to avoid reading plain old JSON, right?

Correct. fromJSON is only going to yield a binary string, so we need toPHP to decode the binary string back to a PHP value representing a document (e.g. stdClass).

It is therefore a simple string that's being saved to the ClassMetaData::validator property: a serialized BSON document represented as a binary string.

I think it may make more sense to immediately chain fromJson() with toPHP() and only ever store the PHP value in ClassMetadata. In the event the user ever dumps ClassMetadata, the binary string may ultimately be more confusing. Also, IIRC there is mechanism for defining mappings purely with PHP (as opposed to annotations or XML), which entails explicitly constructing and configuring ClassMetadata objects. Sticking to familiar PHP types would help with that (admittedly uncommon) use case.

@alexandre-abrioux
Copy link
Contributor Author

I think it may make more sense to immediately chain fromJson() with toPHP() and only ever store the PHP value in ClassMetadata. In the event the user ever dumps ClassMetadata, the binary string may ultimately be more confusing. Also, IIRC there is mechanism for defining mappings purely with PHP (as opposed to annotations or XML), which entails explicitly constructing and configuring ClassMetadata objects. Sticking to familiar PHP types would help with that (admittedly uncommon) use case.

@jmikola Thanks, it makes sense! I was afraid the PHP value wouldn't get serialized properly in ClassMetadata:__sleep method, but I tested it and it works perfectly.

I made the appropriate edits. However the static analysis test seems to fail now, but on part of the code I didn't modify. Would it be related to the recent upgrade of doctrine/coding-standard?

@alcaeus
Copy link
Member

alcaeus commented May 26, 2021

the static analysis test seems to fail now, but on part of the code I didn't modify

I'll take a look and fix this before merging. Again, thanks for your work here!

@alcaeus alcaeus merged commit 689cb5b into doctrine:2.3.x May 26, 2021
@alexandre-abrioux alexandre-abrioux deleted the iss-2233 branch May 26, 2021 08:34
@jmikola
Copy link
Member

jmikola commented May 26, 2021

I was afraid the PHP value wouldn't get serialized properly in ClassMetadata:__sleep method, but I tested it and it works perfectly.

Serialization support for the driver's BSON classes was implemented in 1.2.0 (PHPC-460), so there should be nothing to worry about there. 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants