Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[codegen][Python-experimental] Discriminator NPE fix, handle 'null' type, #4906 enhancements #5809

Merged
merged 104 commits into from
May 11, 2020

Conversation

sebastien-rosset
Copy link
Contributor

@sebastien-rosset sebastien-rosset commented Apr 2, 2020

  1. When processing discriminators of composed schemas, handle scenario when composed schema has 'null' type
  2. Remove extraneous characters in comments
  3. Add traversal of discriminator mappings in case oneOf child schema indirectly refers to discriminator.
  4. There is no need to duplicate the get_discriminator_class() code for every generated class.
  5. Add unit test when discriminator is not specified in payload and handle that scenario in the get_discriminator_class() function. IMO, another PR should be provided to provide a better error message when the input data does not contain the discriminator property.

This PR along these:

will resolve #4912

PR checklist

  • Read the contribution guidelines.
  • If contributing template-only or documentation-only changes which will change sample output, build the project before.
  • Run the shell script(s) under ./bin/ (or Windows batch scripts under.\bin\windows) to update Petstore samples related to your fix. This is important, as CI jobs will verify all generator outputs of your HEAD commit, and these must match the expectations made by your contribution. You only need to run ./bin/{LANG}-petstore.sh, ./bin/openapi3/{LANG}-petstore.sh if updating the code or mustache templates for a language ({LANG}) (e.g. php, ruby, python, etc).
  • File the PR against the correct branch: master, 4.3.x, 5.0.x. Default: master.
  • Copy the technical committee to review the pull request if your PR is targeting a particular programming language.

spacether and others added 27 commits March 26, 2020 09:38
@sebastien-rosset sebastien-rosset changed the title [codegen] Discriminator fix [codegen] Discriminator NPE fix, handle 'null' type Apr 2, 2020
@spacether
Copy link
Contributor

@sebastien-rosset your 3rd point Fix NPE when at least one extension is defined but not x-discriminator-value is already addressed in my PR which will land before this one, right?

@spacether
Copy link
Contributor

spacether commented May 7, 2020

  1. Can you clarify this comment. Avoid using "if we are coming". Why do you have this code?
  2. Can you point me to the source code with unit tests that exercise the above test?

This unit test uses that code path.

@spacether
Copy link
Contributor

spacether commented May 7, 2020

  1. It would be helpful to explain (or have a reference to doc elsewhere) that explains how ancestors (allOf) are implemented. In other programming languages (such as Java), there is a class hierarchy. In go-experimental, it's going in the direction of using composition. In python, the the super class is not the OAS ancestor.

How about we put that in the ModelComposed class docstring?

When one sets a property we use var_name_to_model_instances to store the value in the correct class instances + run any type checking + validation code.
When one gets a property we use var_name_to_model_instances to get the value from the correct class instances.
This allows multiple composed schemas to contain the same property with additive constraints on the value.

_composed_schemas (dict) stores the anyOf/allOf/oneOf classes
  key (str): allOf/oneOf/anyOf
  value (list): the classes in the XOf definition.
    Note: none_type can be included when the openapi document version >= 3.1.0
_composed_instances (list): stores a list of instances of the composed schemas defined in _composed_schemas. When properties are accessed in the self instance, they are returned from the self._data_store or the data stores in the instances in self._composed_schemas
_var_name_to_model_instances (dict): maps between a variable name on self and the composed instances (self included) which contain that data
  key (str): property name
  value (list): list of class instances, self or instances in _composed_instances which contain the value that the key is referring to

@sebastien-rosset
Copy link
Contributor Author

When one sets a property we use var_name_to_model_instances to store the value in the correct class instances + run any type checking + validation code.
When one gets a property we use var_name_to_model_instances to get the value from the correct class instances.
This allows multiple composed schemas to contain the same property with additive constraints on the value.

_composed_schemas (dict) stores the anyOf/allOf/oneOf classes
key (str): allOf/oneOf/anyOf
value (list): the classes in the XOf definition.
Note: none_type can be included when the openapi document version >= 3.1.0
_composed_instances (list): stores a list of instances of the composed schemas defined in _composed_schemas. When properties are accessed in the self instance, they are returned from the self._data_store or the data stores in the instances in self._composed_schemas
_var_name_to_model_instances (dict): maps between a variable name on self and the composed instances (self included) which contain that data
key (str): property name
value (list): list of class instances, self or instances in _composed_instances which contain the value that the key is referring to

I have added this as is, but it's not clear exactly where you want this. Please submit a pull request if you want to change what I have done.

Copy link
Contributor

@spacether spacether left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for adding those sample tests.

The remaining ask here is to add tests that require deserialization through the discriminators in a descendent (oneOf/anyOf schemas) or in an ancestor (allOf ancestor of self composed schema)
Also, ideally we will use existing sample schemas rather than adding additional
ones, because that helps keep our sample spec as small as possible.

Examining the added tests we are testing the following cases:

  1. using a discriminator map defined in a composed schema without using an ancestor or descendant's discriminator.

self is BiologyChordate and its discriminator contains Mammal, Primate, Reptile, and Hominid
so se we are using self's discriminator without using an ancestor or descendent's discriminator:

  • biology.Mammal from BiologyChordate
  • biology.Primate from BiologyChordate
  • biology.Reptile from BiologyChordate
  • biology.Hominid from BiologyChordate
    Per:
    @cached_property
    def discriminator():
        val = {
            'biology.Hominid': biology_hominid.BiologyHominid,
            'biology.Mammal': biology_mammal.BiologyMammal,
            'biology.Primate': biology_primate.BiologyPrimate,
            'biology.Reptile': biology_reptile.BiologyReptile,
        }
        if not val:
            return None
        return {'class_name': val}
  • biology.Hominid from BiologyMammal
    Per:
    @cached_property
    def discriminator():
        val = {
            'biology.Hominid': biology_hominid.BiologyHominid,
            'biology.Primate': biology_primate.BiologyPrimate,
        }
        if not val:
            return None
        return {'class_name': val}

So these tests do not exercise new code and do not demonstrate ancestor or descedant deserialization.
They are functionally equivalent to

  • test_deserialize_animal (only picks one class in cls.discriminator)
  • test_deserialize_mammal (only picks one class in cls.discriminator)
    The use the code path in get_discriminator_class line 720 where we use 1 discriminaotr map and get the class:
    used_model_class = class_name_to_discr_class.get(discr_value)

Can you remove these tests and sample schemas because they are not exercising a new code path?

Ancestor deserialization:

  • biology.Hominid from biology.Hominid
    This test does deserialize using an ancestor's discriminator, but we can already do this using the
    existing class/schema ParentPet. Can you:

Change this test so it deserializes ParentPet from ParentPet?
Remove the added biology.Hominid schema because we no longer need it?

Descendant Deserialization:

We have not yet demonstrated this yet, and we need to because we have added this feature.
We can do this by deserializing from a grandparent through a oneof child into a oneof child of that child.
So if we deserialize from a grandparent to a grandchild using oneOf relationships, we will demonstrate this.

Can you add a new schema with this relationship and add a test deserializing it?

  • For example you could add a Pig schema, which is in mammal's oneOf class list
    • Pig can OneOf define BasquePig and DanishPig, all of with include the className property
    • Then deserializing a BasquePig payload from a mammal data_type will demonstrate descendent deserialization because BasquePig is not in mammal's discriminator map. It is in Pig's discriminator map.

used_model_class = None
if discr_name in model_class.discriminator:
class_name_to_discr_class = model_class.discriminator[discr_name]
used_model_class = class_name_to_discr_class.get(discr_value)
Copy link
Contributor

@spacether spacether May 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are not yet handling the use case where discr_name is not in model_class.discriminator here.
If we hit that case, then the discriminator propertyName is different than what we are looking for and we should return None early. Here, can you add:

else:
    # the discr_name is not what we are looking for, we did not find the class that we are looking for
    return None

Or maybe we should make it the clearer:

if discr_name not in model_class.discriminator:
    # the discr_name is not what we are looking for, we did not find the class that we are looking for
    return None
class_name_to_discr_class = model_class.discriminator[discr_name]
used_model_class = class_name_to_discr_class.get(discr_value)

@sebastien-rosset
Copy link
Contributor Author

Thank you for adding those sample tests.

The remaining ask here is to add tests that require deserialization through the discriminators in a descendent (oneOf/anyOf schemas) or in an ancestor (allOf ancestor of self composed schema)

There is a ancestor relationship with 'allOf' in chordate, primate, mammal, hominid and the unit test is exercising this, I don't understand why you are stating the tests do not include ancestors. Also see below some additional details.

Also, ideally we will use existing sample schemas rather than adding additional
ones, because that helps keep our sample spec as small as possible.

Can you provide a specific example that you want? You are very opinionated about how things should be done, can you please provide a specific relationship using the existing spec? Otherwise I think it'll keep going for a long time.

Examining the added tests we are testing the following cases:

  1. using a discriminator map defined in a composed schema without using an ancestor or descendant's discriminator.

self is BiologyChordate and its discriminator contains Mammal, Primate, Reptile, and Hominid
so se we are using self's discriminator without using an ancestor or descendent's discriminator:

  • biology.Mammal from BiologyChordate
  • biology.Primate from BiologyChordate
  • biology.Reptile from BiologyChordate
  • biology.Hominid from BiologyChordate
    Per:
    @cached_property
    def discriminator():
        val = {
            'biology.Hominid': biology_hominid.BiologyHominid,
            'biology.Mammal': biology_mammal.BiologyMammal,
            'biology.Primate': biology_primate.BiologyPrimate,
            'biology.Reptile': biology_reptile.BiologyReptile,
        }
        if not val:
            return None
        return {'class_name': val}
  • biology.Hominid from BiologyMammal
    Per:
    @cached_property
    def discriminator():
        val = {
            'biology.Hominid': biology_hominid.BiologyHominid,
            'biology.Primate': biology_primate.BiologyPrimate,
        }
        if not val:
            return None
        return {'class_name': val}

So these tests do not exercise new code and do not demonstrate ancestor or descedant deserialization.

I'm not sure why you are saying this. I modeled the schemas to address the missing recursion in the existing code when the discriminators are accessible through a traversal of the ancestors. The way I tested locally is I wrote the unit tests, wrote the code until the unit test passed, then I disabled the newly-introduced recursion in get_discriminator_class(), and the unit test failed.
When I disable the recursion, the code is as it was before my proposed changes. So the fact that UT outcome changes based on the recursion is a proof that the new code is exercised. Also, during debugging I added some print statements and I could see the new code was executed.

I will wait for you response before responding to your points below.

They are functionally equivalent to

  • test_deserialize_animal (only picks one class in cls.discriminator)
  • test_deserialize_mammal (only picks one class in cls.discriminator)
    The use the code path in get_discriminator_class line 720 where we use 1 discriminaotr map and get the class:
    used_model_class = class_name_to_discr_class.get(discr_value)

Can you remove these tests and sample schemas because they are not exercising a new code path?

Ancestor deserialization:

  • biology.Hominid from biology.Hominid
    This test does deserialize using an ancestor's discriminator, but we can already do this using the
    existing class/schema ParentPet. Can you:

Change this test so it deserializes ParentPet from ParentPet? Remove the added biology.Hominid schema because we no longer need it?

Descendant Deserialization:

We have not yet demonstrated this yet, and we need to because we have added this feature.
We can do this by deserializing from a grandparent through a oneof child into a oneof child of that child.
So if we deserialize from a grandparent to a grandchild using oneOf relationships, we will demonstrate this.

Can you add a new schema with this relationship and add a test deserializing it?

  • For example you could add a Pig schema, which is in mammal's oneOf class list

    • Pig can OneOf define BasquePig and DanishPig, all of with include the className property
    • Then deserializing a BasquePig payload from a mammal data_type will demonstrate descendent deserialization because BasquePig is not in mammal's discriminator map. It is in Pig's discriminator map.

@spacether
Copy link
Contributor

spacether commented May 11, 2020

Thank you for describing what is going on in those unit tests more. I was wrong in saying that those first examples didn't use ancestor or descendent inheritance. They do use ancestor inheritance. Sorry about that. I was confused about what was going on because it looked like we were only moving through one discriminator map.

First they move through the single discriminator in Chordate to select new_cls. Then they check the discriminator in the chosen new_cls and use the ancestor allOf relationship to decide to deserialize using new_cls.

All of these tests demonstrate ancestor inheritance:

  1. biology.Mammal from BiologyChordate
  2. biology.Primate from BiologyChordate
  3. biology.Reptile from BiologyChordate
  4. biology.Hominid from BiologyChordate
  5. biology.Hominid from BiologyMammal
  6. biology.Hominid from Hominid

Let's continue this discussion on our call later.
Here is the get_discriminator_class function with specific variables for descendant and ancestor classes which may help with our discussion:

def get_discriminator_class(model_class,
                            discr_name,
                            discr_value, cls_visited):
    """Returns the child class specified by the discriminator.

    Args:
        model_class (OpenApiModel): the model class.
        discr_name (string): the name of the discriminator property.
        discr_value (any): the discriminator value.
        cls_visited (list): list of model classes that have been visited.
            Used to determine the discriminator class without
            visiting circular references indefinitely.

    Returns:
        used_model_class (class/None): the chosen child class that will be used
            to deserialize the data, for example dog.Dog.
            If a class is not found, None is returned.
    """

    if model_class in cls_visited:
        # The class has already been visited and no suitable class was found.
        return None
    cls_visited.append(model_class)
    used_model_class = None
    if discr_name in model_class.discriminator:
        class_name_to_discr_class = model_class.discriminator[discr_name]
        used_model_class = class_name_to_discr_class.get(discr_value)
    if used_model_class is None:
        # We didn't find a discriminated class in class_name_to_discr_class.
        # The discriminator mapping may exist in a descendant (anyOf, oneOf)
        # or ancestor (allOf).
        # Ancestor example: in the "Dog -> Mammal -> Chordate -> Animal"
        #   hierarchy, the discriminator mappings may be defined at any level
        #   in the hieararchy.
        # Descendant example: a schema is oneOf[Plant, Mammal], and each
        #   oneOf child may itself be an allOf with some arbitrary hierarchy,
        #   and a graph traversal is required to find the discriminator.
        descendant_classes = (
            model_class._composed_schemas.get('oneOf', ()) +
            model_class._composed_schemas.get('anyOf', ())
        )
        ancestor_classes = model_class._composed_schemas.get('allOf', ())
        possible_classes = (
            descendant_classes +
            ancestor_classes
        )
        for cls in possible_classes:
            # Check if the schema has inherited discriminators.
            if cls.discriminator is not None:
                used_model_class = get_discriminator_class(
                                    cls, discr_name, discr_value, cls_visited)
                if used_model_class is not None:
                    return used_model_class
    return used_model_class

@spacether
Copy link
Contributor

Further work will be needed to resolve #4912. That work will be done at a later time.

Copy link
Contributor

@spacether spacether left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good.
From here on I will focus on not adding scope creep like I did here, and providing more specific feedback withe code snippets and branches to show what I mean.
Thank you for all of your hard work on this.

@spacether spacether merged commit c1d47a5 into OpenAPITools:master May 11, 2020
@spacether spacether added this to the 5.0.0 milestone May 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Python] Multilevel inheritance discriminator support
4 participants