Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenApi incorrect inline schema for multi-level objects #57799

Closed
1 task done
dnv-kimbell opened this issue Sep 11, 2024 · 9 comments
Closed
1 task done

OpenApi incorrect inline schema for multi-level objects #57799

dnv-kimbell opened this issue Sep 11, 2024 · 9 comments
Assignees
Labels
area-minimal Includes minimal APIs, endpoint filters, parameter binding, request delegate generator etc feature-openapi
Milestone

Comments

@dnv-kimbell
Copy link

Is there an existing issue for this?

  • I have searched the existing issues

Describe the bug

We are looking into replacing Swashbuckle with the new OpenApi functionality. We have created a sample application to see what is different in the generated schema. I discovered one difference I can't explain.

We have a controller that uses this object structure

public class EchoData
{
    public EchoData2? Level2 { get; set; }
}

public class EchoData2
{
    public EchoData3? Level3 { get; set; }
}

public class EchoData3
{
    public string Something { get; set; } = default!;
}

In the generated json, the EchoData3 is inlined into EchoData2. This doesn't seem right.

 "components": {
        "schemas": {
            "EchoData": {
                "type": "object",
                "properties": {
                    "level2": {
                        "$ref": "#/components/schemas/EchoData2"
                    }
                }
            },
            "EchoData2": {
                "type": "object",
                "properties": {
                    "level3": {
                        "type": "object",
                        "properties": {
                            "something": {
                                "type": "string"
                            }
                        },
                        "nullable": true
                    }
                },
                "nullable": true
            }
        }
    },

Expected Behavior

I would expect a schema entry for every class being used. When using code generation tools, naming this inline object becomes tricky.

Steps To Reproduce

Check out code at: https://github.com/dnv-kimbell/openapi-inlineschema

Exceptions (if any)

No response

.NET Version

9.0 RC1

Anything else?

No response

@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label Used by the dotnet-issue-labeler to label those issues which couldn't be triaged automatically label Sep 11, 2024
@martincostello martincostello added feature-openapi area-minimal Includes minimal APIs, endpoint filters, parameter binding, request delegate generator etc and removed needs-area-label Used by the dotnet-issue-labeler to label those issues which couldn't be triaged automatically labels Sep 11, 2024
@captainsafia
Copy link
Member

@dnv-kimbell Thanks for reporting this issue! I believe this bug is caused by the level of recursion that we apply when mapping schemas by references in this bit of code. This particularly happens if you reference a type in a deeply nested structure and it is only referenced once in the entire document.

Let me know if the hierarchy that you're dealing with matches these properties and I can look at expanding our test suite with this and adding a fix.

@captainsafia captainsafia added the Needs: Author Feedback The author of this issue needs to respond in order for us to continue investigating this issue. label Sep 11, 2024
@dnv-kimbell
Copy link
Author

This is not recursion, the class names just happen to be named similarly. If I add extra types with very different names, they are still inlined.

To me it looks like it has problems handling deep object structures; first level using reference is ok, rest is inlined.

 "components": {
        "schemas": {
            "EchoData": {
                "type": "object",
                "properties": {
                    "level2": {
                        "$ref": "#/components/schemas/EchoData2"
                    }
                }
            },
            "EchoData2": {
                "type": "object",
                "properties": {
                    "level3": {
                        "type": "object",
                        "properties": {
                            "something": {
                                "type": "string"
                            },
                            "topic": {
                                "type": "object",
                                "properties": {
                                    "name": {
                                        "type": "string"
                                    },
                                    "description": {
                                        "type": "string"
                                    }
                                },
                                "nullable": true
                            },
                            "customer": {
                                "type": "object",
                                "properties": {
                                    "name": {
                                        "type": "string"
                                    },
                                    "address": {
                                        "type": "object",
                                        "properties": {
                                            "street": {
                                                "type": "string",
                                                "nullable": true
                                            },
                                            "city": {
                                                "type": "string",
                                                "nullable": true
                                            }
                                        }
                                    }
                                },
                                "nullable": true
                            }
                        },
                        "nullable": true
                    }
                },
                "nullable": true
            }
        }
    },

@dotnet-policy-service dotnet-policy-service bot added Needs: Attention 👋 This issue needs the attention of a contributor, typically because the OP has provided an update. and removed Needs: Author Feedback The author of this issue needs to respond in order for us to continue investigating this issue. labels Sep 12, 2024
@dnv-kimbell
Copy link
Author

dnv-kimbell commented Sep 12, 2024

I did some more testing. I added a new operation that returns an object. This object is also returned as part of the response of my first operation. This object is added under schemas, but not referenced using ref in the other response. Repro has been updated.

I'm struggling to see how this is supposed to work. One of the main use-cases for OpenApi is to generate code. In order to do that, we need the name of types being used; it must be possible to do new Something(). We could come up with our own naming algorithm, but over time that probably wouldn't be stable. When inspecting a schema transformer in the debugger, I see an annotation x-schema-id. If that was made available in the output document, we would at least have something.

Added an enum, and that is added to the schemas section and referenced where it's being used. This is what I would have expected for all object types.

I added another operation that returns an IEnumerable of a type returned as part of the response of other operations. I shows that it returns 200, but doesn't contain any information about the type; not inlined or by reference. None of the operations show any schema data for responses.

"/Test/address": {
           "get": {
               "tags": [
                   "Schema"
               ],
               "operationId": "Addresses",
               "responses": {
                   "200": {
                       "description": "OK",
                       "content": {
                           "application/json": {}
                       }
                   }
               }
           }
       }

@captainsafia
Copy link
Member

This is not recursion, the class names just happen to be named similarly. If I add extra types with very different names, they are still inlined.

The recursion I was referring to was in the code that applies the updates to $ref in the child properties. Currently, it only recurses into the first level of schema references in an object. I wasn't referring to recursive types themselves. EchoData3 is demonstrating the bug I explained in #57799 (comment) as a result of the current reference implementation.

OK, so some details about how things work under the hood for those who are curious.

There are two components at play when we generate schemas for OpenAPI documents:

  • The JsonSchemaExporter API that is exposed by System.Text.Json. This API takes a Type as input and returns an opaque JsonNode containing the schemas associated with the type.
  • The OpenApiSchemaService in Microsoft.AspNetCore.OpenApi that calls into JsonSchemaExporter, applies some mutations to make the schemas OpenAPI-compliant, and converts the returned JsonNode into an OpenApiSchema object.

Because of this two step process, there's not a discernible location for us to convert inline schemas to ref-based schemas as we are discovering types. Let's take a look at the EchoData example:

public class EchoData
{
    public EchoData2? Level2 { get; set; }
}
  1. We call JsonSchemaExporter.GetJsonSchemaAsNode(typeof(EchoData)) with a TransformSchemaNode that applies OpenAPI-specific modifications to the schema as it's constructed. This callback gets called on each "node" in the type hierarchy: the top-level EchoData type, the EchoData2? Level2 property, the EchoData3? Level3 property, and so on all the way until we reach the terminal values in the hierarchy (e.g. the string Name property on the Topic).
  2. The schema for the EchoData type is returned to the OpenApiSchemaService as a JsonNode. In our TransformSchemaNode, we set an x-schema-id on each complex type schema to indicate that it should be mapped by ref later on.
  3. The OpenApiSchemaService maps the JsonNode to an OpenApiSchema.
  4. The OpenApiSchemaService runs schema filters on the OpenApiSchema from Step 4.
  5. The OpenApiSchemaService populates the OpenApiSchema into a schema cache. Currently, it only populates the top-level schema and the first level of sub-schemas in a given type. This is the bug.
  6. The OpenApiSchema with all properties inlined is returned to the caller and set in the Open API document.
  7. Much later, the OpenApiSchemaReferenceTransformer iterates through the type hierarchy (it does recursion properly) and finds schemas that are in the cache and does the mapping. Since we only apply it up to the first-level in Step 5, you see the behavior that you are seeing.

The biggest difference between Swashbuckle's schema implementation and M.A.OpenApi's is the fact that we take a dependency on System.Text.Json's APIs for schema generation, where as Swashbuckle has its own bespoke schema generation implementation. There's benefits to using the STJ APIs:

  • They are native AoT friendly by default
  • They evolve with the actual System.Text.Json implementation so they'll respond better to new features in the System.Text.Json serializer
  • The STJ schema generation is lower in the stack and is more likely to be used for JsonSchema generation in other scenarios in the ecosystem (e.g. as the default schema type in the underlying Microsoft.OpenApi library)

It does come with one noticeable downside: Swashbuckle can construct OpenApiSchemas directly as it walks the type hierarchy and can manage the mapping of inline schemas to refs as it construct it, we have to mediate between the APIs provided by STJ, the OpenApiSchemas the document expects, and the inline references that ultimately need to be mapped.

We could try to apply the schema reference caching in the TransformSchemaNode step to get the recursion right, but that would require us to allocate more OpenApiSchema objects in the implementation that would considerably impact the perf of this area.

I hope that clarifies the confusion as to the current behavior and why we map references after schemas have been generated instead of during the schema generation process. The long and short of it being that we don't have any good APIs to do it during the moment.

There have been conversations about moving the $ref handling down into the System.Text.Json layer in future releases. That would remove the requirement of the OpenAPI implementation to do anything. When we get to that eventuality, modifying an implementation that applies the $ref mapping after the fact will be much easier than modifying one that does it during construction.

TL;DR: We need to fix the code referenced above to examine more deeply nested structures.

Let me know if you have any follow-up questions on this bit. I'll respond to the issue you're seeing with response schemas in a different comment...

@captainsafia
Copy link
Member

I added another operation that returns an IEnumerable of a type returned as part of the response of other operations. I shows that it returns 200, but doesn't contain any information about the type; not inlined or by reference. None of the operations show any schema data for responses.

I believe issue here is the top-level [Produces("application/json")] attribute in the controller class. It looks like the bug is here:

// MVC's `ProducesAttribute` doesn't implement the produces metadata that the ApiExplorer
// looks for when generating ApiResponseFormats above so we need to pull the content
// types defined there separately.
var explicitContentTypes = apiDescription.ActionDescriptor.EndpointMetadata
.OfType<ProducesAttribute>()
.SelectMany(attr => attr.ContentTypes);
foreach (var contentType in explicitContentTypes)
{
response.Content[contentType] = new OpenApiMediaType();
}

We'll want to TryAdd instead of overriding existing content-types here.

I'll see if we can get a bug fixes in for these before .NET 9 GA...

@captainsafia captainsafia self-assigned this Sep 13, 2024
@dnv-kimbell
Copy link
Author

dnv-kimbell commented Sep 13, 2024

I always appreciate more detailed information.

With the current implementation, I'm struggling to see how this will work in the real world. Applications will be sending deep object structures. In order to create these, languages such as C# will require things to be represented as set of classes.

Since anything below level 1 is inlined, any code generator needs to come up with names for these inline types. If the same type is referenced multiple times in different responses on the server, you risk ending up with two different objects on the client that should be the same.

Some application out there will be returning an address as part of a GET response, and taking the same address in a POST request. It would be nice if the client also could use the same type for both operations.

@captainsafia
Copy link
Member

@dnv-kimbell Both of the bugs that you reported are now resolved in the latest .NET 9 RC 2 builds of the package. Can you try them out and see how things are looking for you?

To access the nightly builds, add the following package reference to your .csproj:

<PackageReference Include="Microsoft.AspNetCore.OpenApi" Version="9.0.0-rc.2" />

And use the following nuget.config file:

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <packageSources>
    <!--To inherit the global NuGet package sources remove the <clear/> line below -->
    <clear />
    <add key="nuget" value="https://api.nuget.org/v3/index.json" />
    <add key="dotnet9" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet9/nuget/v3/index.json" />
  </packageSources>
</configuration>

@captainsafia captainsafia added Needs: Author Feedback The author of this issue needs to respond in order for us to continue investigating this issue. and removed Needs: Attention 👋 This issue needs the attention of a contributor, typically because the OP has provided an update. labels Sep 19, 2024
@dnv-kimbell
Copy link
Author

dnv-kimbell commented Sep 20, 2024

These cases seem to be resolved. In order to get it to work, I had to use
<PackageReference Include="Microsoft.AspNetCore.OpenApi" Version="9.0.0-rc.2.24469.4" />

Now that I have moved passed this issue, I have found other things that don't work as expected. Will file different issues for those cases.

#57979
#57980
#57982

@dotnet-policy-service dotnet-policy-service bot added Needs: Attention 👋 This issue needs the attention of a contributor, typically because the OP has provided an update. and removed Needs: Author Feedback The author of this issue needs to respond in order for us to continue investigating this issue. labels Sep 20, 2024
@captainsafia
Copy link
Member

Great! I'll close this out and we can track the other issues separately.

@captainsafia captainsafia removed the Needs: Attention 👋 This issue needs the attention of a contributor, typically because the OP has provided an update. label Sep 20, 2024
@captainsafia captainsafia added this to the 9.0-rc2 milestone Sep 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-minimal Includes minimal APIs, endpoint filters, parameter binding, request delegate generator etc feature-openapi
Projects
None yet
Development

No branches or pull requests

3 participants