-
Notifications
You must be signed in to change notification settings - Fork 176
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Code generation validation does not support separate SDKs #3309
Comments
Hi @heaths About your question:
I don't think it will cause any issues. if it's generating multi sdks in a swagger PR, sdk automation script will run autorest command for multi times, and there is no duplication issue. If it generates a sdk and the swaggers refer to the same common class, there will be no error reported. The reason of getting duplication issue in Azure/azure-rest-api-specs#19012 is that it actually contains the duplicated schema name in swaggers. For example, |
Shouldn't the lenient deduplication take care of that and fold them into a single class if the definitions are the same? |
No. If sdk owner don't alow deduplicated model in generating sdk, the pipeline will report error. Based on my knowledge, python sdk owner don't allow it. |
Ultimately, the problem here is that all those swaggers in the Language RP folder do not comprise a single SDK. What might be good for the automation is to have a way - perhaps in readme.md - to better separate those. Do we, for example, define different tags for different sets of swaggers? /cc @weshaggard @johanste with whom I've been talking about this topic. This isn't the only data plane RP in this situation. Key Vault also has numerous swaggers and SDKs, though those swaggers aren't necessarily meant to stand alone and could be aggregated - though, the generated SDK is still nothing like what we ship so I still think trying to generate one is of dubious value. Perhaps better is, for data plane RPs in the specs repo that have SDKs in the language repo, is to update those autorest.md files and regenerate. It's the only way to be sure something wasn't broken. Blindly generating SDKs purely from swaggers provides some level of correctness, but not enough, IMO. |
Hi @heaths I think you are focused on dataplane now. Actually, the integration of DPG with sdk automation pipeline is on the way. The way of integration will use the autorest.md in sdk repository, and it should meet your requirement (service team can add model deplication related configuration in the autorest.md). |
@weshaggard does this mean any affected autorest.md files are going to have to add the "feature" tags (or whatever we called them) we talked about, such that we have separate permutations of versions + versions for package tags? @dw511214992 a lot of the Azure SDK teams works on data plane services, so I am focused on them yes. This is where we're seeing a lot of validation problems, as well as procedural problems. |
@heaths I deleted the useless sdk automation pipelines in Azure/azure-rest-api-specs#19000, so there shouldn't be many validation problems now. Besides, when the support for DPG is GA, I think we can enable more services on DPG automation pipeline. |
DPG won't solve this problem. These swaggers contain conflicting models and these SDKs will generate models eventually, so the problem will return. You merged that PR May 10th and problem still happens: https://github.com/Azure/azure-rest-api-specs/pull/19190/checks?check_run_id=7189271735 |
Happened against today: https://github.com/Azure/azure-rest-api-specs/pull/19558/checks?check_run_id=7218031801 |
@dw511214992 @raych1 could you try to help understand how we make our tools work if there are multiple sdk's in the same directory. |
@heaths The two PRs you mentioned is another issue, which is not related to this issue. |
@weshaggard , it could be supported by separated readme.md. |
I opened this issue. ;) Regardless of the pipeline used, the assumption that validation can build a single SDK from the entirety of the directory contents is incorrect. Same is true for several SDKs including Key Vault. We build multiple data plane SDKs from different files in the RP directory, and swaggers are often written with that in mind; ergo, you can't simply merge them all together to generate an SDK. |
@heaths , SDK generation base on the readme.md. It needs to use different readme to support multiple SDK configuration. With that, it needs to add subfolders to organize swaggers per SDK under data-plane folder then place readme under these subfolders. This is actually what cognitive service do right now. |
The folders under the data-plane directory are RPs (resource providers). The Language folder is a single RP that ships multiple SDKs. Same for Key Vault. It's a single RP. Those folders aren't merely for organization of SDKs. Validation needs to support this. /cc @maririos |
@heaths perhaps I'm missing something but if we were able to split the Key Vault folder into multiple folders couldn't we achieve the goal of having the files split by folder? For each sdk generation we will want to have one readme configuration in one folder. |
Those folders comprise a single RP and should be together per conversations with @johanste. IMO, the swagger repo should be a more git-friendly process for authoring by humans instead of trying to make it work for tools. Why can't a readme just have separate sections with different /cc @markweitzel |
Adding @josefree who is working on documentation for the layout of the specs repo but the one thing we did try to align on is trying to keep one SDK limited to one folder and one readme. Part of the reason for that decision is to not have to teach all our tools how to read and understand the readme file as sort of a project file to group the various specs together. The readme should hopefully be limited to the code generation and not all the other tools we need to run. |
I think what I was trying to say that there is not a 1:1 mapping between RP and client library - even more so for data planes than for control planes. I would say that we want one readme.md per client library (because it's purpose in life is 90+% to drive client code gen), but that is up for debate as well since we want to have some way of describing what endpoints constitutes a single "service". |
Then we should talk again about having subdirectories under the RP folder. This was one idea we discussed as an alternative to this class of issue. |
@heaths , can you point me the location of Language folder? I didn't find it. |
@raych1 I believe https://github.com/Azure/azure-rest-api-specs/tree/main/specification/cognitiveservices/data-plane/Language is what is being referred to.
@heaths isn't that what cognitive is doing with the Language folder or are you thinking about splitting at a different level? |
We don't have subfolders under the Language folder. We've discussed it briefly in the past, but is that even supported by validation and other tools? |
Language itself is a subfolder and as such we should support other folder next to it but I would also assume we would support other subfolders under that but I will leave that for @raych1 to confirm. |
It's the RP folder. All services I've looked at (many) are organized the same way. As @johanste said above, these swaggers comprise a single RP. |
@heaths , are you saying for this RP folder https://github.com/Azure/azure-rest-api-specs/tree/main/specification/cognitiveservices/data-plane/Language, service team needs to build multiple SDK package base on different swaggers and you would like the SDK automation tools support one readme file under this |
Yes, that set of swaggers produces different SDKs. How it's implemented - whether as separate tags in a readme or whatever - isn't as important. Just let me know what to change. |
@raych1 the primary question is can we have subdirectories under there with one readme in each subdirectory? If so I think that will allow us to hold to one readme for each library rule. If not we might need to reconsider some other options. |
@weshaggard @heaths , yes, it's supported to have one readme in each subdirectory of |
The bottom line is we should have one and only one autorest.md in each subdirectory, and each autorest.md generates one SDK package. must not allow an autorest.md to generate multiple SDK packages, which means using 'tag' to differentiate SDK package must be avoid. This will lead to a mess of version line from API to SDK. |
@heaths , I close this issue as we aligned on the approach to take. |
Re: Azure/azure-rest-api-specs#19012
For Cognitive Services, as well as other services like Key Vault, we produce a number of SDKs with different swaggers. The validation may fail because different SDKs - different sets of swaggers - may refer to a common class that then gets duplicated or seems like it. In practice, however, this wouldn't happen because they ship in separate SDKs.
Could we have some way to either suppress or handle these situations so we don't have validation failures like this?
The text was updated successfully, but these errors were encountered: