Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Breaking Change][PR Workflow] Use more granular labels for Breaking Changes approvals #6374

Closed
31 tasks done
mikekistler opened this issue Jun 20, 2023 · 24 comments
Closed
31 tasks done
Assignees
Labels
Breaking Changes Improvements to Breaking Changes tooling Central-EngSys This issue is owned by the Engineering System team. Spec PR Tools Tooling that runs in azure-rest-api-specs repo.

Comments

@mikekistler
Copy link
Member

mikekistler commented Jun 20, 2023

Implementation plan

By @konrad-jamrozik

Implementation plan for the design in this gist with extra details in this Word design doc (see also this comment):

Prerequisite implementation tasks

  • Deduplicate relevant openapi-alps code, like Label or PrKey, into swagger-validation-common package.

No-op docs work

  • Update the "Automated Merging Requirements Met" GitHub check pane. (no work was necessary here)
  • Update the Breaking Changes / Same Version GitHub check pane. (no work was necessary here)
  • Update the Breaking Changes / Same Version ADO build logs. (no work was necessary here)
  • Update the Breaking Changes / Cross-Version GitHub check pane. (no work was necessary here)
  • Update the Breaking Changes / Cross-Version ADO build logs. (no work was necessary here)

Most recent proposal

For most recent proposal, see the most recent comment on this issue.

Original description 6/20/2023

By @mikekistler:

The current breaking change process uses the "BreakingChangeReviewRequired" or "NewApiVersionRequired" labels to signal that a breaking change review is required, but only one label "Approved-BreakingChange" for approval by the Breaking Change review board. But there are several different reasons that the board may approve, and we'd like to use the label to designate the reason for approval.

  1. The changes to the REST API definition are not breaking at the REST API level and have at most minor impact to generated SDKs, so the change is allowed.
  2. The changes to the REST API definition appear to be breaking but are actually correcting the definition to correctly describe the actual service behavior.
  3. The PR updates an existing API version, which violates the Azure versioning policy, but is allowed because either
    • the target API version has not yet been made available to customers, or
    • the target API version is in private preview and the changes are not breaking
  4. The PR describes actual breaking changes to the service. This is the only case that triggers customer notifications, a 3 yr migration period (possibly shorter), etc.
    • Within this set we may want to distinguish changes for security or legal compliance, which may be granted a shorter migration period.

Related work:

@mikekistler mikekistler added Spec PR Tools Tooling that runs in azure-rest-api-specs repo. Breaking Changes Improvements to Breaking Changes tooling labels Jun 20, 2023
@konrad-jamrozik konrad-jamrozik changed the title Use more granular labels for Breaking Changes approvals [PR Workflow] Use more granular labels for Breaking Changes approvals Jun 27, 2023
@konrad-jamrozik konrad-jamrozik self-assigned this Jun 27, 2023
@konrad-jamrozik konrad-jamrozik added the Central-EngSys This issue is owned by the Engineering System team. label Jun 27, 2023
@konrad-jamrozik
Copy link
Contributor

@mikekistler to go and propose the labels. Plus mention that "new api version required" approves a change, not a breaking change.

@konrad-jamrozik konrad-jamrozik changed the title [PR Workflow] Use more granular labels for Breaking Changes approvals [Breaking Change][PR Workflow] Use more granular labels for Breaking Changes approvals Jun 27, 2023
@mikekistler
Copy link
Member Author

Here's my proposal for revised/expanded labels for breaking change approvals:

BreakingChange-Benign

  • The changes to the REST API definition are not breaking at the REST API level and have at most minor impact to generated SDKs, so the change is allowed.
  • Unblocks merge of PR with "BreakingChangeReviewRequired"

BreakingChange-Correction

  • The changes to the REST API definition appear to be breaking but are actually correcting the definition to correctly describe the actual service behavior.
  • Unblocks merge of PR with "BreakingChangeReviewRequired"
  • This could be self-selected

SameApiVersion-Approved

  • The PR updates an existing API version, which violates the Azure versioning policy, but is allowed because either
    • the target API version has not yet been made available to customers, or
    • the target API version is in private preview and the changes are not breaking
  • Unblocks merge of PR with "NewApiVersionRequired"
  • This could be self-selected

BreakingChange-Approved

  • The PR describes actual breaking changes to the service. This is the only case that triggers customer notifications, a 3 yr migration period (possibly shorter), etc.
    • Within this set we may want to distinguish changes for security or legal compliance, which may be granted a shorter migration period.
  • Unblocks merge of PR with "BreakingChangeReviewRequired"

cc: @JeffreyRichter @josefree

@konrad-jamrozik
Copy link
Contributor

konrad-jamrozik commented Aug 1, 2023

I am thinking that the automation, once it detects that the breaking changes have been approved (due to any of the scenarios you listed above) should have another mechanism to determine the PR author can proceed to the next stage, which, in case of ARM PRs, is ARM review. I am thinking a PR that requires ARM review but has unaddressed breaking changes should have a (new) label NotReadyForARMReview which get removed when the breaking changes are approved.

Re implementation, the work I see to get this done includes:

and support for the new approval labels in:

Related doc:

@konrad-jamrozik
Copy link
Contributor

konrad-jamrozik commented Aug 25, 2023

TODO: I need to compile the final guidance from all the various conversations I had. Specifically see email thread with Subject: RE: Streamlining Breaking Change reviews and Re: "one pager" on Breaking Changes for CTO Forum.

@konrad-jamrozik
Copy link
Contributor

We will also add another label for issue with the tool, as explained in:

@konrad-jamrozik
Copy link
Contributor

We may also need a label like Approved-TentingInProduction. For context, see email thread Re: Requesting review and approval for the breaking change in Enterprise Billing Service.

@konrad-jamrozik
Copy link
Contributor

konrad-jamrozik commented Nov 20, 2023

There is a related idea of having BreakingChange-ToolingIssue mentioned in the email thread with subject RE: REST API Tools planning. In addition, there is more infor about preview from on the thread Re: API versions: What is the policy for releasing updated private or public API preview when there is no GA?.

@konrad-jamrozik
Copy link
Contributor

konrad-jamrozik commented Dec 5, 2023

To elaborate on "self-served" based on my chat with Jeff R.: appropriate message should be added to the PR, so that PR author knows they can add appropriate comment, resulting in appropriate approval, like e.g. label or suppression file, thus unblocking the PR.

The implementation of such flow must play well with:

@konrad-jamrozik
Copy link
Contributor

konrad-jamrozik commented Jan 3, 2024

@mikekistler @JeffreyRichter after my chat with @weshaggard we have few questions.

First off, here is the more recent proposal from the email thread with subject RE: Streamlining Breaking Change reviews:

=== Proposal start ===

A PR showing NewApiVersionRequired means a non-breaking change to an existing api-version:

  • If version is in private preview; SameApiVersion-Approved (benign); service can auto-approved
  • If updating contract to match service behavior ; SameApiVersion-BugFix; service can auto-approve
  • Attempt to change service without changing api-version: SameApiVersion-NewVersionRequired

A PR showing BreakingChangeReviewRequired means a breaking change introduced to an existing or new api-version:

  • BreakingChange-NoCustomers: Explicitly allowed due to 0 or few customers
  • BreakingChange-Benign; Constraint change, introduce fault value, tooling shortcoming
  • BreakingChange-BugFix; service can auto-approve
  • BreakingChange-Approved; Board reviewed & explicitly OK'd, requires CXP notification & 3 year policy
  • If PR is closed & this label persists, then breaking was disapproved or moot

In PR, have a comment with 2 (or 3) links:

  • If NewApiVersionRequired label: If this PR is to a version in PRIVATE preview, then click here to add the SameApiVersion-Approved label
  • If NewApiVersionRequired label: If this PR is ONLY to correct the swagger to match existing service version behavior, click here to add the SameApiVersion-BugFix label
  • If BreakingChangeReviewRequired label: If this PR is ONLY to correct the swagger to match existing service version behavior, click here to add the BreakingChange-BugFix label

=== Proposal end ===

Question 1: handling auto-approval (self-service)

I see some scenarios can be auto-approved, but some can not. From the implementation point of view, we cannot clearly delineate these cases. For example, if the breaking change is a bugfix then the automation cannot know "ah, this is a bugfix, so the PR author can auto-approve". All the automation knows it is a breaking change, and all we can do is to either allow to auto-approve all the breaking changes [1] or none of the breaking changes. Similar with non-breaking changes to given API version (like a optional property addition). Either we allow to auto-approve all the non-breaking same-API-version changes or none of them.

Knowing that, do you still want for us to implement the auto-approval / self-service capability?

Question 2: abusing auto-approval by always snapping API spec to service implementation

If we allow people to auto-approve changes to given API version to match service behavior, they can abuse it in the following way:

  1. Starting situation: both API definition and service behavior match.
  2. An engineer changes the service behavior introducing a change, so now the API spec is out of sync with it.
  3. They update the API spec and auto-approve, because "the API spec was not matching the service behavior".

This scenario kind of completely circumvents the system, and yet is not obviously wrong in any way.

Given this, do you still want for us to allow people to auto-approve same-API version changes under this pretense?

Question 3: clarification of breaking change and same-API version change

Is the following correct?

  • "Same API version change" means: this change is allowed, but you must create a new API version for it [2]. E.g. new property.

  • "Breaking change" means: this change is not allowed, not even in a new API version [3]. E.g. a property removal. If it really has to happen, it must follow the strict approval process and the deprecation policy must be followed.

Footnotes

[1]: For example, if there would be BreakingChangeReviewRequired, the PR author could always add BreakingChange-BugFix and the automation has no way of determining this is actually not a bugfix, but other kind of breaking change that cannot be auto-approved.

[2]: Except the same API version changes that are allowed by the special cases as listed in the proposal.

[3]: Except the breaking changes that are allowed by the special cases as listed in the proposal.

@mikekistler
Copy link
Member Author

This is a good summary but I have a few minor tweaks:

  • I find that many teams do not have a clear understanding of what "private preview" means. I’ve had to explain this often enough that I captured my explanation so I don’t have to recreate it every time:

    An API version is "in preview” when it has been made available to customers, either under AFEC control ("private preview") or open to all customers ("public preview").
    If an API version is not available to customers, it isn’t in preview – it is still in development.

    So before a team self-approves "SameApiVersion" because they claim it is in private preview, we need to make sure they are really in private preview.

  • Following on the previous point, I think we need a separate label for the case of "testing in a production branch". This happens all the time and we have to tell teams that they should not be doing this, but they still do. In some cases it is justified but ARM is working on new support that should make it possible to test effectively without merging to RPSaaSMaster.

    Some suggestions for this label: "SameApiVersion-InTest", "SameApiVersion-NotReleased".

  • Likewise, I think we need a "BreakingChange-NotReleased" label, and maybe this can replace the "BreakingChange-NoCustomers" label.

  • Please don’t use the term "swagger" in the text of the PR comments — use "API defintion" or something similar. "Swagger" is a trademark of the SmartBear company and while I know many at Microsoft use it as a synonym for "API definition" I think we should set an example of using correct terminology particularly in public spaces such as the azure-rest-api-specs repo.

    https://swagger.io/blog/api-strategy/difference-between-swagger-and-openapi/

As for your questions:

I think we still want the self-service approval.

Yes, it should be offered on every PR with a "NewApiVersionRequired" or "BreakingChangeReviewRequired" label.

And yes, it could easily be abused, but

  • We chose the "BugFix" label intentionally in hopes that service teams would hesitate to use it unless appropriate.
  • We can and should monitor self-approvals to watch for abuse
  • We can always disable self-approvals if we determine that abuse is too common.

Regarding your second question, we should clarify what qualifies as a "BugFix". A "BugFix" means that the behavior of the service at the time of the original GA of the feature was not correctly described in the API definition. Another way to say this is that the API definition does not correctly describe the service behavior at any point in time.

@konrad-jamrozik
Copy link
Contributor

konrad-jamrozik commented Jan 4, 2024

@mikekistler thank you! I consolidated my understanding of all the requirements for this work and proposed a new convention for label naming. Can I kindly ask you to review the document and provide your input? I want to get your sign-off before I start the implementation work to avoid rework.

Link to the doc.

CC @JeffreyRichter @weshaggard @heaths

@konrad-jamrozik
Copy link
Contributor

konrad-jamrozik commented Jan 5, 2024

Per our chat on 1/4/2024, we concluded that for now we will implement the support for the labels and perhaps add additional support for suppressions later. Details here:

I also updated the issue description with an implementation plan:

@mikekistler
Copy link
Member Author

I left comments in the doc and also tried to summarize my (loosely held) opinions on label names and descriptions in this gist.

@konrad-jamrozik
Copy link
Contributor

Per our chat today:

  • The contents of the gist given here are the label names we agreed on.
  • The BreakingChangeReviewRequired label should be decommissioned. Now the label BreakingChange will have mean the review is required.
    • As a result, if there is a PR which has a failing breaking change check but it doesn't require a review (e.g. it is a PR to a dev branch) then it will no longer get the BreakingChange label.
  • We cannot assume if the PR authors have the permissions to add labels or not. We probably will want to support both cases, giving them an option to add a label directly, or via relevant comment.
    • If a PR author adds a label they shouldn't, the automation should remove it. E.g. we could check if the label was added by a person belonging to relevant security group and if not, immediately remove it.

@konrad-jamrozik
Copy link
Contributor

konrad-jamrozik commented Jan 17, 2024

There have been few more updates. Newest proposals:

Note the documents above are parts of the suppressions logic discussion. All relevant docs:

@konrad-jamrozik
Copy link
Contributor

konrad-jamrozik commented Jan 30, 2024

@konrad-jamrozik
Copy link
Contributor

  • Pull Request 529441: Bugfix: the new BreakingChange-Approval-* labels now will be recognized by automation managing ARM review
  • Pull Request 528285: Make automation add VersioningReviewRequired instead of NewApiVersionRequired. Assorted refactoring and new breaking changes types for labels.
  • Pull Request 532621: Add support to workflow-bot for matching labels by prefix.

@konrad-jamrozik
Copy link
Contributor

Pull Request 9668189: Updated guidance around breaking changes process and relevant labels | Updated file(s) from Engineering Hub for content: Service Lifecycle & Actions Team

@konrad-jamrozik
Copy link
Contributor

Related work:

  • Pull Request 9694050: Clarify that breaking changes are not allowed in private preview. Updated file(s) from Engineering Hub for content: Azure Resource Manager (ARM)

@konrad-jamrozik
Copy link
Contributor

Doc capturing info from this thread, and more:

@konrad-jamrozik
Copy link
Contributor

  • Pull Request 953: Forward specs dir layout article to specs repo article; add links to APEX breaking changes; update AutoRest links

@konrad-jamrozik
Copy link
Contributor

@mikekistler @JeffreyRichter @weshaggard overall this was tons of work, mostly in writing, updating and clarifying the relevant docs. I am waiting for Shashank to update relevant APEX wiki about breaking changes intake process (see here). But my part of the work is finally done! Migration to new labels is done, code is done and deployed, etc.

Closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Breaking Changes Improvements to Breaking Changes tooling Central-EngSys This issue is owned by the Engineering System team. Spec PR Tools Tooling that runs in azure-rest-api-specs repo.
Projects
Archived in project
Status: 🎊 Closed
Development

No branches or pull requests

2 participants