
Make generate_release_notes.py much faster #5001

Closed
fingolfin opened this issue Aug 17, 2022 · 3 comments · Fixed by #5613

fingolfin (Member) commented Aug 17, 2022

Right now generate_release_notes.py takes half an hour or so to query the 192 relevant PRs from the GitHub website.

That's really bad, and I am sure we can do better. Indeed, using the gh command line tool I can execute the relevant query in about 5 seconds, producing JSON that is not far from what we need. Note that unlike our existing script, I am using the merged filter / mergedAt property (instead of closed / closedAt), which reduces the number of matches server side, and I also tell GitHub to filter out anything carrying a certain label. These two changes alone do not explain the several orders of magnitude difference in performance, but they do contribute.

gh pr list --search 'merged:>=2019-09-09 -label:"release notes: not needed"' --json number,title,closedAt,labels,mergedAt --limit 200

(Actually, I first run this with a smaller limit to find out how many matches there are, then set the limit high enough to get all of them.)
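The two-step approach above can be automated: start with a modest limit and, if the result set fills it exactly, double the limit and retry until everything fits. A minimal sketch, assuming the gh CLI is installed; `fetch_merged_prs` and its injectable `run` parameter are illustrative names, not part of the existing script.

```python
import json
import subprocess


def fetch_merged_prs(since, limit=100, run=None):
    """Fetch merged PRs via `gh`, doubling the limit until all matches fit.

    `run` can be injected for testing; by default it invokes the real gh CLI.
    """
    if run is None:
        def run(limit):
            out = subprocess.run(
                ["gh", "pr", "list",
                 "--search", f'merged:>={since} -label:"release notes: not needed"',
                 "--json", "number,title,closedAt,labels,mergedAt",
                 "--limit", str(limit)],
                capture_output=True, check=True, text=True,
            ).stdout
            return json.loads(out)

    while True:
        prs = run(limit)
        # If we filled the limit exactly, there may be more matches: retry larger.
        if len(prs) < limit:
            return prs
        limit *= 2
```

This avoids having to eyeball the match count manually, at the cost of one extra query in the worst case.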

fingolfin (Member, Author) commented:

The following Python code converts the output of the above gh command into the format used by generate_release_notes.py (its prscache.json cache file):

#!/usr/bin/env python3
import json

def main():
    with open("prscache-gh.json", "r") as read_file:
        prs = json.load(read_file)

    # Re-key by PR number (as a string) and keep only the fields
    # that generate_release_notes.py expects.
    new_prs = dict()
    for pr in prs:
        new_prs[str(pr["number"])] = {
            "title": pr["title"],
            "closed_at": pr["closedAt"],
            #"merged_at": pr["mergedAt"],
            "labels": [x["name"] for x in pr["labels"]],
        }

    with open("prscache.json", "w", encoding="utf-8") as f:
        json.dump(new_prs, f, ensure_ascii=False, indent=4)


if __name__ == "__main__":
    main()

fingolfin (Member, Author) commented:

We can do this ourselves using GraphQL. Here is a basic query that essentially gets the data we want (I am sure it could be improved further):

{
  search(
    query: "repo:gap-system/gap merged:>=2019-09-09 -label:'release notes: not needed'"
    type: ISSUE
    last: 100
  ) {
    issueCount
    edges {
      node {
        ... on PullRequest {
          title
          number
          createdAt
          mergedAt
          labels(first: 10) {
            nodes {
              name
            }
          }
        }
      }
    }
  }
}

One can experiment with it interactively via the GraphQL Explorer: https://docs.github.com/en/graphql/overview/explorer

Here are some ways to POST such a query from Python: https://stackoverflow.com/questions/45957784/. Of course some more work is required to deal properly with pagination, and we may also want (or need) to use a GitHub token. But it's a good start, I think.
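A rough sketch of how this could look from Python using only the standard library, with pagination handled via the `pageInfo` / `endCursor` mechanism of the GraphQL search connection. The function names and the shape of `fetch_all` are illustrative, not part of the existing script; a real version would also need error handling and a valid token.

```python
import json
import urllib.request

GITHUB_GRAPHQL = "https://api.github.com/graphql"

SEARCH_QUERY = """
query($cursor: String) {
  search(
    query: "repo:gap-system/gap merged:>=2019-09-09 -label:'release notes: not needed'"
    type: ISSUE
    first: 100
    after: $cursor
  ) {
    issueCount
    pageInfo { hasNextPage endCursor }
    nodes {
      ... on PullRequest {
        title
        number
        mergedAt
        labels(first: 10) { nodes { name } }
      }
    }
  }
}
"""


def extract_page(data):
    """Pull the PR nodes and pagination info out of one GraphQL response."""
    search = data["data"]["search"]
    page = search["pageInfo"]
    return search["nodes"], page["hasNextPage"], page["endCursor"]


def fetch_all(token):
    """Page through the search results until hasNextPage is false."""
    cursor, prs = None, []
    while True:
        payload = json.dumps(
            {"query": SEARCH_QUERY, "variables": {"cursor": cursor}}
        ).encode()
        req = urllib.request.Request(
            GITHUB_GRAPHQL,
            data=payload,
            headers={"Authorization": f"bearer {token}",
                     "Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            data = json.load(resp)
        nodes, more, cursor = extract_page(data)
        prs.extend(nodes)
        if not more:
            return prs
```

Compared with the interactive query above, this variant uses `first`/`after` instead of `last` so that the cursor can be threaded through successive requests.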

fingolfin (Member, Author) commented:

For minor releases, we want a slightly different query: we also want to require label:"backport-to-4.12-DONE".

Also, the start date can be computed from the previous tag: for a minor release such as v4.12.1 that is the previous release tag (here v4.12.0), while for a major release such as v4.12.0 the tag to use is v4.13dev (or alternatively: git merge-base stable-4.12 master).

All of these could be derived from the new version: it should suffice to say something like "release 4.12.1" and the script should automatically determine all relevant tags, dates, queries, etc.
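The derivation rules above can be sketched as two small helpers. This follows exactly the scheme described in this comment (previous release tag for a minor release, the vX.(Y+1)dev tag for a major one, plus the backport label for minor releases); the function names are hypothetical.

```python
def previous_tag(version):
    """Derive the comparison tag from a release version string "X.Y.Z".

    Minor release (Z > 0): compare against the previous release tag vX.Y.(Z-1).
    Major release (Z == 0): compare against the vX.(Y+1)dev tag.
    """
    major, minor, patch = (int(p) for p in version.split("."))
    if patch > 0:
        return f"v{major}.{minor}.{patch - 1}"
    return f"v{major}.{minor + 1}dev"


def extra_search_filter(version):
    """Extra search term needed for minor releases only."""
    major, minor, patch = (int(p) for p in version.split("."))
    if patch > 0:
        return f'label:"backport-to-{major}.{minor}-DONE"'
    return ""
```

With these, saying "release 4.12.1" is enough to obtain both the base tag v4.12.0 and the backport label filter.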
