Make generate_release_notes.py much faster #5001
Right now `generate_release_notes.py` takes half an hour or so to query the 192 relevant PRs from the GitHub website. That's really bad, and I am sure we can do better. Indeed, using the `gh` command line tool I can easily execute the relevant query in about 5 seconds, producing JSON that's not so far from what we need. Note that unlike our existing script, I am using the `merged` filter / `mergedAt` property (instead of `closed` / `closedAt`), which reduces the number of matches server side; and then I also tell GitHub to filter out anything with a certain label. This alone does not explain the several orders of magnitude difference in performance, but both changes do contribute. (Actually, I first run this with a smaller limit to find out how many matches there are, then set the limit high enough to get all of them.)
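The exact `gh` invocation is not preserved in this copy of the issue; a minimal sketch of what it could look like (the repository, date, label, and limit below are assumptions, not the original values):

```sh
# Sketch only: repo, date, label and limit are assumptions, not the original values.
# List merged PRs as JSON, excluding one label server side, and cache the result.
gh pr list --repo gap-system/gap --limit 250 \
    --search 'is:merged merged:>=2021-09-02 -label:"release notes: not needed"' \
    --json number,title,closedAt,mergedAt,labels > prscache-gh.json
```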
Comments

The following Python code converts the output of the above `gh` query:

```python
#!/usr/bin/env python3
import json

def main():
    # load the raw JSON produced by the gh query
    with open("prscache-gh.json", "r") as read_file:
        prs = json.load(read_file)

    # re-key the list of PRs by PR number, keeping only the fields we need
    new_prs = dict()
    for pr in prs:
        new_prs[str(pr["number"])] = {
            "title": pr["title"],
            "closed_at": pr["closedAt"],
            #"merged_at": pr["mergedAt"],
            "labels": [x["name"] for x in pr["labels"]],
        }

    # write out the converted PR cache
    with open("prscache.json", "w", encoding="utf-8") as f:
        json.dump(new_prs, f, ensure_ascii=False, indent=4)

if __name__ == "__main__":
    main()
```
We can do this ourselves using GraphQL. Here is a basic query that essentially gets the data we want (I am sure it could be improved further):
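The query itself did not survive in this copy of the issue; as a rough sketch (the repository name, search qualifiers, and page size are assumptions based on the GitHub GraphQL schema), it could look like this:

```graphql
{
  search(query: "repo:gap-system/gap is:pr is:merged merged:>=2021-09-02 -label:\"release notes: not needed\"", type: ISSUE, first: 100) {
    issueCount
    nodes {
      ... on PullRequest {
        number
        title
        mergedAt
        labels(first: 20) {
          nodes {
            name
          }
        }
      }
    }
  }
}
```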
One can experiment with it interactively via https://docs.github.com/en/graphql/overview/explorer. Here are some ways one can easily post this from Python: https://stackoverflow.com/questions/45957784/. Of course some more work is required to properly deal with pagination; and perhaps we will also want to (need to) use a GitHub token. But it's a good start, I think.
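Along those lines, a minimal sketch of posting such a query from Python with the `requests` library, assuming a personal access token in a `GITHUB_TOKEN` environment variable (the query string and repository name are assumptions, and pagination is omitted):

```python
#!/usr/bin/env python3
import json
import os

import requests

GRAPHQL_URL = "https://api.github.com/graphql"

# Assumed query; see the sketch above. Pagination handling is omitted.
QUERY = """
{
  search(query: "repo:gap-system/gap is:pr is:merged", type: ISSUE, first: 100) {
    issueCount
    nodes {
      ... on PullRequest { number title mergedAt }
    }
  }
}
"""

def run_query(query):
    # GitHub's GraphQL endpoint requires an authenticated POST request;
    # we assume a token is available in the GITHUB_TOKEN environment variable
    token = os.environ["GITHUB_TOKEN"]
    response = requests.post(
        GRAPHQL_URL,
        json={"query": query},
        headers={"Authorization": f"bearer {token}"},
    )
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    print(json.dumps(run_query(QUERY), indent=4))
```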
For minor releases, we want a slightly different query: we also want to require … . Also, the start date can be computed from the previous tag, which is e.g. … . All these could be derived from the new version: i.e., it should suffice to say something like … .
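Purely as a hypothetical illustration of that last point (the function name, version scheme, tag format, and label name below are all assumptions, not taken from the issue), deriving the query inputs from the new version might look like this:

```python
# Hypothetical sketch only: the tag format and label scheme are assumptions.
def derive_query_params(new_version: str):
    """Derive the previous tag and an extra label filter from e.g. "4.11.2"."""
    major, minor, patchlevel = map(int, new_version.split("."))
    if patchlevel == 0:
        # first release in a series: start from the previous series
        prev_tag = f"v{major}.{minor - 1}.0"
        extra_filter = ""
    else:
        # minor (patch) release: start from the previous patch release and
        # additionally require a suitable backport label
        prev_tag = f"v{major}.{minor}.{patchlevel - 1}"
        extra_filter = f' label:"backport-to-{major}.{minor}"'
    return prev_tag, extra_filter

# Example: derive_query_params("4.11.2") -> ("v4.11.1", ' label:"backport-to-4.11"')
```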