Caching issue/PR data #18

choldgraf · 2019-12-01T17:42:36Z

Sometimes it's useful to store issues / PRs / etc if you want to analyze them later. This wouldn't be useful for generating changelogs (since you want to make sure you've got the latest activity for those) but it could be useful for generating datasets that one can analyze with, e.g., https://github.com/choldgraf/jupyter-activity-snapshot.

Perhaps this could keep a cache folder in ~/data_github_activity that would keep this data over time. A few points / questions:

It could either be a single CSV file for all the data, a couple of CSV files for different types of data (e.g., issues.csv, prs.csv, comments.csv), or sub-folders for different GitHub orgs/repos
When new data is downloaded, it could do a simple join with these CSV files and then drop the duplicates based on the unique ID of each item
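
As a rough illustration of the merge-and-deduplicate idea above, here is a minimal sketch using only the Python standard library. The function name `update_cache`, the `id` column, and the file layout are assumptions made for the example, not part of github-activity. It assumes that newly downloaded rows should win over cached rows with the same ID.

```python
import csv
from pathlib import Path


def update_cache(cache_path, new_rows, id_field="id"):
    """Merge new_rows into the CSV at cache_path, keeping one row per ID.

    Hypothetical helper: `id_field` is assumed to hold the unique ID of
    each issue / PR / comment.
    """
    cache_path = Path(cache_path)
    merged = {}

    # Load whatever was cached by earlier runs, if anything.
    if cache_path.exists():
        with cache_path.open(newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                merged[row[id_field]] = row

    # Newly downloaded rows overwrite cached rows that share the same ID.
    for row in new_rows:
        merged[str(row[id_field])] = {k: str(v) for k, v in row.items()}

    if not merged:
        return []

    # Union of all columns in first-seen order, so added columns are kept.
    fieldnames = list(dict.fromkeys(k for row in merged.values() for k in row))

    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(merged.values())
    return list(merged.values())
```

With the sub-folder layout, a call could look like `update_cache(Path.home() / "data_github_activity" / "org" / "repo" / "issues.csv", rows)`, where `rows` is a list of dictionaries from the latest download.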

@consideRatio what do you think about this? Useful or unnecessary complexity?


consideRatio · 2019-12-01T17:56:32Z

Hmmm, I don't want to influence you much on this, as I represent a very specific need (mainly changelog generation), but I think it's not out of scope for the github-activity project to allow output to CSV, JSON, etc., which are more suitable to process from disk than a Markdown file.

I can imagine we could do some nice things with this. Perhaps we could put out systematic metrics for releases that would be fun to compare between projects.

How long since the last release, how many PRs, how large the PRs were, how many people contributed, whether that was an increase or a decrease, etc., assuming we start to analyze data more.
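
A minimal sketch of how such per-release metrics might be computed from cached PR records. Everything here is hypothetical: the function `release_metrics` and the record fields `author`, `merged_at`, `additions`, and `deletions` are assumed for illustration, and PR size is approximated as lines added plus lines deleted.

```python
from datetime import datetime

# Timestamp format assumed to match GitHub-style UTC strings,
# e.g. "2019-12-01T17:42:36Z".
GITHUB_TIME = "%Y-%m-%dT%H:%M:%SZ"


def release_metrics(prs, previous_release, this_release):
    """Summarize the PRs merged between two release timestamps."""
    start = datetime.strptime(previous_release, GITHUB_TIME)
    end = datetime.strptime(this_release, GITHUB_TIME)

    # Keep PRs merged after the previous release, up to and including this one.
    in_release = [
        pr for pr in prs
        if start < datetime.strptime(pr["merged_at"], GITHUB_TIME) <= end
    ]
    sizes = [int(pr["additions"]) + int(pr["deletions"]) for pr in in_release]

    return {
        "days_since_last_release": (end - start).days,
        "n_prs": len(in_release),
        "mean_pr_size": sum(sizes) / len(sizes) if sizes else 0.0,
        "n_contributors": len({pr["author"] for pr in in_release}),
    }
```

Comparing the result for two consecutive releases would show whether each number went up or down.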

choldgraf mentioned this issue Feb 20, 2020

Extend GraphQL request schema to pull statistics for PRs #22

Open
