Sometimes it's useful to store issues / PRs / etc if you want to analyze them later. This wouldn't be useful for generating changelogs (since you want to make sure you've got the latest activity for those) but it could be useful for generating datasets that one can analyze with, e.g., https://github.com/choldgraf/jupyter-activity-snapshot.
Perhaps this could keep a cache folder in ~/data_github_activity that would keep this data over time. A few points / questions:
It could either be a single CSV file for all the data, a couple of CSV files for different types of data (e.g., issues.csv, prs.csv, comments.csv), or sub-folders for different GitHub orgs/repos
When new data is downloaded, it could do simple joins on these CSV files and then drop the duplicates based on the unique ID of that item
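To make the merge-and-deduplicate idea concrete, here is a minimal sketch using only the standard library. The function name `merge_into_cache` and the `id` column are assumptions for illustration, not part of github-activity; it also assumes that a newly downloaded row should replace a cached row with the same ID.

```python
import csv
from pathlib import Path


def merge_into_cache(cache_path, new_rows, id_field="id"):
    """Merge new_rows into the CSV at cache_path, keeping one row per ID.

    Hypothetical helper: `id_field` is assumed to be a column that
    uniquely identifies an issue / PR / comment.
    Returns the number of rows in the cache after merging.
    """
    cache_path = Path(cache_path)
    merged = {}

    # Load whatever is already cached, keyed by the unique ID.
    if cache_path.exists():
        with cache_path.open(newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                merged[row[id_field]] = row

    # New rows overwrite cached rows with the same ID (newest data wins).
    # csv reads every value back as a string, so normalise to strings here.
    for row in new_rows:
        merged[str(row[id_field])] = {k: str(v) for k, v in row.items()}

    if not merged:
        return 0

    # Collect the union of column names, preserving first-seen order.
    fieldnames = []
    for row in merged.values():
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(merged.values())
    return len(merged)
```

Calling this once per data type (for example with an issues.csv path, then a prs.csv path) would cover the "couple of CSV files" layout; the single-file and sub-folder layouts would only change how `cache_path` is chosen.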
@consideRatio what do you think about this? Useful or unnecessary complexity?
Hmmm, I don't want to influence you much on this, as I represent a very specific need that is mainly about changelog generation, but I think it's not out of scope for the github-activity project to allow output to CSV or JSON etc., which are more suitable to process from disk than a markdown file.
I can imagine we could do some nice things from this. Perhaps putting out systematic metrics for releases that could be fun to look at between the projects etc.
How long since the last release, how many PRs, how large the PRs were, how many people contributed, whether that was an increase or decrease, etc., assuming we start to analyze the data more.