Proposal: JSON Schema Ecosystem Metrics #518

Closed
benjagm opened this issue Nov 2, 2023 · 16 comments
Assignees
Labels
✨ Enhancement Indicates that the issue suggests an improvement or new feature. Status: Do not close This is a long term issue with dependant issues. This label prevent it to be closed automatically. Status: In Progress This issue is being worked on, and has someone assigned.

Comments

@benjagm
Collaborator

benjagm commented Nov 2, 2023

Background and Rationale

JSON Schema is a fundamental technology that is widely used across the industry. However, its usage and adoption are hard to measure because it is not a tool or a service; it is a specification with hundreds of implementations and countless use cases. This difficulty in measuring the ecosystem can limit our ability to build trust and to attract sponsors and partners.

This proposal aims to define a framework for collecting, analyzing, and reporting relevant Ecosystem metrics.

Proposal

  • GitHub projects using JSON Schema topics.

    • Total number of projects.
    • Total number of stars.
    • Total number of contributors.
    • Total number of forks.
    • Total number of dependent projects.
  • GitHub projects using any of the implementations listed on the implementers page.

    • Total number of projects.
    • Total number of stars.
    • Total number of contributors.
    • Total number of forks.
    • Total number of dependent projects.
  • Data of the Top 5 projects by language.

  • We will add to this the adopters that are self-reporting in our adopters file.

  • It would be great to be able to group the results by programming language.
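As a rough illustration of the grouping idea, here is a minimal sketch that sums projects, stars, and forks per language and picks the top projects for one language. It assumes each record looks like a repository object from the GitHub REST API (`full_name`, `language`, `stargazers_count`, `forks_count`); the function names and the sample records are hypothetical, and ranking the "Top 5" by stars is an assumption, since the proposal does not say how they are ranked. Contributor and dependent counts are not part of the repository object, so they are left out here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def aggregate_by_language(repos: Iterable[dict]) -> Dict[str, Dict[str, int]]:
    """Sum project, star, and fork counts per primary language."""
    totals: Dict[str, Dict[str, int]] = defaultdict(
        lambda: {"projects": 0, "stars": 0, "forks": 0}
    )
    for repo in repos:
        # GitHub reports null for repositories with no detected language.
        language = repo.get("language") or "Unknown"
        bucket = totals[language]
        bucket["projects"] += 1
        bucket["stars"] += repo.get("stargazers_count", 0)
        bucket["forks"] += repo.get("forks_count", 0)
    return dict(totals)


def top_projects(repos: Iterable[dict], language: str, limit: int = 5) -> List[str]:
    """Return the names of the most-starred projects for one language."""
    matching = [r for r in repos if (r.get("language") or "Unknown") == language]
    matching.sort(key=lambda r: r.get("stargazers_count", 0), reverse=True)
    return [r["full_name"] for r in matching[:limit]]


# Hypothetical sample records, not real measurements.
sample = [
    {"full_name": "example/a", "language": "Python", "stargazers_count": 20, "forks_count": 2},
    {"full_name": "example/b", "language": "Python", "stargazers_count": 10, "forks_count": 1},
    {"full_name": "example/c", "language": None, "stargazers_count": 5, "forks_count": 0},
]
print(aggregate_by_language(sample))
print(top_projects(sample, "Python"))
```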

Implementation Plan

We plan to implement these metrics by:

  • Developing a set of data collection tools and scripts using the GitHub API.
  • Regularly reporting the metrics to the community through a dedicated dashboard or report.
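One possible shape for the collection script, sketched under assumptions: it pages through the GitHub repository search endpoint for a topic and gathers the returned items. The helper names (`build_search_url`, `fetch_json`, `collect_topic_repos`) and the injectable `fetch` parameter are hypothetical choices made so the paging logic can be exercised without network access. The search API returns at most 1,000 results per query, which is why the sketch stops after ten pages of 100.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, List, Optional

SEARCH_API = "https://api.github.com/search/repositories"
PER_PAGE = 100


def build_search_url(topic: str, page: int = 1) -> str:
    """Build the search URL for repositories tagged with a topic."""
    query = urllib.parse.urlencode(
        {"q": f"topic:{topic}", "per_page": PER_PAGE, "page": page}
    )
    return f"{SEARCH_API}?{query}"


def fetch_json(url: str, token: Optional[str] = None) -> dict:
    """Fetch one page from the API; a token raises the rate limit."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def collect_topic_repos(
    topic: str,
    fetch: Callable[[str], dict] = fetch_json,
    max_pages: int = 10,
) -> List[dict]:
    """Page through search results until a short page or the page cap."""
    repos: List[dict] = []
    for page in range(1, max_pages + 1):
        items = fetch(build_search_url(topic, page)).get("items", [])
        repos.extend(items)
        if len(items) < PER_PAGE:
            break  # a short page means there is nothing left to fetch
    return repos
```

Calling `collect_topic_repos("json-schema")` would then return a list of repository objects that can be fed into whatever aggregation step follows.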

Some ideas of similar projects:

We invite the community to participate in the discussion and contribute to this effort. Your feedback and collaboration are essential to the success of this initiative.

Collaboration and Volunteers

If you're interested in collaborating on this project or have skills in data analysis, tool development, or project management, please express your interest in the comments. We welcome all forms of collaboration and support.

This issue serves as a starting point for discussion and collaboration. Let's work together to define and implement ecosystem metrics for JSON Schema that will benefit the entire open-source community.

@Relequestual Relequestual self-assigned this Nov 3, 2023
@Julian
Member

Julian commented Nov 6, 2023

In case it helps, the hack for getting the number of contributors is to set per_page to 1 and then read the number of the last page from the Link header.

E.g. here with the CLI for the first result in the topic, with some silly Link header parsing:

⊙  gh api 'https://api.github.com/repos/tiangolo/fastapi/contributors?per_page=1' --include -X HEAD | rg -o '<.*page=(\d+)>; rel="last"' -r '$1'
464
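The same Link header trick could be sketched in Python roughly as follows. The function name `count_from_link_header` is hypothetical; it only parses a header value that has already been fetched with `per_page=1`, so the page number marked `rel="last"` equals the number of items. When the response fits on a single page there is no `rel="last"` entry, and the function returns `None` so the caller can count the response body instead.

```python
import re
from typing import Optional

# Matches the page number inside the URL that is marked rel="last".
# "[?&]page=" avoids matching the "per_page" parameter by mistake.
LAST_PAGE = re.compile(r'<[^>]*[?&]page=(\d+)[^>]*>;\s*rel="last"')


def count_from_link_header(link_header: Optional[str]) -> Optional[int]:
    """Return the last page number from a Link header, if present."""
    if not link_header:
        return None
    match = LAST_PAGE.search(link_header)
    return int(match.group(1)) if match else None


# Header shaped like the one GitHub returns; the repository id is made up.
example = (
    '<https://api.github.com/repositories/1/contributors?per_page=1&page=2>; rel="next", '
    '<https://api.github.com/repositories/1/contributors?per_page=1&page=464>; rel="last"'
)
print(count_from_link_header(example))
```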

Obviously some other stuff is available in the CLI too.

⊙  gh repo view json-schema-org/json-schema-spec --json forkCount,stargazerCount,watchers
{
  "forkCount": 261,
  "stargazerCount": 2959,
  "watchers": {
    "totalCount": 101
  }
}

I'll look to see about the rest of the data, a bunch more should be fairly easily retrievable.

@benjagm
Collaborator Author

benjagm commented Nov 6, 2023

With all the info Julian provided, the only pending element is the dependent projects. That will require web scraping or another tool such as https://pypi.org/project/github-dependents-info/, because the API does not provide it.

This is the information I am looking for but aggregated:
https://github.com/ajv-validator/ajv/network/dependents

@Julian
Member

Julian commented Nov 6, 2023

It does indeed initially seem like that dependency data isn't available via API... :/

@jdesrosiers
Member

Do we know how "dependents" is calculated? It might be more or less useful depending on how it works. For example, ajv is used by eslint. Would every project that uses eslint be considered a dependent? That would create a lot of noise making it not a very useful metric.

@Julian
Member

Julian commented Nov 6, 2023

AFAIK how it's calculated is language/tooling specific (e.g. in Python there was a long running issue about supporting pyproject.toml files which affected numbers IIRC). That one was finally fixed a few months ago. So what's in there is best-current-effort essentially on what GitHub supports. Further detail I'm sure is in the GH Docs.

IME it's indeed very noisy and not very useful, I essentially never look at it for my own repos -- but all this obviously depends on what question someone's trying to answer. But it's what's there, and I assumed the hope was maybe we could find some signal in the noise anyhow.

@Relequestual
Member

I've created a repo for this work https://github.com/json-schema-org/ecosystem

@benjagm benjagm added the ✨ Enhancement Indicates that the issue suggests an improvement or new feature. label Nov 22, 2023
@Relequestual
Member

> Do we know how "dependents" is calculated? It might be more or less useful depending on how it works. For example, ajv is used by eslint. Would every project that uses eslint be considered a dependent? That would create a lot of noise making it not a very useful metric.

I think we can distinguish projects that depend on an implementation directly from those that only pull it in as a transitive dependency.
That might actually be pretty interesting, and we could see whether it differs across languages.
Although, I suspect JavaScript/Node.js has deeper dependency trees than most other languages.
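To make the direct versus transitive distinction concrete, here is a small sketch over a dependency map. It assumes we already have, for each project, the set of packages it depends on directly; how to obtain that map is a separate problem. The function name and the `my-app` and `other` entries are hypothetical, while the ajv and eslint relationship is the example from the question above.

```python
from typing import Dict, Set, Tuple


def classify_dependents(
    dependencies: Dict[str, Set[str]], target: str
) -> Tuple[Set[str], Set[str]]:
    """Split the dependents of target into direct and transitive sets."""
    direct = {name for name, deps in dependencies.items() if target in deps}
    transitive: Set[str] = set()
    for project in dependencies:
        if project == target or project in direct:
            continue
        # Depth-first walk through this project's dependency tree.
        stack = list(dependencies.get(project, ()))
        seen: Set[str] = set()
        while stack:
            dep = stack.pop()
            if dep in seen:
                continue
            seen.add(dep)
            if dep == target:
                transitive.add(project)
                break
            stack.extend(dependencies.get(dep, ()))
    return direct, transitive


graph = {
    "eslint": {"ajv"},      # depends on ajv directly
    "my-app": {"eslint"},   # hypothetical: reaches ajv only through eslint
    "other": {"lodash"},    # hypothetical: unrelated project
}
print(classify_dependents(graph, "ajv"))
```

Counting only the first set would avoid the noise of every eslint user showing up as an ajv dependent.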

@Relequestual
Member

The first three Issues in the ecosystem repo cover:

  • Repos that use the json-schema topic over time
  • ...and their stars and forks over time
  • ...and their contributions over time

Initially, we will look to get current data and ongoing data on a weekly basis.
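A weekly collection could be stored as simply as one CSV row per run. The sketch below is an assumption about how that might look, not the code in the ecosystem repo: `append_snapshot` adds a dated row and `weekly_changes` returns the difference between consecutive rows. The column names and file layout are hypothetical.

```python
import csv
from datetime import date
from pathlib import Path
from typing import Dict, List

FIELDS = ["date", "repos", "stars", "forks"]


def append_snapshot(path: Path, day: date, repos: int, stars: int, forks: int) -> None:
    """Append one dated row, writing the header if the file is new."""
    write_header = not path.exists()
    with path.open("a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerow(
            {"date": day.isoformat(), "repos": repos, "stars": stars, "forks": forks}
        )


def weekly_changes(path: Path) -> List[Dict[str, int]]:
    """Return the change in each metric between consecutive snapshots."""
    with path.open(newline="") as handle:
        rows = list(csv.DictReader(handle))
    changes = []
    for previous, current in zip(rows, rows[1:]):
        changes.append(
            {field: int(current[field]) - int(previous[field]) for field in FIELDS[1:]}
        )
    return changes
```

A scheduled job could call `append_snapshot(Path("snapshots.csv"), date.today(), ...)` once a week with the totals it just collected.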

@benjagm benjagm added the Status: In Progress This issue is being worked on, and has someone assigned. label Feb 24, 2024
@benjagm
Collaborator Author

benjagm commented Feb 24, 2024

> The first three Issues in the ecosystem repo cover:

Would you like help with those issues?

@Relequestual
Member

> The first three Issues in the ecosystem repo cover:

> Would you like help with those issues?

Yes. I think I should commit the work I have done so far on a branch. I probably should not spend much more time on this. I will spend a little more time. I think it is working but I need to run it and wait for it to complete.

@Relequestual
Member

I have pushed some code to https://github.com/json-schema-org/ecosystem/tree/main/projects/initial-data
It runs and produces results, but it has limitations that need addressing.
So far, the code only gathers initial data, and not ongoing data via actions.

@Relequestual
Member

> The first three Issues in the ecosystem repo cover:

> Would you like help with those issues?

I've added some new Issues and details to existing Issues where required.
See json-schema-org/ecosystem#1
Help now welcome. Feel free to communicate this to anyone hungry to contribute =]

@aialok
Collaborator

aialok commented Mar 21, 2024

Hey maintainers! I am really interested in this project. It looks cool to me, especially working with APIs and backend stuff. I would love to work on this with you. Currently, I am researching and checking out the codebases. I will ping you once I have some good work to show. I am ready to work on these issues: https://github.com/json-schema-org/ecosystem/issues.

@benjagm
Collaborator Author

benjagm commented Apr 9, 2024

Should we close this issue to continue the work in the ecosystem repo?


Hello! 👋

This issue has been automatically marked as stale due to inactivity 😴

It will be closed in 180 days if no further activity occurs. To keep it active, please add a comment with more details.

There can be many reasons why a specific issue has no activity. The most probable cause is a lack of time, not a lack of interest.

Let us figure out together how to push this issue forward. Connect with us through our Slack channel: https://json-schema.org/slack

Thank you for your patience ❤️

@github-actions github-actions bot added the Status: Stale It's believed that this issue is no longer important to the requestor. label Jun 23, 2024
@benjagm benjagm added Status: Do not close This is a long term issue with dependant issues. This label prevent it to be closed automatically. and removed Status: Stale It's believed that this issue is no longer important to the requestor. labels Jun 23, 2024
@benjagm
Collaborator Author

benjagm commented Jun 23, 2024

Let's continue the discussion in the Ecosystem repo. Thank you all!

https://github.com/json-schema-org/ecosystem

@benjagm benjagm closed this as completed Jun 23, 2024

5 participants