Feature: Change how scores are displayed #2466

Open
pnacht opened this issue Nov 18, 2022 · 9 comments
Labels
kind/enhancement New feature or request Stale

Comments

@pnacht
Contributor

pnacht commented Nov 18, 2022

Is your feature request related to a problem? Please describe.
There's a discrepancy between how good a given score actually is and how it feels. For example, a 7/10 feels like a passing grade at best, but it actually means a project is in the top ~10% of the most relevant projects (or the top ~1% of all projects).

A few maintainers have been surprised to hear that they're actually doing a good job when they get a good score.

twbs/bootstrap#37402 (comment):

That being said, 7.2 is not good enough either

numpy/numpy#22482 (comment):

The badge gives a number, 6.2 in our case. I'm not sure many people know how to interpret that number - it feels like a low score

Describe the solution you'd like
A score that feels as good as it actually is. My proposal would be to either replace or supplement the current final score (e.g. 7/10) with the corresponding quantile (top x%). The badge should also display the result as a quantile instead of (or alongside) the final score.

This would make everyone (maintainers and users) more accurately understand how solid a project's security posture is.

Even the top projects would have a better experience: I wager some users currently see urllib3's 9.3 and think "wow, that's pretty good, but still clearly needs to improve something!", when their actual understanding should be "wow, this is the most secure open-source project out there!"

Personally, I'd be in favor of the quantile simply supplementing the final score, precisely because (for example) urllib3 might be the most secure open-source project out there, but that missing 0.7 does also point out there's room for improvement. In simple terms:

  • the x/10 score should be more maintainer-facing, letting them know there's still work to be done
  • the quantile "score" should be more consumer-facing, letting them know how secure the project is, compared to its peers.

Additional context

A first issue may be that the histogram of project scores isn't very nuanced. It seems clear from the chart below that GitHub's defaults give projects a score of around 4.5/10 (charts obtained via the public BigQuery data), so the ~1 million projects analyzed by Scorecard can basically be categorized as either "did something to improve their security" (and are therefore in the "top ~1%" of projects) or "did something to weaken their security" (and are therefore in the "bottom ~1%" of projects).

[Figure: quantile plot for all projects analyzed by Scorecard]

However, if we focus on "important" projects, the chart becomes much more useful:

[Figure: quantile plot for the most relevant projects analyzed by Scorecard]

Naturally, this chart is heavily influenced by how we define "important". For the chart above, I defined it as projects with a criticality_score > 0.5. This choice was completely arbitrary, and just so happens to include ~10,000 projects. Whether this cutoff is appropriate or whether criticality_score is the best tool is naturally something that can (should!) be discussed as well.

It is also worth mentioning that this curve is an almost perfect sigmoid, and therefore calculating the quantile would be quite straightforward, though the equation parameters may need to be updated over time (hopefully due to improving scores across the open-source ecosystem!):

[Figure: comparison of the relevant projects' quantile plot and an estimated sigmoid]

(the vertical axis goes from -25 to 125 because the estimated curve goes slightly above 100 and below 0, but that should be easy to clamp)
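
For illustration, here is a minimal sketch (in Go) of how a fitted sigmoid could be turned into a "top x%" figure. The midpoint and steepness values below are made-up placeholders, not parameters actually fit to the BigQuery data, and the clamp keeps the estimate inside [0, 100] even if a fitted curve over- or undershoots slightly:

```go
package main

import (
	"fmt"
	"math"
)

// Placeholder parameters for the fitted logistic curve; real values would be
// fit against the BigQuery quantile data for "important" projects and would
// need refreshing over time as ecosystem scores shift.
const (
	midpoint  = 5.4 // hypothetical score at the 50th percentile
	steepness = 1.3 // hypothetical slope of the logistic at the midpoint
)

// percentile estimates the percentage of relevant projects scoring at or
// below the given aggregate score, clamped to [0, 100].
func percentile(score float64) float64 {
	p := 100 / (1 + math.Exp(-steepness*(score-midpoint)))
	return math.Min(100, math.Max(0, p))
}

func main() {
	for _, s := range []float64{4.5, 6.2, 7.2, 9.3} {
		fmt.Printf("%.1f/10 -> top %.0f%%\n", s, 100-percentile(s))
	}
}
```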

@pnacht pnacht added the kind/enhancement New feature or request label Nov 18, 2022
@laurentsimon
Contributor

I like the idea. @spencerschrock @azeemsgoogle @naveensrinivasan wdyt?

@di
Member

di commented Aug 17, 2023

Since completely replacing the X/10 score might be disruptive, we might want to explore supplementing these scores with a percentile, like:

  • 7/10 (90th percentile for this check)

@spencerschrock
Contributor

Since completely replacing the X/10 score might be disruptive, we might want to explore supplementing these scores with a percentile, like:

  • 7/10 (90th percentile for this check)

Is this for the badge, the results viewer, or the results themselves?

@di
Member

di commented Aug 18, 2023

I'd say anywhere we display an X/10 score, we should do this as well -- we should file separate issues for the results viewer/badge as necessary.

@pnacht
Contributor Author

pnacht commented Aug 21, 2023

I'm not sure how valuable quantiles are for individual checks, especially given how many checks are "binary" (0 or 10). I also suspect (without looking at any data) that the distributions will be heavily skewed/distorted, which might lead to less nuanced quantiles (i.e. only top 1% or top 99% quantiles).

In my initial proposal, I was actually only thinking of having quantiles for the final score, where we have a pretty reasonable ("normal-ish") distribution.

But yes, I'd then show these quantiles everywhere: the CLI output, the viewer, the badge.
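
As a rough sketch, assuming a percentile estimate like the sigmoid fit above, the supplemented display could come from a single helper shared by the CLI output, the viewer, and the badge text; the function name and wording here are only illustrative:

```go
// formatScore renders the aggregate score alongside its estimated quantile,
// e.g. "7.2 / 10 (top 8% of relevant projects)". percentile is the estimated
// percentage of relevant projects at or below this score; the format string
// is a placeholder, not a proposed final wording.
func formatScore(score, percentile float64) string {
	return fmt.Sprintf("%.1f / 10 (top %.0f%% of relevant projects)", score, 100-percentile)
}
```

The badge and CLI could then share this one string, while the bare x/10 stays available for maintainers who want to see the remaining gap.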

@github-actions

This issue is stale because it has been open for 60 days with no activity.

@github-actions

This issue is stale because it has been open for 60 days with no activity.

@raghavkaul
Contributor

The OpenSSF Best Practices badge uses "Passing", "Silver", and "Gold", which is easy to see at a glance. Libraries must pass all criteria at a level before moving on to the next level. A similar scheme for Scorecard might be: pass X probes for Silver, X + Y for Gold, etc.
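
A minimal sketch of what such a tier mapping could look like; the probe-count thresholds below are arbitrary placeholders for discussion, not proposed cutoffs:

```go
// Tier is a hypothetical badge level, loosely modeled on the OpenSSF Best
// Practices badge levels mentioned above.
type Tier string

const (
	TierInProgress Tier = "In Progress"
	TierPassing    Tier = "Passing"
	TierSilver     Tier = "Silver"
	TierGold       Tier = "Gold"
)

// tierFor maps the number of passing probes to a badge level. The thresholds
// are placeholders; the actual X and X + Y would need to be agreed on.
func tierFor(probesPassed int) Tier {
	switch {
	case probesPassed >= 25:
		return TierGold
	case probesPassed >= 18:
		return TierSilver
	case probesPassed >= 10:
		return TierPassing
	default:
		return TierInProgress
	}
}
```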

@github-actions

This issue has been marked stale because it has been open for 60 days with no activity.

@github-actions github-actions bot added the Stale label May 14, 2024