ELPP-3358 Choosing an open source component #9
### Licensing

[should we specify this in terms of an explicit list of acceptable licenses, or in terms of what we need a licence to allow us to do?]
considering we release everything with the MIT license, there is a compatibility issue (with GPL?)
an explanation on what it means to be 'compatible' with GPL licensed code: https://www.gnu.org/licenses/gpl-faq.en.html#WhatIsCompatible
and a list of compatible licences:
https://www.gnu.org/licenses/license-list.en.html#GPLCompatibleLicenses
there is a lot of argument and untested legal theorizing on edge cases in most software licences. The only perfectly clear cut legal standing, in my humble opinion, is "public domain". Even the legality of closed sourced proprietary code becomes fuzzy if it's not 100% written internally in the same country in the same decade.
also, there are different types of code usage, like linking vs bundling vs compiling. is a compiled program with gpl licensed code required to also be licenced by the GPL? (yes). what if it's unmodified and uncompiled, or compiled to an intermediate format for an interpreter (javascript, python)? what if it's called from a script? is a script a program? all sorts of fun and games here.
Ian settled on the MIT licence for eLife, and while I don't agree with that decision ideologically, it is convenient for others for all sorts of reasons. Convenience and fewer barriers are very important to adoption if adoption is one of your goals.
I would be curious to find the licences of all of the dependencies our programs use and see which large groups emerge. I'm less worried about proprietary software sneaking in than I am about those who use a licence like MIT or BSD and then 'tweak' it, adding their own custom clauses.
+1 for a licence audit. I know observer and metrics and possibly lax are still GPLd.
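A first pass at such an audit for a Python project could look something like the sketch below. It assumes the licence each installed package declares in its own metadata is good enough for a rough grouping; that field is free text, is often missing, and says nothing about custom clauses, so anything unusual still needs a human read. The function names are made up for illustration.

```python
from collections import Counter
from importlib import metadata


def installed_licences():
    """Yield (package name, declared licence) for every installed distribution."""
    for dist in metadata.distributions():
        name = dist.metadata.get("Name", "unknown")
        # The "License" field is free text and frequently empty, so mark the gaps.
        licence = (dist.metadata.get("License") or "UNDECLARED").strip()
        yield name, licence


def group_by_licence(pairs):
    """Count packages per declared licence string, most common first."""
    counts = Counter(licence for _name, licence in pairs)
    return counts.most_common()
```

Calling `group_by_licence(installed_licences())` inside a project's virtualenv would show which large groups emerge, and long or unfamiliar licence strings are the ones most likely to be 'tweaked'.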
## Status

Proposed.
I'd tag as many people as possible here, since this has wide-ranging implications
### Well maintained

- Are there a decent number of recent commits? This will indicate whether it's under active development. Avoid it if it's not.
I see this stance a lot in javascript developers. recent activity is no indication of stability and is no reason to avoid a thing.
I think this is spuriously correlated with how big a project is, e.g. the Linux kernel has many commits per day even if some areas are very stable and the release pipeline is very long.
In the other direction, if there is a single master branch with many commits per week it's an indication of instability. Number of existing releases could be a better litmus test for stability?
In the JavaScript world things are moving so fast that something which lacks recent commits probably is out of date. I take the point that this doesn't necessarily apply in other environments.
I think the points made are fair overall, but some of the conclusions seem impractical outside of the javascript world, like code auditing dependencies, conflating size, popularity and activity as measures of stability or security and that a dependency's ease of installation/configuration might preclude it from being maintained properly internally.

Some projects have become very stable, or were never really popular, or are just very niche, or have maintainers who don't like outside help or, or, or ... Most of this ADR is common sense and I'd expect nothing less from a developer - being able to nose out problem dependencies should be like a sixth sense after a time - but if we're to make this a practical ADR then it should be less prescriptive and more of a guide to choosing and managing dependencies, including the mountain we already use daily, with a list of must haves (a compatible Free/OSS licence), nice to haves (many maintainers, community enthusiasm), common gotchas (customised licences) and warning signs (few commits + many outstanding issues/no issue tracker, exotic code).

I'd like to see greater separation of popularity from quality, activity from stability and a change of 'sustainable' to something else. Perhaps 'complexity'. Giorgio and I have managed to accommodate almost everything you guys have thrown at us thus far, even IIIF.

I'd also like to quantify 'quality' before we start evaluating components for it. It could be a simple points system, but preferably a set of metrics we can apply uniformly. It wouldn't condemn a thing but may invite more scrutiny of it.
👍 on must-have vs nice-to-have vs warning signs.
Proposal: during the inclusion of a library in a pull request, compile the checklist included in this ADR and let the maintainers of the project make an informed choice:
This page describes some constructive metrics for evaluating the selection of tools. Because what is available varies from situation to situation, it is probably better to treat these as suggestions and guidelines than as strict rules to follow. In some situations, including but not limited to areas of science and science publication, there could be only one best choice from limited options. Another thing I like about this procedure is that it helps everyone understand all the important factors in the decision making, because the way each person evaluates things may vary. We will have to ensure everyone remembers its contents and perhaps refers back to it when evaluating new components.
Comments welcome.
- [] open issue count:
- [] commit count:
- [] maintainer count:
- anticipated effort to maintain in our environment:
If something was quite complex but only used in one place in your environment, that might be described as "low", and something that's simpler to manage but used almost everywhere might be described as "high".
Could add something for how often it's planned to be used?
You're onto something about the additional directions that can influence how big the dependency is.
This (which refers to technologies in general, not just libraries or frameworks) came to mind:
when going through this checklist for a new python dependency, a few thoughts occurred:
elaborating:
opening a PR and having Alfred run my tests is fantastically convenient.
yes, "exotic code" came from me, but it's the sort of thing you can only quantify after spending a lot of time with a language and other people's code. if alfred can tick most of these boxes automatically, we should only be alerted when one can't.
if a thing is too hard or inconvenient to do and there are no consequences or checks that it has been done, it will eventually stop being done. for example: updating documentation.
from this ADR it's clear we're able to come up with unambiguous metrics we think are important.
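To make the "only alert when a box can't be ticked" idea concrete, here is a minimal sketch of what an automated check might do once the facts have been gathered. Everything in it is an assumption for illustration: the `ComponentFacts` shape, the accepted-licence set and the numeric thresholds are invented, and the ADR does not fix any of them.

```python
from dataclasses import dataclass


@dataclass
class ComponentFacts:
    """Facts a CI job might collect about a candidate dependency (hypothetical shape)."""
    licence: str
    open_issues: int
    commits_last_year: int
    maintainers: int


# Assumed policy values for illustration only; the ADR does not define thresholds.
ACCEPTABLE_LICENCES = {"MIT", "BSD-3-Clause", "Apache-2.0"}
MAX_OPEN_ISSUES = 200
MIN_COMMITS_LAST_YEAR = 5


def review_needed(facts):
    """Return only the checklist items that could not be ticked automatically."""
    alerts = []
    if facts.licence not in ACCEPTABLE_LICENCES:
        alerts.append(f"must-have: licence {facts.licence!r} is not on the accepted list")
    # Warning sign from the discussion: few commits combined with many open issues.
    if facts.commits_last_year < MIN_COMMITS_LAST_YEAR and facts.open_issues > MAX_OPEN_ISSUES:
        alerts.append("warning sign: few recent commits and many open issues")
    if facts.maintainers < 2:
        alerts.append("nice-to-have: fewer than two maintainers")
    return alerts
```

An empty list means nobody is bothered; anything else invites more scrutiny from a human rather than condemning the component outright.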
It is probably impractical to check all dependencies listed on every build, but checks can be made against the new ones that differ from the
sure. it could consult a project-local 'approved' list that is populated automatically by the tool or explicitly overridden when a human gives it a thumbs up. obviously language specific, but baby steps first. I'm thinking of the Github and Pypi APIs that give you access to a lot of project information:
like licence, downloads, version history and, in the python world, a (self-applied) stability status.
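A rough sketch of both ideas, for the Python case only. It assumes the PyPI JSON endpoint (`https://pypi.org/pypi/<package>/json`) and reads the declared licence, the number of releases and the self-applied "Development Status" classifier from it; the helper names and the shape of the 'approved' list are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_pypi_metadata(package):
    """Download a project's metadata from the PyPI JSON API (needs network access)."""
    with urlopen(f"https://pypi.org/pypi/{package}/json") as response:
        return json.load(response)


def summarise(pypi_json):
    """Pick out the fields relevant to the checklist from a PyPI JSON document."""
    info = pypi_json["info"]
    prefix = "Development Status :: "
    # The stability status is self-applied by the maintainers via a trove classifier.
    status = [c[len(prefix):] for c in info.get("classifiers", []) if c.startswith(prefix)]
    return {
        "licence": info.get("license") or "UNDECLARED",
        "release_count": len(pypi_json.get("releases", {})),
        "stability": status[0] if status else "UNDECLARED",
    }


def unapproved(dependencies, approved):
    """Return the dependencies that are not yet on the project-local approved list."""
    return sorted(set(dependencies) - set(approved))
```

Only the names returned by `unapproved(...)` would need a `fetch_pypi_metadata` call, which keeps the check cheap enough to run on builds where the dependency list has not changed.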
ha! I'll hold you to that ...
yeah, difficult to measure. I've never heard of a dependency shitlist before, but I suppose they must exist, like spam blacklists
running it is easy, whether it gives a good result is another matter
Using third-party software that is difficult to maintain integrated in our environment can make updating it difficult, time-consuming or risky.

Using software with an inappropriate license may present a legal risk, or may risk others choosing not to use our software because of the burden the license subsequently places on them.
nit-pick: in the UK license is the verb, licence is the noun.
we need an ADR for British vs. American English 🇬🇧 🇺🇸
+1
I used to have a yellow sticky note next to me on the wall that would remind me ;) I'm certain I've lapsed since then
- low complexity to maintain in our environment

## Warning signs
- many open issues
I prefer many open issues without activity
For PHP projects It has a
that's really great actually. I wonder if the Python world has some hidden built in like that I haven't seen before ..
Just had a play around, I have been able to generate the same table for a python project using the I could also have it as an option part of the
I prefer integrating it into the proofreader to be honest, at least for the python apps.
The only caveat is how long it takes, if it's fast enough to be included in every build (which is where proofreader is executed), 👍
This is a draft for an ADR to help decision making when selecting an open source component. Feedback welcome. Pinging @elifesciences/article-viewer @elifesciences/digirati @elifesciences/coko @elifesciences/yld
Feel free to tag other relevant people on this as appropriate.