ELPP-3358 Choosing an open source component #9

davidcmoulton · 2018-01-09T13:50:49Z

This is a draft for an ADR to help decision making when selecting an open source component. Feedback welcome. Pinging @elifesciences/article-viewer @elifesciences/digirati @elifesciences/coko @elifesciences/yld

Feel free to tag other relevant people on this as appropriate.

giorgiosironi · 2018-01-09T14:13:49Z

adr/0007-choosing-an-open-source-component.md

+
+### Licensing
+
+[should we specify this in terms of an explicit list of acceptable licenses, or in terms of what we need a licence to allow us to do?]


considering we release everything with the MIT license, there is a compatibility issue (with GPL?)

an explanation on what it means to be 'compatible' with GPL licensed code: https://www.gnu.org/licenses/gpl-faq.en.html#WhatIsCompatible

and a list of compatible licences:
https://www.gnu.org/licenses/license-list.en.html#GPLCompatibleLicenses

there is a lot of argument and untested legal theorizing on edge cases in most software licences. The only perfectly clear cut legal standing, in my humble opinion, is "public domain". Even the legality of closed sourced proprietary code becomes fuzzy if it's not 100% written internally in the same country in the same decade.

also, there are different types of code usage, like linking vs bundling vs compiling. is a compiled program with gpl licensed code required to also be licenced by the GPL? (yes). what if it's unmodified and uncompiled, or compiled to an intermediate format for an interpreter (javascript, python)? what if it's called from a script? is a script a program? all sorts of fun and games here.

Ian settled on the MIT licence for eLife, and while I don't agree with that decision ideologically, it is convenient for others for all sorts of reasons. Convenience and fewer barriers are very important to adoption if adoption is one of your goals.

I would be curious to find the licences of all of the dependencies our programs use and see which large groups emerge. I'm less worried about proprietary software sneaking in than I am about those who use a licence like MIT or BSD and then 'tweak' it, adding their own custom clauses.

+1 for a licence audit. I know observer and metrics and possibly lax are still GPLd.
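A licence audit along these lines could start as small as the sketch below. It assumes the licence strings have already been collected into a name-to-licence mapping (for example from a package manager's report); the alias table and the `known` set are hypothetical and would need agreeing on. It groups dependencies by normalised licence and lists anything that does not match a known licence exactly, which is where a 'tweaked' MIT or BSD would surface.

```python
from collections import Counter
from typing import List, Mapping, Set

# Hypothetical alias table: different tools spell the same licence differently.
ALIASES = {
    "apache2": "Apache-2.0",
    "apache 2.0": "Apache-2.0",
    "mit license": "MIT",
}


def normalise(licence: str) -> str:
    """Map a raw licence string onto one spelling, or 'UNKNOWN' if empty."""
    cleaned = licence.strip()
    return ALIASES.get(cleaned.lower(), cleaned or "UNKNOWN")


def licence_groups(licences: Mapping[str, str]) -> Counter:
    """Count dependencies per normalised licence to see which groups emerge."""
    return Counter(normalise(value) for value in licences.values())


def needs_review(licences: Mapping[str, str], known: Set[str]) -> List[str]:
    """Names of dependencies whose licence is not exactly a known one."""
    return sorted(
        name for name, value in licences.items() if normalise(value) not in known
    )


if __name__ == "__main__":
    sample = {"a": "MIT", "b": "Apache2", "c": "MIT with extra clause"}
    for licence, count in licence_groups(sample).most_common():
        print(f"{count:3d}  {licence}")
    print("review by hand:", needs_review(sample, {"MIT", "Apache-2.0"}))
```

Exact matching is deliberately strict here: a human still has to read anything that lands in the review list.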

giorgiosironi · 2018-01-09T14:15:20Z

adr/0007-choosing-an-open-source-component.md

+
+## Status
+
+Proposed.


I'd tag as many people as possible here, since this has wide-ranging implications

lsh-0 · 2018-01-10T02:39:50Z

adr/0007-choosing-an-open-source-component.md

+
+### Well maintained
+
+- Are there a decent number of recent commits? This will indicate whether it's under active development. Avoid it if it's not.


I see this stance a lot in javascript developers. recent activity is no indication of stability and is no reason to avoid a thing.

I think this is spuriously correlated with how big a project is, e.g. the Linux kernel has many commits per day even if some areas are very stable and the release pipeline is very long.

In the other direction, if there is a single master branch with many commits per week it's an indication of instability. Number of existing releases could be a better litmus test for stability?

In the JavaScript world things are moving so fast that something which lacks recent commits probably is out of date. I take the point that this doesn't necessarily apply in other environments.

lsh-0 · 2018-01-10T04:11:46Z

I think the points made are fair overall, but some of the conclusions seem impractical outside of the javascript world, like code auditing dependencies, conflating size, popularity and activity as measures of stability or security and that a dependency's ease of installation/configuration might preclude it from being maintained properly internally. Some projects have become very stable, or were never really popular, or are just very niche, or have maintainers who don't like outside help or, or, or ...

Most of this ADR is common sense and I'd expect nothing less from a developer - being able to nose out problem dependencies should be like a sixth sense after a time - but if we're to make this a practical ADR then it should be less prescriptive and more of a guide to choosing and managing dependencies, including the mountain we already use daily, with a list of must haves (a compatible Free/OSS licence), nice to haves (many maintainers, community enthusiasm), common gotchas (customised licences) and warning signs (few commits + many outstanding issues/no issue tracker, exotic code).

I'd like to see greater separation of popularity from quality, activity from stability and a change of 'sustainable' to something else. Perhaps 'complexity'. Giorgio and I have managed to accommodate almost everything you guys have thrown at us thus far, even IIIF.

I'd also like to quantify 'quality' before we start evaluating components for it. It could be a simple points system, but preferably a set of metrics we can apply uniformly. It wouldn't condemn a thing but may invite more scrutiny of it

giorgiosironi · 2018-01-10T09:19:19Z

👍 on must-have vs nice-to-have vs warning signs

Proposal: during the inclusion of a library in a pull request, compile the checklist included in this ADR and let the maintainers of the project make an informed choice:

license: MIT
issue: ~1000 are open
...

gnott · 2018-01-10T22:03:47Z

This page describes some constructive metrics for evaluating the selection of tools. Because what is available varies from one situation to the next, it is probably better to treat these as suggestions and guidelines than as strict rules to follow.

In some situations -- including and not limited to areas of science and science publication -- there could be only one best choice from limited options.

Another thing I like about this procedure is that it helps everyone understand all the important factors in the decision making, because the way each person evaluates things may vary. We will have to ensure everyone remembers its contents and perhaps refers back to it when evaluating new components.

davidcmoulton · 2018-01-18T10:19:19Z

reframed with must have; nice to have, and warning signs
removed various inferences that may not apply in all contexts
added @giorgiosironi's checklist idea

Comments welcome.

stephenwf · 2018-01-18T10:39:53Z

adr/0007-choosing-an-open-source-component.md

+- [] open issue count:
+- [] commit count:
+- [] maintainer count:
+- anticipated effort to maintain in our environment:


If there was something quite complex, but it was only used in one place in your environment, that might be described as "low", and something that's simpler to manage but used almost everywhere might be described as "high".

Could add something for how often it's planned to be used?

You're onto something about the additional directions that can influence how big the dependency is.

This (which refers to technologies in general, not just libraries or frameworks) came to mind:

lsh-0 · 2018-01-22T03:56:19Z

when going through this checklist for a new python dependency, a few thoughts occurred:

convenience almost always trumps everything else
we love easy to check checkboxes. ambiguity and guessing wastes time and is stressful
I'd game this ADR to get what I want because of the above points
I'd prefer to spend my time automating this than gaming it

elaborating:

make this convenient

opening a PR and having Alfred run my tests is fantastically convenient.
alfred should check my dependencies and post a warning if anything is dubious.
just a simple "this library only has a few commits. are you sure?" type message
if we wanted to get fancy we could add a button "yes, I'm sure" that makes Alfred swallow that warning for that dependency+project in future

ambiguity and guessing and checking boxes

yes, "exotic code" came from me, but it's the sort of thing you can only quantify after spending a lot of time with a language and other people's code.
I'm not about to do code diving of every dependency and I'm not always going to be using my preferred languages.
"community enthusiasm" is another great ambiguous checkbox. also came from me, I think.
I can't measure the enthusiasm of others about a library, but exotic code or an elaborate integration is definitely to be considered but still difficult to measure.
I would leave this checkbox to the very end and if all other boxes get ticked, there shouldn't be a reason to go code diving.
If you find yourself code diving, then other checkboxes may be missing from this list (docs, but no api docs, for example)

if alfred can tick most of these boxes automatically, we should only be alerted when one can't.

gaming ADRs

if a thing is too hard or inconvenient to do and there are no consequences or checks that it has been done, it will eventually stop being done. for example: updating documentation.
and if there are checks and they are weak, it will be gamed.
I'm not a robot and I stopped aspiring to be one some time ago. I value automation and convenience and being able to measure things unambiguously. Grunt work (batches of dependencies that need The Checklist applied) and arse-covering (all the boxes were ticked!) are boring and will lead to mistakes and drift and gaming behaviour.

automate this

from this ADR it's clear we're able to come up with unambiguous metrics we think are important.
we'll probably think of others as time goes on.
if we treat this ADR as something that can be run through manually, if one really has to, then put the easy to check points at the top of the groups and either quantify the ambiguous points into easily checkable boxes or push them to the bottom and have Alfred alert us to potential problems.
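As a rough illustration of what such a check might look like, here is a minimal sketch. It is not how Alfred works; the thresholds, the `DependencyStats` fields and the `acknowledged` set (standing in for the "yes, I'm sure" button) are all assumptions made up for the example. Measurable boxes are ticked silently, and a message is only produced when a value looks dubious or could not be measured at all.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

# Hypothetical thresholds; the thread does not settle on any numbers.
MIN_COMMITS = 50
MAX_OPEN_ISSUES = 500
MIN_MAINTAINERS = 2


@dataclass
class DependencyStats:
    name: str
    commit_count: Optional[int] = None      # None means "could not be measured"
    open_issue_count: Optional[int] = None
    maintainer_count: Optional[int] = None


def warnings_for(
    stats: DependencyStats, acknowledged: FrozenSet[str] = frozenset()
) -> List[str]:
    """Return warning messages for one dependency, or [] if nothing to report."""
    if stats.name in acknowledged:
        # A human already said "yes, I'm sure" for this dependency.
        return []
    checks = [
        ("commit count", stats.commit_count, lambda n: n >= MIN_COMMITS),
        ("open issue count", stats.open_issue_count, lambda n: n <= MAX_OPEN_ISSUES),
        ("maintainer count", stats.maintainer_count, lambda n: n >= MIN_MAINTAINERS),
    ]
    messages = []
    for label, value, is_ok in checks:
        if value is None:
            messages.append(f"{stats.name}: {label} could not be measured, please check by hand")
        elif not is_ok(value):
            messages.append(f"{stats.name}: {label} is {value}. are you sure?")
    return messages


if __name__ == "__main__":
    tiny = DependencyStats("tinylib", commit_count=3, open_issue_count=10)
    for message in warnings_for(tiny):
        print(message)
```

The ambiguous items ("exotic code", "community enthusiasm") are left out on purpose; they stay as manual checks at the bottom of the list.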

giorgiosironi · 2018-01-22T09:30:26Z

alfred should check my dependencies and post a warning if anything is dubious.
just a simple "this library only has a few commits. are you sure?" type message
if we wanted to get fancy we could add a button "yes, I'm sure" that makes Alfred swallow that warning for that dependency+project in future

It is probably impractical to check all dependencies listed on every build, but checks can be made against the new ones that differ from the develop/master branch. This is language-specific: composer.json, requirements.txt and so on. Of course it depends on which check to make:

running pylint or similar tools on the library clone (easy)
going back to the source repository on the right tag and extracting some statistics or a license (medium, primarily because of discovering the right Git commit, and the library may not be on Git or Github anyway)
checking if the maintainers have been nice or naughty this year (difficult)
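For the Python case, working out which dependencies differ from develop/master could look roughly like the sketch below. It assumes the two versions of requirements.txt are already available as text (how they are read from Git is left out), and the parser is deliberately naive: it skips option lines and URL-style requirements rather than trying to understand them.

```python
import re
from typing import Dict, List

# A requirement name at the start of a line, e.g. "requests" in "requests==2.18.4".
_NAME = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def parse_requirements(text: str) -> Dict[str, str]:
    """Map lower-cased requirement name to the rest of its line (the specifier)."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        # Skip blanks, pip options ("-r", "-e", ...) and URL requirements.
        if not line or line.startswith("-") or "://" in line:
            continue
        match = _NAME.match(line)
        if match:
            pins[match.group(0).lower()] = line[match.end():].strip()
    return pins


def dependencies_to_check(base_text: str, branch_text: str) -> List[str]:
    """Names that are new on the branch or whose specifier changed."""
    base = parse_requirements(base_text)
    branch = parse_requirements(branch_text)
    return sorted(name for name, spec in branch.items() if base.get(name) != spec)


if __name__ == "__main__":
    base = "requests==2.18.4\nflask==0.12\n"
    branch = "requests==2.18.4\nflask==1.0  # bumped\nnewlib>=0.1\n"
    print(dependencies_to_check(base, branch))  # ['flask', 'newlib']
```

Only the names returned here would then be passed to the slower checks, which keeps the per-build cost down.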

lsh-0 · 2018-01-23T00:55:34Z

It is probably impractical to check all dependencies listed on every build

sure. it could consult a project-local 'approved' list that is populated automatically by the tool or explicitly overridden when a human gives it a thumbs up.

obviously language specific, but baby steps first. I'm thinking of the Github and Pypi APIs that give you access to a lot of project information:

$ curl https://pypi.python.org/pypi/requests/json

like licence, downloads, version history and, in the python world, a (self-applied) stability status.
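A minimal sketch of pulling those fields out of the PyPI JSON response, using only the standard library. The `summarise` helper and the shape of its result are made up for illustration; the fields it reads (`info.license`, `info.version`, `info.classifiers`, `releases`) are ones the JSON response contains. Download counts are left out here. The sketch uses the `pypi.org` host.

```python
import json
from urllib.request import urlopen


def fetch_project(name: str) -> dict:
    """Fetch the PyPI JSON document for one project (needs network access)."""
    with urlopen(f"https://pypi.org/pypi/{name}/json", timeout=10) as response:
        return json.load(response)


def summarise(payload: dict) -> dict:
    """Pick out the fields that matter for the checklist."""
    info = payload.get("info") or {}
    classifiers = info.get("classifiers") or []
    # The self-applied stability status is a trove classifier such as
    # "Development Status :: 5 - Production/Stable".
    status = [
        c.split("::")[-1].strip()
        for c in classifiers
        if c.startswith("Development Status ::")
    ]
    return {
        "name": info.get("name"),
        "licence": info.get("license") or "UNKNOWN",
        "latest_version": info.get("version"),
        "release_count": len(payload.get("releases") or {}),
        "development_status": status[0] if status else "UNKNOWN",
    }


if __name__ == "__main__":
    print(summarise(fetch_project("requests")))
```

Keeping `summarise` separate from the network call means it can be exercised against a saved response without hitting PyPI on every build.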

running pylint or similar tools on the library clone (easy)

ha! I'll hold you to that ...

checking if the maintainers have been nice or naughty this year (difficult)

yeah, difficult to measure. I've never heard of a dependency shitlist before, but I suppose they must exist, like spam blacklists

giorgiosironi · 2018-01-23T09:41:23Z

running pylint or similar tools on the library clone (easy)
ha! I'll hold you to that ...

running it is easy; whether it gives a good result is another matter

nlisgo · 2018-01-31T16:51:50Z

adr/0007-choosing-an-open-source-component.md

+
+Using third-party software that is difficult to maintain integrated in our environment can make updating it difficult, time-consuming or risky.
+
+Using software with an inappropriate license may present a legal risk, or may risk others choosing not to use our software because of the burden the license subsequently places on them.  


nit-pick: in the UK license is the verb, licence is the noun.

we need an ADR for British vs. American English 🇬🇧 🇺🇸

+1

I used to have a yellow sticky note next to me on the wall that would remind me ;) I'm certain I've lapsed since then

https://github.com/muan/spiffing

https://twitter.com/PaulDJohnston/status/958362583580463107

nlisgo · 2018-01-31T16:54:09Z

adr/0007-choosing-an-open-source-component.md

+ - low complexity to maintain in our environment
+
+## Warning signs
+ - many open issues


I'd prefer "many open issues without activity"

nlisgo · 2018-01-31T17:10:36Z

For PHP projects composer is the de facto dependency management tool. I just found this project: composer-plugin-license-check

It has a whitelist and blacklist configuration. It also has a handy script to output the licence for all of the dependencies. This is useful because it reports on all of the dependencies, not just the immediate dependencies. This is the output of composer check-licenses for the annotations project:

Name: elife/annotations
Version: dev-develop
Licenses: MIT
Dependencies:

Name                                           Version             License       Allowed to Use?  
aws/aws-sdk-php                                3.38.4              Apache-2.0    yes              
beberlei/assert                                v2.7.11             BSD-2-Clause  yes              
crell/api-problem                              2.0                 MIT           yes              
csa/guzzle-cache-middleware                    v1.0.0              Apache-2.0    yes              
doctrine/annotations                           v1.4.0              MIT           yes              
doctrine/cache                                 v1.6.2              MIT           yes              
doctrine/instantiator                          1.0.5               MIT           yes              
doctrine/lexer                                 v1.0.1              MIT           yes              
elife/api                                      dev-master a1c7283  MIT           yes              
elife/api-client                               dev-master 560963f  MIT           yes              
elife/api-problem                              v1.1.0              MIT           yes              
elife/api-sdk                                  dev-master a2e479c  MIT           yes              
elife/api-validator                            dev-master 53f10fb  MIT           yes              
elife/bus-sdk                                  dev-master 721aad8  MIT           yes              
elife/content-negotiator                       v1.0.0              MIT           yes              
elife/logging-sdk                              dev-master 81a10f8  MIT           yes              
elife/ping                                     v1.0.0              MIT           yes              
ezyang/htmlpurifier                            v4.9.3              LGPL          yes              
firebase/php-jwt                               v5.0.0              BSD-3-Clause  yes              
giorgiosironi/eris                             0.9.0               MIT           yes              
guzzlehttp/guzzle                              6.3.0               MIT           yes              
guzzlehttp/promises                            v1.3.1              MIT           yes              
guzzlehttp/psr7                                1.4.2               MIT           yes              
jms/metadata                                   1.6.0               Apache-2.0    yes              
jms/parser-lib                                 1.0.0               Apache2       yes              
jms/serializer                                 1.9.1               Apache-2.0    yes              
justinrainbow/json-schema                      5.2.6               MIT           yes              
knplabs/console-service-provider               v2.1.0              MIT           yes              
league/commonmark                              0.17.0              BSD-3-Clause  yes              
metasyntactical/composer-plugin-license-check  v0.1.0              MIT           yes              
mindplay/composer-locator                      2.1.3               MIT           yes              
monolog/monolog                                1.23.0              MIT           yes              
mtdowling/jmespath.php                         2.4.0               MIT           yes              
myclabs/deep-copy                              1.7.0               MIT           yes              
ocramius/package-versions                      1.1.3               MIT           yes              
paragonie/random_compat                        v2.0.11             MIT           yes              
phpcollection/phpcollection                    0.5.0               Apache2       yes              
phpdocumentor/reflection-common                1.0.1               MIT           yes              
phpdocumentor/reflection-docblock              4.1.1               MIT           yes              
phpdocumentor/type-resolver                    0.4.0               MIT           yes              
phpoption/phpoption                            1.5.0               Apache2       yes              
phpspec/prophecy                               v1.7.2              MIT           yes              
phpunit/php-code-coverage                      4.0.8               BSD-3-Clause  yes              
phpunit/php-file-iterator                      1.4.2               BSD-3-Clause  yes              
phpunit/php-text-template                      1.2.1               BSD-3-Clause  yes              
phpunit/php-timer                              1.0.9               BSD-3-Clause  yes              
phpunit/php-token-stream                       2.0.1               BSD-3-Clause  yes              
phpunit/phpunit                                5.7.25              BSD-3-Clause  yes              
phpunit/phpunit-mock-objects                   3.4.4               BSD-3-Clause  yes              
pimple/pimple                                  v3.2.2              MIT           yes              
psr/container                                  1.0.0               MIT           yes              
psr/http-message                               1.0.1               MIT           yes              
psr/log                                        1.0.2               MIT           yes              
sebastian/code-unit-reverse-lookup             1.0.1               BSD-3-Clause  yes              
sebastian/comparator                           1.2.4               BSD-3-Clause  yes              
sebastian/diff                                 1.4.3               BSD-3-Clause  yes              
sebastian/environment                          2.0.0               BSD-3-Clause  yes              
sebastian/exporter                             2.0.0               BSD-3-Clause  yes              
sebastian/global-state                         1.1.1               BSD-3-Clause  yes              
sebastian/object-enumerator                    2.0.1               BSD-3-Clause  yes              
sebastian/recursion-context                    2.0.0               BSD-3-Clause  yes              
sebastian/resource-operations                  1.0.0               BSD-3-Clause  yes              
sebastian/version                              2.0.1               BSD-3-Clause  yes              
silex/silex                                    v2.2.0              MIT           yes              
symfony/browser-kit                            v3.3.13             MIT           yes              
symfony/console                                v3.3.13             MIT           yes              
symfony/debug                                  v3.4.1              MIT           yes              
symfony/dom-crawler                            v3.3.13             MIT           yes              
symfony/event-dispatcher                       v3.4.1              MIT           yes              
symfony/filesystem                             v3.4.3              MIT           yes              
symfony/http-foundation                        v3.4.1              MIT           yes              
symfony/http-kernel                            v3.4.1              MIT           yes              
symfony/inflector                              v3.3.13             MIT           yes              
symfony/polyfill-mbstring                      v1.6.0              MIT           yes              
symfony/polyfill-php70                         v1.6.0              MIT           yes              
symfony/property-access                        v3.3.13             MIT           yes              
symfony/psr-http-message-bridge                v1.0.2              MIT           yes              
symfony/routing                                v3.3.14             MIT           yes              
symfony/serializer                             v3.3.13             MIT           yes              
symfony/yaml                                   v3.3.13             MIT           yes              
webmozart/assert                               1.2.0               MIT           yes              
willdurand/negotiation                         v2.3.1              MIT           yes              
zendframework/zend-diactoros                   1.6.1               BSD-2-Clause  yes

lsh-0 · 2018-02-01T22:49:47Z

that's really great actually. I wonder if the Python world has some hidden built in like that I haven't seen before ..

seanwiseman · 2018-02-02T09:45:59Z

Just had a play around: I have been able to generate the same table for a python project using the pkg_resources library (it ships with setuptools, which is normally already installed, so no additional dependencies 👍). If we want to go down this route I could formalize it (would not take long) and we could test it out. The thought being we would have a file containing the allowed licence types for comparison.

I could also have it as an optional part of the proofreader to warn of issues whilst linting. Just an idea.
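For reference, a table like this can be sketched with `importlib.metadata`, which is in the standard library from Python 3.8 and covers the same ground as `pkg_resources`. This is not the script described above, only an illustration; the `ALLOWED` set is hypothetical and would in practice be read from the shared file of allowed licence types.

```python
from importlib import metadata
from typing import List, Set, Tuple

# Hypothetical allow list; real entries would come from a shared file.
ALLOWED = {"MIT", "BSD", "Apache-2.0"}


def is_allowed(licence: str, allowed: Set[str] = ALLOWED) -> bool:
    """Exact match against the allow list; anything else needs a human."""
    return licence in allowed


def licence_rows(allowed: Set[str] = ALLOWED) -> List[Tuple[str, str, str, str]]:
    """One (name, version, licence, allowed?) row per installed distribution."""
    rows = []
    for dist in metadata.distributions():
        meta = dist.metadata
        if meta is None:  # broken install with no metadata file
            continue
        declared = (meta.get("License") or "").strip()
        # Some packages paste the full licence text; keep only the first line.
        licence = declared.splitlines()[0] if declared else "UNKNOWN"
        rows.append((
            meta.get("Name") or "UNKNOWN",
            meta.get("Version") or "UNKNOWN",
            licence,
            "yes" if is_allowed(licence, allowed) else "no",
        ))
    return sorted(rows, key=lambda row: row[0].lower())


if __name__ == "__main__":
    for name, version, licence, verdict in licence_rows():
        print(f"{name:<40} {version:<15} {licence:<30} {verdict}")
```

It lists everything installed in the environment, so it reports on transitive dependencies too, not just the ones named in requirements.txt.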

lsh-0 · 2018-02-04T22:45:35Z

I prefer integrating it into the proofreader to be honest, at least for the python apps.

giorgiosironi · 2018-02-05T09:28:09Z

I have been able to generate the same table for a python project using the pkg_resources library
I prefer integrating it into the proofreader to be honest, at least for the python apps.

The only caveat is how long it takes: if it's fast enough to be included in every build (which is where proofreader is executed), 👍

davidcmoulton added 3 commits January 9, 2018 13:44

Add draft ADR for choosing an open source component

98b68bc

Add note about documentation

fbf7f4d

Style

18e785f

giorgiosironi reviewed Jan 9, 2018

davidcmoulton changed the title ~~Choosing an open source component~~ ELPP-3358 Choosing an open source component Jan 9, 2018

lsh-0 reviewed Jan 10, 2018

Respond to feedback

f8451c2

stephenwf reviewed Jan 18, 2018

nlisgo reviewed Jan 31, 2018




