Why are you re-licensing ggplot2? #4236
Comments
Here's my perspective. But first, I'd like to emphasize that I had nothing to do with this decision, and didn't even know until the issues were filed. I also have absolutely no financial interest in any company that could benefit from relicensing. In theory, the GPL protects you from cheaters that use the free code without contributing back to the community. In practice, this doesn't work, and the people actually hurt by the GPL are other free software developers who have good intentions but who, for one reason or another, need to work in a licensing environment that is incompatible with the GPL. The problem with the idea that the GPL protects you is that companies that are willing to cheat the system aren't particularly disturbed by violating the GPL. There are plenty of high-profile examples, most recently Tesla violating the licensing requirements of the Linux kernel. And these cases never go anywhere. You can sue the company, and it costs you a lot of money, and usually the best-case outcome is that the company just posts a current dump of their code base somewhere, so they're in compliance temporarily, and then they go right back to violating the license. The only people that actually take the GPL seriously are other free software developers. And the most common problem they run into is GPL incompatibility with other licenses. The GPL is even incompatible with itself! (Version 2 vs version 3.) So, if you're developing some R package licensed under GPL 3, are you free to incorporate some ggplot2 code? Probably not. So, are you even allowed to write any ggplot2 extension package that is not GPL2 licensed? I'd argue that any ggplot2 extension package is a ggplot2 derivative, because you basically can't write a new geom, stat, theme, or coord without first copying some ggplot2 code. So, to arbitrarily pick one example, the ggdist package is in my opinion entirely out of compliance, because it is GPL3 licensed and obviously a derivative of ggplot2. 
However, I also think this is a silly notion, and we should allow extension package developers to license their code as GPL3, or MIT, or whatever they choose. So, the ggplot2 license needs to be more permissive. MIT is a good choice. I've recently used MIT for all my packages that are not clearly ggplot2 derivatives, and I've grudgingly chosen GPL2 for packages such as ggtext that are derivative in nature. (Compare to the license of gridtext. The two packages are companion packages and were developed in parallel.) For these reasons, I strongly support the relicensing effort. |
We're working on a blog post that explains this in more detail, but it's pretty much what it says on the can: the tidyverse currently uses a mix of licenses (GPL-2, GPL-3, GPL >= 2, MIT) and that causes needless confusion. Since it's quite a lot of work to relicense all the packages, we only want to do it once, so we opted for the most permissive license to enable the widest possible usage. While this license certainly allows you to make a proprietary fork, and this is something that I worried about when I first created ggplot2 (hence the GPL-2 license), I don't think it's possible to build a business around it: it's very hard to sell something that's 95% the same as a product that you can get completely for free. @clauswilke Personally, I don't believe that extending ggplot2 in that way forms a derivative work. I don't think merely using ggplot2 or building on its exported extension interface implies that you need to be compatible with its license. (This is the same reason why an R package can have any license, not just GPL). That said, a lawyer might argue differently, and we do want to make it as clear as possible that you're free to build on ggplot2 in any way that you want. |
@hadley To be clear, I completely agree that the licensing provisions of the GPL end at package boundaries. Since R is not compiled, the code only gets combined at the enduser level, and the GPL doesn't speak to that. It only cares about distribution. I can distribute my non-GPL package using your GPL package without ever distributing your package myself. What is less clear, though, is the situation where people take ggplot2 code and modify it for their own purposes. E.g., I copy the code of In any case, I don't actually want to worry about these kinds of things, which is why I very much support less restrictive licensing agreements. |
Thanks for the quick replies. I look forward to the blog post that explains this in more detail. @hadley Here's one question you might consider addressing in the blog post:
@clauswilke Could I please ask if you might elaborate on this comment?
Could you offer a specific example that might be instructive for people who are looking for jobs and might not know the consequences of joining a company? |
@slowkow I think making it about joining a company versus not is missing the point. The situation can be as simple as two open source projects not being able to exchange code snippets due to incompatible licenses. Here is an example of a case I've been involved with, which is somewhere between those two extremes: clauswilke/PeptideBuilder#9 (comment) |
@clauswilke along those lines, I feel like I have recently benefited from the permissive licensing of a few other projects (particularly in the js community) and it’s nice to pay it forward. @slowkow we can only really speak to why we made the decision to change. It is a lot of work (particularly if you’re not on GitHub) so it’s not surprising that most projects stay with the license they had from the beginning. Also note that the RStudio IDE uses a CLA so it can be relicensed easily. We chose not to do that for the tidyverse to make it clear that the packages are community “owned”. |
I have had this issue with Plotnine, which qualifies as a derivative of ggplot2 and so had to be licensed under GPLv2. The Python ecosystem in which it exists is dominated by BSD 3-Clause and MIT licences, and the GPLv2 license has made some people anxious about building on top of it, or simply using it in a commercial environment. Also, as GPLv2 is quite involved ("lawyerly"), it is less precisely understood; this leads some users who take care not to violate licenses to sometimes incorrectly assume incompatibility. So re-licensing is mighty welcome. |
I'd also like to emphasize the problem of incompatibility. I've seen this become a problem several times, e.g. r-lib/covr#256 (comment). Actually, I myself have experienced a case where I needed to reinvent stringr functions just because its current license (GPLv2) was incompatible with GPLv3 packages. I don't think this should need to be a concern, but it's not easy to convince companies that have a stricter view on licensing issues (and I understand that they try to avoid potential risks even if they are very small). Migrating to a permissive license lets us avoid such unnecessary confusions (at least to some extent), so I hope relicensing will succeed. |
I'm still looking forward to the blog post, and hope that there will be some discussion of costs and benefits. It goes without saying that every decision has costs and benefits. However, enumerating and weighing costs and benefits might help to compare the ratio of their weights and to eventually arrive at a rational decision, or at least to reveal the beliefs and values of the decision maker. Here, I want to offer some questions for your consideration. Several people have mentioned incompatibility as a reason for changing from GPLv2 to MIT. You've described it as "unnecessary confusions", "lawyerly", something you "don't want to worry" about, and "needless confusion."
Incompatibility may come as an unwelcome surprise to those who use GPL before understanding it.
https://www.gnu.org/philosophy/free-software-even-more-important.html We might consider whether the cost of incompatibility is worthwhile in exchange for the benefits of GPL.
|
I think we should be careful dismissing the GPL on the basis of some "cheaters will cheat anyway" argument. There are many counter-examples. I really dislike that for profit entities can take MIT code and use that for profit without any requirement to give back to the OSS community. Even the possibility of a lawsuit is enough to modify behavior, however hypothetical it might be. Yes, GPL is more restrictive, and a little more annoying. For the GPL2 vs 3 compatibility issues, the code can be licensed under both. |
@brodieG practically, I think the major impact of threat of a lawsuit is to push companies towards permissively licensed products. But regardless of what you believe the impact is, I hope that you agree that it's up to the primary developers of the individual package to choose the license that they feel is most appropriate. |
Which is why it is a shame when GPL licensed software moves to MIT, thus making it easier for companies to avoid the requirements of GPL. And yes, in general I agree copyright holders are entitled to license however they feel appropriate under the constraints of whatever software licenses they themselves are using. Although again, a shame that it makes it easier for for-profit entities to take OSS code without giving back. As an aside, I personally think (though there is definitely some gray here), that R packages used with GNU R should be GPL-2|3. Of course, it is possible to use R packages with other S/R implementations that are not GPL that may not have such requirements. I realize not everyone (including R-core) agrees with this interpretation. |
To be fair, if being GPL was one of the reasons that a contributor decided to contribute, they have the right to disagree. There might be contributions that ggplot2 couldn't have received if it had started with a different license. On the other hand, I guess there are some developers who gave up on ggplot2 because of its being GPL. For example, as @has2k1 pointed out, porting to another language is a derivative work, and it might not be easy depending on the type of the language or the ecosystem. This means ggplot2 might have lost some possible contributions, e.g. feedback about a nicer API that we won't be able to come up with ourselves. I now feel it's wrong to call this "needless confusion." (Btw, here's one more example I happened to find. If dplyr had kept using GPL, Apache Arrow might not have support for dplyr: https://twitter.com/wesmckinn/status/393122387610198016.) |
For those of you looking to learn more about the reasoning that goes into re-licensing, consider reading some of the following articles. After reading them, I've become aware of a possible cost that might be worth mentioning:
I'm not aware of compelling reasons for favoring MIT versus BSD.
In favor of BSD:
The Whys and Hows of Licensing Scientific Code
Relicensing yt from GPLv3 to BSD
In favor of AGPL:
Open source licensing and why we're changing Plausible to the AGPL license
|
Please let's be precise in our language. ggplot2 is currently licensed under GPL2. The following statements apply:
The widespread incompatibility of different GPL-type licenses with each other has long convinced me that something is fundamentally broken in that ecosystem. |
This chart helps to see the compatibility between GPL versions: https://www.gnu.org/licenses/gpl-faq.html#AllCompatibility Have you considered what @brodieG said about licensing with "GPL2 or later" or "GPL 2|3"? |
Yup. My first two points correspond to the first two gray boxes in the first column of the matrix, and the third point is not part of their matrix but follows from the fact that the AGPL is a more restrictive version of the GPL3. |
Licensing code under "GPL2 or later" is an absurd proposition. It de facto gives the FSF the unilateral and exclusive right to relicense my code as they see fit. I'd rather give everybody this right, and thus I usually choose MIT when I can. |
What about GPL2|GPL3? I also don't think it's completely absurd to trust the FSF to some extent, though do agree that it's probably safer to state explicitly which license to support. |
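For readers unfamiliar with how this choice is expressed in practice: an R package states its license in the License field of its DESCRIPTION file, and alternatives are separated by a vertical bar. The following is only a minimal sketch, assuming a made-up package called examplepkg (the name and the DESCRIPTION contents are hypothetical, not taken from ggplot2). It parses the field with base R's read.dcf() and lists the alternatives a recipient could choose from.

```r
# Hypothetical DESCRIPTION contents for an illustrative package.
# "GPL-2 | GPL-3" offers exactly these two versions; the "or later"
# variant discussed above would instead be written "GPL (>= 2)".
desc_lines <- c(
  "Package: examplepkg",
  "Version: 0.0.1",
  "License: GPL-2 | GPL-3"
)

# read.dcf() reads the Debian control format used by DESCRIPTION files.
con <- textConnection(desc_lines)
fields <- read.dcf(con, fields = c("Package", "License"))
close(con)

# Split the License field on the literal "|" and trim whitespace.
alternatives <- trimws(strsplit(fields[1, "License"], "|", fixed = TRUE)[[1]])
print(alternatives)
```

Listing both versions explicitly avoids delegating the choice of future license terms to anyone else, which is the concern raised about "or later" wording.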
I too feel so, but MIT is slightly clearer than BSD as it's not clear which BSD license it refers to when we just say "BSD license," though the differences are small (mainly about attribution). c.f. https://opensource.stackexchange.com/a/582 |
@clauswilke I think this is a little bit like saying that introducing API breaking changes in between two major releases of software (e.g. 2.0 to 3.0) is a sign of something fundamentally broken. It isn't. There were 15+ years between GPL-2 and GPL-3. It's not surprising that some things were learned in that time that could be improved on. Given that the whole purpose of these licenses is to restrict use, it also isn't surprising that they end up legally incompatible. Expecting compatibility under these circumstances and assuming something is fundamentally broken when that doesn't happen would be like assuming something is fundamentally broken with the R ecosystem because code written for R3.x doesn't always work with R2.x. Granted the incompatibility is more extreme with the license, but the point stands: incompatibility across decade-spanning versions is not a sign of brokenness. Sure, there are no incompatibilities with unrestricted software, but then there are no requirements for for-profit entities that benefit from the software to give back (and most won't unless they have specific profit motives for doing so). I understand people may be okay with the trade-off, but I feel a lot of people are scared off from GPL due to general claims of "badness" and "virulence", or as we have here "fundamental brokenness". I think a lot of people (not you) hear all of this, don't have time to figure out what the true cost / benefit balance is, and go the easy route. And I think that's a shame. |
Adding a question that I don't think is in this thread:
Originally asked in the roxygen relicensing thread: r-lib/roxygen2#1163 |
My current not 100% certain belief is that R packages that are distributed and run on GNU R should be GPL. It should be possible however to run non-GPL packages on other R/S implementations that are not GPL (assuming the involved licenses allow it). I'll elaborate more on why I believe this later. In re the R-core position you link, I don't think it says much of anything in re whether packages should be GPL or not, especially because:
There may be more direct statements of support elsewhere, but I don't think this one is dispositive. Of course, if R-core is not going to pursue license action against MIT licensed packages that are distributed to run on GNU-R this may be academic (or maybe not, I'm not sure whether FSF would have standing to pursue if R-core is not interested), but I for one would still like to know for sure what is the correct GPL compliant answer. For this reason I'm curious about what reasons you may have for believing it is legally fine to license packages intended to run on GNU R with non GPL licenses. If your answer is "the linked R Foundation statement + CRAN allowing non-GPL packages" that's fine (not a legal judgment). But if you have additional reasons I am curious to hear them as GPL FAQ is not super clear for the specific case of R packages distributed via CRAN. There is also the specter of the Oracle vs. Google API ruling, which could have profound implications for R (IIRC TIBCO owns the rights to S, and presumably the API as well). |
@brodieG They need to be GPL compatible, not GPL. That’s the position of the FSF in comparable situations. Of course (again, according to the link) this still means that redistributing a self-contained R-based application requires the redistributable bundle to be licensed under GPL, and this encompasses all packages used in it (the packages themselves can be redistributed under a different license, but as soon as they’re used in such a bundle, they additionally fall under GPL). So much for the FSF interpretation. It is my understanding that RStudio have a different opinion on that (but I won’t try to interpret theirs in their stead). |
@klmr Exactly. @brodieG To figure this out, we need to consider by what mechanism the GPL works. It works via copyright. So for the GPL to transmit from one piece of code to the other, there needs to be some credible copyright claim. Let's make a concrete example. Here is a piece of code that I release under the MIT license:
set.seed(1234)
words_shuffled <- c(
  "fox", "over", "dog", "brown", "quick",
  "The", "the", "jumps", "lazy"
)
print(paste(sample(words_shuffled), collapse = " "))
If somebody wanted to argue that I cannot release this code under the MIT license, they would have to make a credible copyright claim on this code. It is widely accepted practice (Oracle vs. Google notwithstanding) that the authors of a programming language don't have copyright claims over code written in the language. So I don't see how R core would have copyright claims over this code snippet. And if they were to try to assert copyright, I think the result would be an immediate exodus of developers from the R ecosystem, since this is not a tenable position. Now what happens if somebody wants to run this code on any specific R implementation, or wants to distribute a byte-compiled version of this code, is a different question. Arguably, the GPL transmits to compiled R packages. However, this has no effect on the original piece of code I wrote. You can copy this code, run it on your machine, and it's GPL then (since the MIT license allows for that), but my original version remains MIT. Since the ggplot2 project is only in the business of distributing original source, it is free to choose any license that allows for relicensing under the GPL when required, e.g. when distributing compiled packages. |
I'll also say that while I'm all in favor of pursuing GPL violations in cases where additional restrictions have been placed on the code (e.g., Tesla distributing modified Linux binaries but not releasing their sources), I find it distasteful to use tenuous copyright claims to try to constrain developers who would like to use less restrictive licensing (e.g., arguing that because there's one line of GPL code in thousands of lines of otherwise original code the original code must not be released using a license that imposes fewer constraints than the GPL does). |
Thanks for your insights. I'll circle back later when I have time to process them. But I'd like to make sure one thing is clear:
No one here is doing such a thing that I'm aware of. My point is not that I should have the right to block the re-licensing of a project based on a minor contribution. Rather, I'm trying to make it clear to people who come by and read this why I think GPL is a good thing and that it would be better if these projects were not re-licensed, and secondarily, to get people's views on why they think the GPL license on GNU R does not prevent licensing packages MIT. The latter because I presumed this would have been discussed internally by a well-resourced organization making a major licensing decision, especially given that covr was re-licensed to GPL-3 in the past due to concerns about it importing a GPL-3 licensed package. |
Caveat: I am not a lawyer, and this is my personal opinion, not RStudio's position. If you want to understand open source licenses you have to start with copyright, and in the US, copyright grants the copyright holder six exclusive rights to any creative work. Three of the rights apply to software:
These are the defaults: if there is no license, you're not allowed to copy, modify, or share the code with others. Fortunately, while these rights are strict by default, the copyright holder can choose to relax them if they want. The goal of an open source license is to relax these rights so that you can copy, modify, and share code as long as you obey certain conditions. So to expand on @clauswilke's example, since he is the copyright holder, he is allowed to control when you can copy, modify, and share that code. However, since you are probably going to run that code, you will need an R interpreter. And in order to copy R onto your computer, you'll need to comply with the conditions that R core has chosen. But that doesn't imply that R core has any role in the license you choose for your code. (And certainly the FSF has no role here, except as the copyright holder of the GPL license). |
I would like to address this sentiment, expressed above:
A company "giving back" to the OSS community can take many many forms. For example:
In my experience, the practical effect of a very strict license on a piece of OSS software is only that companies turn toward proprietary or looser-licensed alternatives. I have never in my life seen a company "turn fully open source" in order to comply with a strict license for a piece of software they want to integrate. I myself work for a giant company (easily discoverable on LinkedIn, I suspect) (and which I consider to be a real force for good in the world right now) that uses tons of OSS products, like most other small and large companies in the world. I engage heavily with the OSS community because it benefits both us and the community. I find that a very common position in the business world, FWIW. |
Yes, I think I'm coming around to your viewpoint on this. So long as the "whole program" is free to be distributed GPL it's okay. This addresses the weirdness I've been stuck on, of it being okay on non-GNU R implementations but not on GNU R. So MIT is okay, but e.g. the original BSD license is not. Thanks for humoring me by discussing this aspect of the problem. I'm still sad this re-licensing is taking place.
I think it just reduces the costs to the company in question. If the company does not turn around and give back it does not make the production of OSS more sustainable. There are not many OSS projects that can afford to pay good developers, or even good issue-triage support. The whole thing only goes around because there are enough programmers who, like artists, will produce whether you pay them or not. Obviously RStudio contributes a lot to open source, but they are not a typical company. In re: your (@kenahoo) company, I'm not on LinkedIn so it's actually not easy for me to figure out (I get a big paywall page, just an FYI as this might not be apparent to you as a LinkedIn user).
My issue with this is the distribution of the societal surplus associated with the free software. Company X that uses OSS with no obligation to give back gains surplus Z from use of the software. How much will they give back? A rare few will genuinely give back. Some will provide some patches, which probably amount to a small fraction of the surplus they generate from using the software. Most will just take. So should we really allow the majority that will just take to do so, in the hope that we will be given back some crumbs of the surplus via upstream contributed patches because it is in their interest to do so? And maybe the answer is yes, on the whole more OSS software will be produced this way. Or maybe it's no, and by having better software that can only be modified and distributed under OSS licenses more people contribute because the OSS software is actually better than the proprietary software. Obviously I lean to the latter. |
One note on @hadley's earlier comment:
Note in particular "I don't think ... that you need to be compatible with its license". This I still don't agree to, at least in the case where the new software is being run with the GPL software as opposed to another implementation with a more permissive license. |
@brodieG wrote:
I'm not sure how you justify the "just" part of that statement. Clearly it also reduces the amount of money some proprietary SW company is taking in, and shifts the overall marketshare balance in favor of OSS.
RStudio is in a whole different part of the conversation, almost irrelevant to the discussion about how companies use OSS: they have chosen to make their business model all about OSS. It is the entire purpose of their company. Of course they're going to contribute to the OSS tools in their space, it's their whole business plan. Whereas my company has chosen to make its business model about renewable energy, and they need to find the best and most cost-effective ways to support that model. I've already outlined why engaging properly with the open-source community is IMO the best way to make that happen.
For a company like RStudio, I don't want to put words into their corporate mouth, but I assume they would be extremely happy if/when tools like, say, the Tidyverse ecosystem gets deeply embedded into the processes of the software world, even without considering any money or patches changing hands. It brings them lots of business opportunities in the form of consulting, training, conference revenue, better connections across the industry, better understanding of customers, and so on and so on.
Yes, this 100% is how I think about it too - which licensing structure is better in the long run for the goals of OSS [sideline - I do not think that the ultimate end goal should be to maximize OSS marketshare; for me the end goal is that developers have high-quality tools at their disposal, which they can examine and modify, all the way up and down their stack. OSS methods are the best way I know of to get that to happen.], and when I play out the simulations in my head, based on the data I know about and my own personal experience over the last 25 years of being in various parts of the software industry, less-restrictive licenses end up being much better at achieving them. |
@hadley said:
#4232 (comment)
I asked:
I'm afraid my comment was not visible, so I'm posting a new issue here, where I hope ggplot2 users and contributors might learn more about the plan to re-license. I would like to please ask if someone might be willing to provide some context regarding the reason for the change of license. Is that OK?
ggplot2 is available under the terms of the Free Software Foundation’s GNU General Public License (GPL). https://github.com/tidyverse/ggplot2/blob/master/LICENSE
That's the same license that the R Project uses:
For those of you who are not familiar with software licenses, you might consider reading this recent post from a company that re-licensed their code from the MIT license to a newer licensing scheme called GNU Affero General Public License V3 (AGPLv3).
For example, one reason to re-license code from GPL to MIT might be to give permission for a corporation that is interested to take the ggplot2 (and other tidyverse and r-lib) code and use it in closed-source proprietary products.