Consider GPL → CC #296

Closed
stranak opened this issue May 17, 2016 · 10 comments

@stranak

stranak commented May 17, 2016

The GPL as a license for data is confusing and hard to interpret at best. Please consider changing the license to CC-BY-SA, or at least adding it as an alternative.

This concerns the following treebanks:

  • Catalan
  • Danish
  • Dutch
  • (Finnish)
  • Polish
  • Spanish AnCora

@spyysalo
Member

spyysalo commented Sep 8, 2016

Good idea, I would like to see this change. @stranak: would you mind writing a few sentences, with links, explaining why CC licenses are better for data? We could use them to explain this issue to the corpus creators.

(Not sure why Finnish is listed, both Finnish treebanks are CC..?)

@stranak
Author

stranak commented Sep 8, 2016

I contacted actual professionals, colleagues from the CLARIN-D Legal Helpdesk (http://www.clarin-d.de/en/help/legal-information-platform), and they were kind enough to promise to formulate the argument properly. Paweł Kamocki should write his reply here shortly.

(I have no idea about the Finnish in the list, although it was clearly me who put it in. Maybe some previous versions of treebanks used GPL? Maybe just my error, please disregard it.)

@pkamocki

pkamocki commented Sep 8, 2016

Well, technically you can license non-software works under GPL. What is really impossible is to do it the other way round (license software under a data licence like CC). Licensing data, articles, or other non-software works under GPL, however, is not a good idea, for several reasons:

  • the GPL is unnecessarily complex for non-software works. Please note that the text of the licence uses words such as « The Program » (to refer to the subject matter of the license, even though it is defined as any copyright-protected work…), « source code », « object code », « system libraries » etc. It also contains provisions about patent claims;
  • in fact, GPL is generally drafted in a very complex way (try to read a paragraph!) and is often criticised for that. Its power comes from its popularity and widespread use. This is true for software (where the use of GPL is indeed popular and widespread), but not for non-software works;
  • in relation to the previous argument, it is difficult to assess GPL’s compatibility with CC licenses, even though the CC itself declares that the copyleft requirements in GPL and CC SA are compatible (meet the minimum requirements for compatibility): https://wiki.creativecommons.org/wiki/ShareAlike_compatibility_analysis:_GPL
  • perhaps most importantly: licensing data is not only about copyright, but also about the sui generis database right. CC 4.0 speaks specifically about Copyright and Similar Rights (with the sui generis database right specifically mentioned as an example), GPL only says that: « “Copyright” also means copyright-like laws that apply to other kinds of works, such as semiconductor masks. » It is therefore not clear if it covers the database right (I can think of arguments for and against this assumption).
  • the Free Software Foundation itself doesn’t seem to recommend the use of GPL for non-software works (not that their recommendations for non-software works should be followed…). This is what is written on their website (https://www.gnu.org/licenses/licenses.en.html):

Licenses for Other Types of Works
We believe that published software and documentation should be free software and free documentation. We recommend making all sorts of educational and reference works free also, using free documentation licenses such as the GNU Free Documentation License (GNU FDL).
For essays of opinion and scientific papers, we recommend either the Creative Commons Attribution-NoDerivs 3.0 United States License, or the simple “verbatim copying only” license stated above.
We don't take the position that artistic or entertainment works must be free, but if you want to make one free, we recommend the Free Art License.

Finally, please keep in mind that in case of doubt (e.g. is an .xml file "software" or "data"?), dual (or multiple) licensing might be a good solution, as it gives users an alternative (allowing them to choose which of the licenses they want to comply with).

Hope that helps!

@dan-zeman
Member

I think Finnish is there because FTB has a dual license. We could probably just ignore the other one.

@pkamocki

pkamocki commented Sep 8, 2016

Yes, if there are several licenses, you can choose the one you prefer and comply with it.

@dan-zeman
Member

Thanks @pkamocki for all the insights. I have now tried to contact the people who I believe have the power to allow a CC license for Alpino, AnCora, DDT and Składnica. I asked them for that and referred them to this discussion. So let's hope they find it a good idea, too.

@spyysalo
Member

@pkamocki: thank you for the detailed argument!

@dan-zeman : did you hear anything back?

@dan-zeman
Member

Partially. I got permission to relicense Danish and Dutch, and I have already modified these two. I got a reply from Poland, but they are not sure whether they need to acquire permission further up the chain, so that is on hold. No response from Spain so far (Spanish + Catalan).

@spyysalo
Member

As there is no recent activity, I'll go ahead and close this. If anyone is willing to continue the discussion with the copyright holders of the treebanks that remain GPL-licensed (Catalan, Faroese, Galician-TreeGal, Polish, and Spanish-AnCora), feel free to reopen.

@dan-zeman
Member

Galician-TreeGal is actually a different case, they use LGPLLR. Since that license's name says it's for language resources, it hopefully does not have the issues that GNU GPL has.
