Sunday, December 9, 2007

Crystal clear, or clear as mud: a lack of clarity in OSS licensing

If you want to read the "short form" of the essay below, please go to http://www.palamida.com/node/511.




Within the body of this document, I outline a set of guidelines that can be implemented to ensure reasonable adherence to clear licensing policies for OSS.




In the management of various projects for Palamida, I have had the benefit of reviewing the source code and compiled files for tens of thousands of open source projects. One issue comes up with great frequency, and it seems to underpin the usability of open source software: a surprisingly high number of open source software projects are unclear or incorrect in their license references. This point has two principal sides. What is the responsibility of the maintainer of a project to make licensing terms clear, and what licensing terms govern a user when such terms are ambiguous, unclear or incorrect?


For the sake of this discussion, any reference to a maintainer refers to the party or parties responsible for the licensing associated with an OSS project. A developer is the user of the same OSS.



From another perspective, a development team works for a year to build a cool project that solves a business problem. They like some ideas of the GPLv3 and choose to support it on the web page. The maintainer of the code repository never put the license in the code tree, but they opened the project up to the world. Third parties now download the code and modify it, and it becomes the core of the next big thing, which starts selling. However, the source never contained a license. Who is correct?

The issue of ambiguous licensing got me thinking of some basic ideas to create clarity with the specific license associations to any given application, version and release for open source software. In the end, software licensing is the way that a maintainer/developer of a given project governs how the project is incorporated into the OSS community. The licensing creates the framework for the evolution and growth of the project, and should reasonably express the wishes of the creator of the project for recognition, financial consideration, and other basic points.

Let's discuss the following OSS licensing conditions:

1. Contradictory
2. Unclear and ambiguous
3. Nonexistent



Contradictory licensing - internal (conflicting language within license file)

Note: A funny situation that I saw was to include the "heading" from the GPLv2 license, or the BSD license, while using the text from yet another license. I am sure that in some way, the developer/maintainer may have considered "reviewing" one license, with the intent of modifying it to become a custom license, or it could have been a mistake. In either case, imagine the situation where, on a quick review of the COPYING file in the source, you see the beginning of the BSD license, and choose to move forward, without realizing that a large part of the GPLv3 license was appended to the bottom.

In the case where someone mistakenly blends both the GPLv3 and the BSD, they may have created a license that is contradictory in terms. It is my belief that when such a situation exists, while the license itself is internally contradictory, if it was clear, then the obligation falls to the user. In the case of contradictory licensing, a user should refrain from using the software in question, since it may be impossible to actually adhere to the contradictory terms of use. Therefore, using software with contradictory licensing terms may represent an intention to not comply with the licensing terms. If it is impossible to comply with the terms, but compliance is mandatory for use, then a developer should avoid such software.

Contradictory licensing presents a real problem for developers. A developer may have already chosen to use a project as core to a proprietary project. A great deal of time and effort may have been invested by the time that the licensing issues arose. The intent to use source governed by impossibly contradictory terms, I would believe, is less to do with intentional circumvention of licensing, and more to do with a lack of understanding of licensing that is poorly crafted and unclear. As such, the user/developer made a choice to use OSS with a vague understanding of the usage terms. The maintainer of the same project may have attempted to modify the license in order to refine his vision for usage of the project. The maintainer probably does not realize that the license legally renders the code unusable. In the spirit of OSS usage, I believe that it is the goal of the community to publicly disclose these licenses, and in the event that a developer has started using such code, it is in the best interest of both the maintainer and the user to get together, rationalize the licensing, and publicly disclose the mistake, just as if it was a bug.


Contradictory licensing - external (multiple license files or references)

I have seen numerous examples of inconsistency between what is disclosed on the project home page, in the compiled code, and in the source tree. Many of these inconsistencies became apparent as the GPLv3 license came out. Some projects intentionally did not want to support the GPLv3; others publicly wanted to release under GPLv3. In researching for GPLv3, links from the project page would take a user to the GPLv3 license page at FSF. The compiled code had the GPLv2 license text in many cases, but contained links to the GPLv3. The source would contain an unpredictable array of references, either to explicit license text at the developer's site, to text embedded within a file in the source, or to third party sites, like FSF. What is interesting is that the license references in many cases deviated from the stated licensing on the home page and from the licensing references in the readme file of the compiled code, and in some cases were even incorrect internally. (Keep in mind that I was looking at versions pending release in cases of unclear licensing - I was not always inspecting the source for the binaries currently available).

A much more common contradiction that we see all the time is the case where the GPLv2 license was included with the source, the GPLv2 "or later" is referenced in the Readme.txt file for the compiled project, and the COPYING file contains the GPLv3 text. Where multiple valid references to multiple valid licenses exist, which license governs? Where multiple licenses are each valid by themselves but contradict one another, which license governs? Whose responsibility is it to deal with externally incompatible licensing, the user or the maintainer? Can the user "selectively" choose to use the license of their preference, since multiple choices are presented in parallel?


I believe that where multiple choices of terms and conditions of use are presented, with no "obvious bias" towards any of them, the developer has been given the opportunity to select the terms of use from the distinct options given, and the maintainer has inadvertently, or intentionally, released software under multiple parallel licenses. Clearly, once this is discovered, it creates issues with the maintenance of distinct code trees, as when a developer starts work on the BSD version of a project and makes that available, someone else distributes changes to it, and so on. At the same time, the maintainer is trying to maintain what he thought was a single GPLv3 tree.

In this case, the problem lies with the development community, but the fault lies with the maintainer. However, in the spirit of OSS, when we see this situation where multiple parallel licensing choices exist, we must reach out to the maintainer immediately in order to prevent conflicting simultaneous development paths from starting within the community. Within OSS, we realize that while a mistake may be the fault of the maintainer, it is the responsibility of the community to keep OSS development clear and on a common path.
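
The external contradiction described in this section can be spotted mechanically before any development starts. The following is only a minimal sketch in Python: the function name, the location labels and the license strings are hypothetical, and it assumes someone has already read each location and recorded which license it declares.

```python
from collections import defaultdict


def find_license_conflicts(declared):
    """Group locations by the license each one declares.

    `declared` maps a location label (hypothetical, e.g. "homepage")
    to the license name found there. Returns a dict mapping each
    license to its sorted locations when more than one distinct
    license is declared, and an empty dict when all locations agree.
    """
    by_license = defaultdict(list)
    for location, license_name in declared.items():
        by_license[license_name.strip()].append(location)
    if len(by_license) <= 1:
        return {}
    return {name: sorted(places) for name, places in by_license.items()}


# Illustrative data modeled on the case described above.
declared = {
    "homepage": "GPLv3",
    "Readme.txt": "GPLv2 or later",
    "COPYING": "GPLv3",
    "source headers": "GPLv2",
}
conflicts = find_license_conflicts(declared)
for name, places in conflicts.items():
    print(name, "declared in:", ", ".join(places))
```

A non-empty result is the cue to contact the maintainer. The sketch does not decide which license governs.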


Unclear or ambiguous terms

The fact that terms or provisions are unclear does not necessarily mean they are unenforceable; it just means one cannot have reliable expectations about what rights or restrictions exist as a result of those terms or provisions. The lack of reliable expectations goes against the entire purpose of having written agreements, so presumably written agreements with unclear terms will be weeded out by a natural selection-type process: the more times the unclear provisions don't do what the drafter wants them to do, the less likely a rational drafter will use similar terms in future agreements.

My personal opinion is that the project maintainer has the ultimate responsibility to: 1) carefully consider which license gives users/developers the desired rights and abilities; 2) choose a license that is easy for users/developers to understand; 3) make sure the project distribution complies with the license requirements; and 4) make the fact that a particular license governs a project obvious.

If a maintainer does not do these things, I don't see how he can complain later if he can't enforce some right in the license that he didn't care enough about to explain clearly.

It was surprising and sometimes quite frustrating to see all of the open source project pages that either had no mention of a license or only referred to a general license, like "GPL."

As an aside, the GPL seems to answer this parallel question. Right from the Terms and Conditions of the GPLv2 (para. 9) and with slight modification, the GPLv3 (para. 14):
"Each version is given a distinguishing version number. If the Program specifies that a certain numbered version of the GNU General Public License "or any later version" applies to it, you have the option of following the terms and conditions either of that numbered version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the GNU General Public License, you may choose any version ever published by the Free Software Foundation."

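The version rule in the quoted clause can be restated as a small function. This is a minimal sketch of one reading of that clause, not a legal interpretation. The function and parameter names are hypothetical, and the version list assumes the three GPL versions the FSF has published (1, 2 and 3).

```python
# GPL versions published by the Free Software Foundation (assumed list).
PUBLISHED_GPL_VERSIONS = (1, 2, 3)


def allowed_gpl_versions(specified=None, or_later=False):
    """Return the GPL versions a user may follow under the quoted clause.

    specified: the version number the program names, or None if the
               program only says "GPL" without a version.
    or_later:  True if the program adds "or any later version".
    """
    if specified is None:
        # No version named: any published version may be chosen.
        return list(PUBLISHED_GPL_VERSIONS)
    if specified not in PUBLISHED_GPL_VERSIONS:
        raise ValueError(f"unknown GPL version: {specified}")
    if or_later:
        # Named version or any later published version.
        return [v for v in PUBLISHED_GPL_VERSIONS if v >= specified]
    # A version named without "or later": only that version.
    return [specified]


print(allowed_gpl_versions())                  # project says only "GPL"
print(allowed_gpl_versions(2, or_later=True))  # "GPLv2 or any later version"
print(allowed_gpl_versions(2))                 # "GPLv2" only
```

Under this reading, a bare reference to "GPL" is the most permissive case, which is one more reason for a maintainer to name the version explicitly.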




Nonexistent license

I think the user has some responsibility, in the sense that a user needs to consider whether a project with no license, or with a license that is not obvious or is difficult to interpret, is worth using at all.

From a business sense, the clearer and more complete a license is, the better. This means rights, responsibilities and remedies are clear and predictable, which is usually what businesses and investors like. It is interesting that the GPLv2 and now GPLv3 are so popular, because I think some of the provisions are quite confusing, particularly to those unfamiliar with how software and code are actually built and interact. The fact is, though, both of those licenses are fairly exhaustive in terms of what and how they govern, and I think there is enough certainty there for developers and maintainers to choose to use them. So much discussion exists online about the terms and provisions that even if someone does not understand everything completely, that person can do a search and find endless analysis and comparisons that can lead to a reasonable understanding. This availability of analysis and discussion is a great benefit of Richard Stallman's and the FSF's polarizing evangelism of free software.

Is it fair to assume that because no license exists, the code is free, or public domain? I would reasonably guess that even if a license could not be found, a developer/user could not reasonably assume that the code in question was free. This is like walking by a jewelry store whose front door is unlocked, and "assuming" that everything within it is free.

A reasonable user of open source software needs to understand that the understaffed nature of the development of open source software depends on the honor and trust within the community for it to function. We cannot assume because a developer posted code after a long day, but neglected to include a license, that it is free. While the developer may have a difficult argument to make if such code was made available without restriction, in the spirit of OSS, unless a developer/user verified that this "found" code is in fact public domain, the developer/user could be stealing someone's hard work.

From a legal perspective, the nature of copyright in the United States is such that as soon as the creative "work" (code) is "fixed in a tangible medium of expression" (on a disk or storage medium) it is protected by US copyright law. In order for a creative work to be in the public domain it must either be declared so by its author or the creative work's copyright term must be expired. Currently this means the life of the author plus 70 years.



Summary

Going forward, I am sure that the "market" will drive which OSS projects succeed, and which ones become obsolete, due to risks associated with unclear, contradictory and nonexistent licenses. In the interim, I would like to suggest a common sense "rule of thumb" for qualifying if a project is licensed in a way clear enough that it can be used:

Ernie's Clear OSS Licensing Guidelines

1. Project must have a distinct homepage that is not a repository. Domains are cheap and user pages are free in many cases. This provides a place where the maintainer can manage a freeform data exchange with users, and is a site unique from any repository and its goals.

2. Maintainer needs to provide a URL to the license(s) used. I suggest http:///license.html. Under this file, identify the applications, versions and releases, all hyperlinked. Within the hyperlinks, allow a user to follow distinct hyperlinks to the specific license text. Name the URL for the license text using the name of the core license, like http:///licenses/license-Apache_License_Version_2.0.html

3. License text must reference, by URL, the file from which it was derived, like http://www.opensource.org/licenses/apache2.0.php

4. Project homepage MUST provide a link called LICENSES that goes to the main license page, described above.

5. Each download file MUST be compressed with a file called license.txt and license-.txt, along with a hash file of the text license file.

6. License.txt needs to describe the licensing hierarchy for licensing per application/version/release.

7. license-.txt includes the explicit text of the license, and includes pointers to where the SAME license can be found on the website, within the source code (by the same name), and in the binary.

8. license-.md5 MUST be included to allow absolute verification that the user has the license that was intended to be included with the specific release.

9. Unless the maintainer has a Juris Doctorate and specializes in software licensing law, stick with OSI licenses, or use them as the CORE. In the event that the maintainer uses the OSI license, congratulations. In the event that a maintainer chose to reinvent the wheel, identify such. In the license URL, change the URL to read as follows:
- OSI license: license-.txt
- modified OSI license: license-Modified_.txt (this requires identification of the core license that was modified)
- license newly developed by maintainer or third party: license-custom-.txt

10. Finally, a license should be basic, and easy enough to understand that a layperson could grasp the idea of it. Whether a maintainer customizes a license or chooses one that is already available, make sure it is easy to understand. Think of it this way: if you can't explain this to your mother, you probably won't find it any easier to explain to a judge.
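
The hash check behind guidelines 5 and 8 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the file names license-Example.txt and license-Example.md5 are hypothetical, and the hash file is assumed to hold the hex digest as its first whitespace-separated token, which is the layout the common md5sum tool writes.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of_file(path):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()


def license_matches_hash(license_path, hash_path):
    """Compare a license file against the digest stored in its hash file.

    Assumes the digest is the first whitespace-separated token.
    """
    expected = Path(hash_path).read_text().split()[0].lower()
    return md5_of_file(license_path) == expected


# Demonstration with throwaway files standing in for a real release.
with tempfile.TemporaryDirectory() as workdir:
    license_file = Path(workdir) / "license-Example.txt"
    hash_file = Path(workdir) / "license-Example.md5"

    # Maintainer side: ship the license text with its digest.
    license_file.write_bytes(b"Example license text.\n")
    hash_file.write_text(md5_of_file(license_file) + "  license-Example.txt\n")

    # User side: the shipped text verifies.
    unchanged = license_matches_hash(license_file, hash_file)

    # If the text is edited afterwards, verification fails.
    license_file.write_bytes(b"Edited license text.\n")
    tampered = license_matches_hash(license_file, hash_file)

print("unchanged license verifies:", unchanged)
print("edited license verifies:", tampered)
```

MD5 is used here because guideline 8 names it; the same pattern works with a stronger digest such as hashlib.sha256.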

In the end, a license for the use of open source software is an agreement between a maintainer/developer who created the project, and a user/developer who plans to use it. The agreement represents the rules of a relationship.

- Make it clear and easy to understand.
- Put copies of the same license in multiple places.
- Register your licensing for a release on your own site (described above), and register your release and its license at http://gpl3.palamida.com:8080/ , even if it is not GPLv3.
- Take responsibility for making your desires for use clear enough that others can comply without prohibiting use.
- For user/developers, if you see something that is unclear, contradictory, or not there, communicate, let the maintainer know. Cooperation by the community will ease the creation of simple and easy to understand licensing that can be verified.

Ultimately, it is the strength of the community that makes OSS succeed. If projects don't meet the basic measure of clear licensing as I define above, insist that the maintainer implement changes to support clear licensing, or do not support the OSS project by using it.




Ernest Park

Kevin Howard

The Research Team

