Do Software Developers Understand Open Source Licenses?

Daniel A. Almeida and Gail C. Murphy (University of British Columbia), Greg Wilson (Rangle.io), and Mike Hoye (Mozilla Corporation)

Abstract. Software provided under open source licenses is widely used, from forming high-profile stand-alone applications (e.g., Mozilla Firefox) to being embedded in commercial offerings (e.g., network routers). Despite the widespread use of open source licenses, there has been little work on whether software developers understand the licenses they use. To help fill this gap, we conducted a survey that posed development scenarios involving three popular open source licenses (GNU GPL 3.0, GNU LGPL 3.0 and MPL 2.0), both alone and in combination.

A paper (pre-print) based on the results of this project was presented (slides) at the 25th IEEE International Conference on Program Comprehension (ICPC 2017) and won the ACM SIGSOFT Distinguished Paper Award.

Acknowledgements. We thank the participants of the survey for the time they spent and their insightful comments. We would especially like to thank the legal expert who served as our oracle for survey answers. This work was supported in part by NSERC and in part by the Institute for Computing, Information and Cognitive Systems (ICICS) at the University of British Columbia. We also thank Michalis Famelis for helpful comments on an early draft of the paper.

You can find a pdf copy of the survey at survey and the data package (including the aggregate results and the participants' comments) at data.

Survey

The survey consisted of six demographic questions, seven hypothetical software development scenarios, and four open-ended questions. The software development scenarios included a total of 45 cases. Each case required the participant to answer yes, no, or unsure (in which case an extra textbox appeared asking for further clarification). For scenarios involving the combination of licenses, the participant answered yes, no, or unsure for each possible combination of 2 licenses (e.g., GPL with GPL, GPL with LGPL, GPL with MPL, LGPL with GPL, LGPL with LGPL, and so on).
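As a rough check on how the 45 cases could break down, the sketch below enumerates the cases per scenario. It rests on an assumption read off the scenario wording rather than on anything stated here: that Scenarios 1, 5 and 7 vary a single license (three cases each), while Scenarios 2, 3, 4 and 6 vary an ordered pair of licenses (nine cases each). The names `SINGLE_LICENSE_SCENARIOS`, `LICENSE_PAIR_SCENARIOS` and `cases_for` are illustrative only.

```python
from itertools import product

LICENSES = ["GNU GPL 3.0", "GNU LGPL 3.0", "MPL 2.0"]

# Assumption: this grouping is inferred from the scenario text,
# not taken from the survey instrument itself.
SINGLE_LICENSE_SCENARIOS = [1, 5, 7]
LICENSE_PAIR_SCENARIOS = [2, 3, 4, 6]


def cases_for(scenario):
    """Return the list of cases (tuples of licenses) for one scenario."""
    if scenario in SINGLE_LICENSE_SCENARIOS:
        # One case per license of the library or code being used.
        return [(lic,) for lic in LICENSES]
    # One case per ordered pair: (license of the dependency,
    # license chosen for the resulting work).
    return list(product(LICENSES, repeat=2))


all_cases = {
    s: cases_for(s)
    for s in sorted(SINGLE_LICENSE_SCENARIOS + LICENSE_PAIR_SCENARIOS)
}
total_cases = sum(len(cases) for cases in all_cases.values())
print(total_cases)
```

Under this reading the total is 3 × 3 + 4 × 9 = 45, which is consistent with the number of cases reported above. The pairs are ordered because, for example, "GPL with LGPL" and "LGPL with GPL" are listed as separate cases.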

Results

The survey was started 825 times and completed 375 times. Below you can find the compact form of all scenarios (except Scenario 5*) and their aggregate results. To be able to analyze the survey responses, we needed to determine the correct answer for each case. We addressed this challenge by recruiting an American intellectual property lawyer as our legal expert. This expert has over a decade's specialization in patent reform, open source licensing, and related issues. The black stars indicate the answers of the legal expert.

* Scenario 5 was removed from our analysis once we noted a significant difference in the assumptions made by our legal expert and our participants, which led us to believe that the scenario was not stated clearly enough.

Scenario 1

John has been working on ToDoApp, his own personal task management application. ToDoApp is going to be a desktop-based application that will be used exclusively by John on his own computer. To make sure he does not lose any of his very special tasks, John is planning to use a lightweight library called LightDB to persist ToDoApp’s data.

If LightDB is distributed under {GNU GPL 3.0, GNU LGPL 3.0, MPL 2.0}, would John be allowed to use it as part of ToDoApp?

Scenario 2

Having used ToDoApp for three months, John realized how much his productivity had improved. To help other people manage their tasks just as efficiently, John has decided to make ToDoApp available as open source.

If LightDB, the lightweight library used to persist ToDoApp’s data, is distributed under {GNU GPL 3.0, GNU LGPL 3.0, MPL 2.0}, would John be allowed to make ToDoApp available under {GNU GPL 3.0, GNU LGPL 3.0, MPL 2.0}?

Scenario 3

After the success of the open source version of ToDoApp, John has decided to create a brand new commercial task management application: TaskPro. TaskPro is going to be built from scratch and use LightDB as a lightweight library to persist data.

If LightDB is distributed under {GNU GPL 3.0, GNU LGPL 3.0, MPL 2.0}, would John be allowed to make TaskPro commercially available under each of the {GNU GPL 3.0, GNU LGPL 3.0, MPL 2.0} licenses?

Scenario 4

As the lead developer of a new product at GreatSoftware Inc., Laura decided to use an existing authentication library she found on the Web called SafeAuth. She realizes that SafeAuth could be improved using a stronger cryptographic algorithm when storing users’ information. The product is going to be released under a commercial software license, but Laura would like to release the improved version of SafeAuth as open source.

If SafeAuth is distributed under {GNU GPL 3.0, GNU LGPL 3.0, MPL 2.0}, would Laura and her team be allowed to release the improved version of SafeAuth under each of the {GNU GPL 3.0, GNU LGPL 3.0, MPL 2.0} licenses?

Scenario 5

Laura, who works for GreatSoftware Inc., has changed the open version of SafeAuth found on the Web and added a new, stronger cryptographic algorithm to it. Despite Laura’s intentions to release the modified version of SafeAuth as open source, her manager sees a very strong competitive advantage for their products and decides not to release the modified version as open source.

Considering that the new product is going to be distributed under a commercial license, if SafeAuth is distributed under {GNU GPL 3.0, GNU LGPL 3.0, MPL 2.0}, can Laura and her team use the modified version as part of their new product?

Scenario 6

Shaoqing believes there are unhappy users out there willing to pay for a premium email client. To get to market faster, she decided to use an open source implementation of the Simple Mail Transfer Protocol (SMTP).

If the SMTP implementation is released under {GNU GPL 3.0, GNU LGPL 3.0, MPL 2.0}, would Shaoqing be allowed to fork the SMTP project and change the fork’s license to the {GNU GPL 3.0, GNU LGPL 3.0, MPL 2.0} license in order to use it in her commercial e-mail client?

Scenario 7

Shaoqing has been trying to optimize the way her email client handles old emails. Browsing on the Web, she found a fairly sophisticated implementation of a compression algorithm on a software developer’s blog that could be used on archived emails. The algorithm implementation has hundreds of lines of code and does not include an explicit license, but there is a copyright notice on the blog that states “All Rights Reserved”.

If Shaoqing used the source code she found on the blog in her email client, would she be allowed to distribute the email client commercially under the {GNU GPL 3.0, GNU LGPL 3.0, MPL 2.0} license?