What’s the deal with Open Source, Open Data, and Open Standards licenses?


In navigating the landscape of technology and digital innovation, we often find ourselves having to deal with complex concepts that cross between the technical, policy and legal domains. We hear words like “open source”, “open data”, and “open standards” thrown around, each with its attached notions of transparency, accessibility, and collaboration. However, these concepts aren’t as interchangeable as they might seem, and it’s crucial to understand the different rules that govern each.

Bear in mind too that the term “IP” or “intellectual property” is really a catch-all term for a bag full of different types of rights, including copyright (which can apply to software, as a so-called literary work), database rights, patents, trade marks, and more.

As someone who has served as an Open Source & Open Standards Strategy Director, worked for the Open Data Institute, and sat on the Open Standards Board for the UK Government, I’ve seen first-hand the nuances that define and differentiate these domains. This understanding is crucial as we cannot simply transfer licenses or IP guidelines from one sphere to another, owing to their unique features and scopes. This post was prompted by a few examples I’ve seen this year where people are either misunderstanding the differences between these domains, or conflating them.

Having said all that: I am not a lawyer. This is not legal advice. I did not say this. I am not here.

Open Source licenses: Code Reuse

The term “open source” generally refers to a type of software whose source code is accessible to the public, allowing anyone to inspect, modify, and distribute it. The crux of an open-source ecosystem is its license – the legal mechanism that dictates how the software can be used, modified, and shared.

There’s a common misconception that open source means “free” or “without restrictions”. However, each open source project is governed by one or more specific licenses, each of which lays out certain rules and permissions. There is a wide spectrum of licenses, conferring different rights and responsibilities. Some, like the MIT License, are permissive and give developers almost unrestricted freedom to use and re-use the software. Others, like the GNU General Public License (GPL), focus more on protecting the rights of the software’s users through so-called “copyleft” requirements, which mandate that any derivative work also be open sourced under the same license.
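To make the spectrum concrete, here is a minimal sketch that sorts a few licenses into the two families mentioned above. The table is hand-written for illustration and the function name is hypothetical; the keys follow SPDX license identifiers. It is not a substitute for reading the license text.

```python
# Illustrative only: a tiny hand-written table, not an authoritative classification.
LICENSE_FAMILIES = {
    "MIT": "permissive",
    "Apache-2.0": "permissive",
    "GPL-3.0-only": "copyleft",
    "GPL-3.0-or-later": "copyleft",
}


def describe(spdx_id: str) -> str:
    """Return a one-line, simplified summary for a license identifier."""
    family = LICENSE_FAMILIES.get(spdx_id)
    if family is None:
        return f"{spdx_id}: not in this table; read the license text"
    if family == "copyleft":
        return f"{spdx_id}: copyleft; derivative works carry the same license"
    return f"{spdx_id}: permissive; few conditions on re-use"


if __name__ == "__main__":
    for identifier in ("MIT", "GPL-3.0-only", "Some-Other-License"):
        print(describe(identifier))
```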

The Open Source Initiative (OSI), a global non-profit that champions open source, has created a widely recognized definition of Open Source and compiled a list of OSI-approved licenses.

Open Data licenses: API and Commercial Use

Open data is more than just freely accessible information; it’s about establishing clear guidelines on the use and distribution of this data. Whether data is provided as a database, a data dump, or via an Application Programming Interface (API), an open data license is vital to protect the data provider’s rights and to inform users about their rights and obligations.

Consider an API as a bridge connecting data providers and data consumers. By using an API, developers can access and use data provided by others in their own applications. However, the specifics of how this data can be utilized can vary greatly depending on the provider’s open data license. Some providers might allow free and unlimited use of their data, while others might impose certain restrictions, such as non-commercial use only or mandatory attribution.

For instance, consider a scenario where the data is used for commercial purposes. An open data license will clearly articulate whether such use is permissible, eliminating ambiguity and potential misuse. This is why open data is often licensed under a Creative Commons license. Creative Commons offers a suite of licenses catering to different needs – from the most open (CC0, effectively placing the data in the public domain) to licenses that require attribution, limit commercial use, or insist on sharing alike (the data and derivatives remain under the same license).
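The names of the Creative Commons licenses encode their conditions, which makes the suite easy to reason about. The sketch below reads a name such as “CC BY-SA” and lists the conditions it implies. The function names are my own, and the one-line descriptions are simplifications rather than the legal text.

```python
# Meaning of the element codes that appear in Creative Commons license names.
CC_ELEMENTS = {
    "BY": "attribution required",
    "SA": "derivatives must be shared under the same license",
    "NC": "non-commercial use only",
    "ND": "no derivatives may be shared",
}


def cc_codes(license_name: str) -> list[str]:
    """Split a name like 'CC BY-NC-SA 4.0' into its element codes."""
    parts = license_name.upper().split()
    if parts and parts[0] == "CC0":
        return []  # public-domain dedication: no conditions attached
    if len(parts) < 2 or parts[0] != "CC":
        raise ValueError(f"not a Creative Commons license name: {license_name!r}")
    codes = parts[1].split("-")
    unknown = [code for code in codes if code not in CC_ELEMENTS]
    if unknown:
        raise ValueError(f"unknown element codes: {unknown}")
    return codes


def cc_conditions(license_name: str) -> list[str]:
    """Return the simplified conditions implied by the license name."""
    return [CC_ELEMENTS[code] for code in cc_codes(license_name)]


def allows_commercial_use(license_name: str) -> bool:
    """True unless the license carries the NC (non-commercial) element."""
    return "NC" not in cc_codes(license_name)


if __name__ == "__main__":
    for name in ("CC0", "CC BY", "CC BY-SA", "CC BY-NC-SA 4.0"):
        print(name, cc_conditions(name), allows_commercial_use(name))
```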

In essence, using an open data license for APIs is providing a guide to data consumers, outlining how the data can and can’t be used downstream.
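One way to provide that guide is to state the data license in the API response itself, separately from the license of the software that serves it. The following is a minimal sketch under assumptions of my own: the field names, the provider name and the sample record are all hypothetical, and nothing here is a standard response format.

```python
import json

# Hypothetical values, chosen only for illustration.
SOFTWARE_LICENSE = "Apache-2.0"  # applies to the server code, not to the data
DATA_LICENSE = {
    "name": "CC BY 4.0",
    "url": "https://creativecommons.org/licenses/by/4.0/",
    "attribution": "Example Data Provider",
}


def build_response(records: list[dict]) -> dict:
    """Wrap records so the data license travels with every response."""
    return {"license": DATA_LICENSE, "data": records}


if __name__ == "__main__":
    body = build_response([{"station": "A1", "reading": 12.5}])
    print(json.dumps(body, indent=2))
```

Keeping the two declarations apart means a consumer never has to guess whether the code’s license was meant to cover the data too.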

(Creative Commons licenses are used for a broad range of non-software data, including music, films, and photographs.)

Open Standards licenses: The Interplay with IP and Patent Policies

Open standards are the glue that binds our global digital infrastructure, enabling interoperability across diverse systems and platforms. However, establishing an open standard goes beyond making technical specifications openly available; it also involves careful management of intellectual property (IP) rights, particularly in the form of patent policies that license standard-essential patents on a royalty-free basis.

Most Standards Developing Organizations (SDOs) have in place an IP policy that works in conjunction with open standards licenses. The rationale is simple yet profound – to promote wide adoption of the standard and prevent legal disputes that could arise due to patent infringement.

For example, when a group of contributors come together to develop a standard, those contributors might own underlying patents that would be necessary in order to implement the technology being standardized. To avoid future legal complications, the SDO’s IP policy may require contributors to disclose any such patents, making it clear which patents are being licensed to implementers and under what terms. Ideally, contributors agree to license these essential patents to their fellow contributors and downstream implementers on a royalty-free basis (i.e. no payment for the license itself). The royalty-free patent license is further bolstered when it includes reciprocity clauses, which require downstream implementers of the standard to license any of their own essential patents back to the contributors and other implementers. This ensures that anyone implementing the standard won’t be blindsided by unexpected patent licensing fees, thus promoting broader adoption, innovation and open source implementation.

Take the World Wide Web Consortium (W3C), for instance. W3C’s Patent Policy mandates royalty-free licensing for any patents held by participants that are necessary for implementing W3C standards. This policy works in tandem with W3C’s open standard license, and its open process, fostering a truly “open” and inclusive environment. Alternatively, the Community Specification License, developed by people at the Linux Foundation, provides a lightweight framework for developing independent open standards projects.

Navigating Between Domains

A critical point to remember is that while these domains all promote openness, they each have distinct rules, and a license written for one domain cannot simply be transferred or applied to another. A license suitable for open source software will rarely be suitable for open data and vice versa. Similarly, restrictions imposed by a license in one domain (for instance, a Creative Commons license on open data) won’t apply to another domain (like a software product using that data). Each domain has unique scope, characteristics and concerns that its licenses and IP laws are designed to address.

It’s essential to understand these differences and the individual IP nuances of each domain when venturing into the world of open source, open data, and open standards. This knowledge can help you use, contribute to, and benefit from these open domains more effectively and responsibly, and avoid misunderstandings.

To Summarize:

  • If you have a software product that consumes data from an open data source that is licensed under CC BY-SA, that does not obligate you to release your software under CC BY-SA, because CC BY-SA is not a software license: it limits itself to the content being licensed and does not impose obligations on the tools used to process that content.
  • If you write software that exposes an API and release that software under the Apache-2.0 license, that does not automatically confer that license on the data you’re exposing, because Apache-2.0 is not a data license. You need to explicitly signpost what data license you are using (e.g. CC0).
  • If you develop an open standard which can have multiple implementations, releasing this standard under only an open source or open data license leaves the intellectual property rights murky at best. Specifying a patent license policy in addition to copyright and trademark licenses will make the licensing far clearer for implementers and increase your chances of adoption. The Community Specification License can provide a lightweight framework for developing open standards, or consider doing your standards work in an SDO with some track record (and size) such as IETF or W3C.

Thank you for coming to my TED talk. I would like to thank Neil Brown, Jory Burson and Terence Eden for their help in reviewing and providing helpful feedback on this post. They are also not providing you legal advice.
