Tools for Investigating Copyright Infringement

Tuesday 8 May 2007 by Bradley M. Kuhn

Nearly all software developers know that software is covered by copyright. Many know that copyright covers the expression of an idea fixed in a medium (such as a series of bytes), and that the copyright rules govern the copying, modifying and distributing of the work. However, only a very few have considered the questions that arise when trying to determine if one work infringes the copyright of another.

Indeed, in the world of software freedom, copyright is seen as a system we have little choice but to tolerate. Many Free Software developers dislike the copyright system we have, so it is little surprise that developers want to spend minimal time thinking about it. Nevertheless, the copyright system is the foremost legal framework that governs software[1], and we have to live within it for the moment.

My fellow developers have asked me for years what constitutes copyright infringement. In turn, for years, I have asked the lawyers I worked with to give me guidelines to pass on to the Free Software development community. I've discovered that it's difficult to adequately describe the nature of copyright infringement to software developers. While it is easy to give pathological examples of obvious infringement (such as taking someone's work, removing their copyright notices and distributing it as your own), it quickly becomes difficult to say definitively, in many real-world situations, whether some particular activity constitutes infringement.

In fact, in nearly every GPL enforcement case that I've worked on in my career, the fact that infringement had occurred was never in dispute. The typical GPL violator started with a work under GPL, made some modifications to a small portion of the codebase, and then distributed the whole work in binary form only. It is virtually impossible to act in that way and still not infringe the original copyright.

Usually, the cases of “hazy” copyright infringement come up the other way around: when a Free Software program is accused of infringing the copyright of some proprietary work. The most famous accusation of this nature came from Darl McBride and his colleagues at SCO, who claimed that something called “Linux” infringed his company's rights. We now know that there was no copyright infringement (BTW, whether McBride meant to accuse the GNU/Linux operating system or the kernel named Linux, we'll never actually know). However, the SCO situation educated the Free Software community that we must strive to answer quickly and definitively when such accusations arise. The burden of proof is usually on the accuser, but being able to make a preemptive response to even the hint of an allegation is always advantageous when fighting FUD in the court of public opinion.

Finally, issues of “would-be” infringement detection come up for companies during due diligence work. Ideally, there should be an easy way for companies to confirm which parts of their systems are derivatives of Free Software systems, which would make compliance with licenses easy. A few proprietary software companies provide this service; however, there should be readily available Free Software tools (just as there should be for all tasks one might want to perform with a computer).

It is not so easy to create such tools. Copyright infringement is not trivially defined; in fact, most non-trivial situations require a significant amount of both technical and legal judgement. Software tools cannot make a legal conclusion regarding copyright infringement. Rather, successful tools will guide an expert's analysis of a situation. Such systems will immediately identify the rarely-found obvious indications of infringement, bring to the forefront facts that need an exercise of judgement, and leave everything else in the background.
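
As a rough illustration of that three-way triage, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the function names, the line-based comparison using the standard library's difflib, and above all the two thresholds, which are arbitrary and carry no legal meaning. A high similarity score is only a signal for an expert to look closer; it is not a finding of infringement, and a low score does not rule infringement out.

```python
from difflib import SequenceMatcher

# Hypothetical thresholds, chosen only for illustration; they have no legal meaning.
OBVIOUS = 0.9
NEEDS_JUDGEMENT = 0.3


def normalize(source: str) -> list[str]:
    """Strip surrounding whitespace and drop blank lines, so that pure
    formatting changes do not affect the comparison."""
    return [line.strip() for line in source.splitlines() if line.strip()]


def similarity(candidate: str, reference: str) -> float:
    """Return a ratio between 0 and 1 describing how many normalized
    lines the two texts share, in order."""
    matcher = SequenceMatcher(
        None, normalize(candidate), normalize(reference), autojunk=False
    )
    return matcher.ratio()


def triage(candidate: str, reference: str) -> str:
    """Sort a pair of files into one of three buckets for a human reviewer."""
    score = similarity(candidate, reference)
    if score >= OBVIOUS:
        return "obvious"          # near-verbatim copy: show the expert first
    if score >= NEEDS_JUDGEMENT:
        return "needs judgement"  # partial overlap: requires expert analysis
    return "background"           # little overlap: leave in the background
```

For example, `triage("a\nb\nc\nd", "a\nb\nx\ny")` lands in the "needs judgement" bucket, since half of the lines match. A real tool would need far more than this (handling of renamed identifiers, reordered functions, binaries, and so on), and the judgement itself would still belong to the expert.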

In this multi-part series of blog entries, I will discuss the state of the art in these Free Software systems for infringement analysis and what plans our community should make for the creation of Free systems that address this problem.

[1] Copyright is the legal system that non-lawyers usually identify most readily as governing software, but the patent system (unfortunately) also governs software in many countries, and many non-Free Software licenses (and a few of the stranger Free Software ones) also operate under contract law as well as copyright law. Trade secrets are often involved with software as well. Nevertheless, in the Software Freedom world, copyright is the legal system of primary attention on a daily basis.

Posted on Tuesday 8 May 2007 at 11:30 by Bradley M. Kuhn.


This website and all documents on it are licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.

#include <std/disclaimer.h>
use Standard::Disclaimer;
from standard import disclaimer
SELECT full_text FROM standard WHERE type = 'disclaimer';

Both previously and presently, I have been employed by and/or done work for various organizations that also have views on Free, Libre, and Open Source Software. As should be blatantly obvious, this is my website, not theirs, so please do not assume views and opinions here belong to any such organization. Since I do co-own with my wife, it may not be so obvious that these aren't her views and opinions, either.

— bkuhn

ebb is a service mark of Bradley M. Kuhn.
