Friday, February 20, 2009

Copyright in Databases

I'm going to have more to say about data, databases, and intellectual property rights in the coming months. This longish post provides a basic primer on how U.S. copyright law applies to databases.

A. Copyright

Copyright attaches to an original work of authorship that has been embodied in a fixed form. The “work” to which copyright attaches can be the structure of the database or a relatively small part of a database, including an individual data element, such as a photograph. It is therefore possible for a database to contain multiple overlapping copyrighted works or elements. To the extent that a database owner has a copyright, or multiple copyrights, in elements of a database, the rights apply only to those copyrighted elements. The rights are to reproduce, publicly distribute or communicate, publicly display, publicly perform, and prepare adaptations or derivative works.

1. Standards for obtaining copyright


a. Originality


Copyright protects only an author’s “original” expression, which means expression independently created by the author that reflects a minimal spark of creativity. A database owner may have a copyright in the database structure or in the user interface with the database, whether that be a report form or an electronic display of field names associated with data. The key is whether the judgments made by the person(s) selecting and arranging the data require the exercise of sufficient discretion to make the selection or arrangement “original.” In Feist Publications, Inc. v. Rural Telephone Service Company, the United States Supreme Court held that a white pages telephone directory could not be copyrighted. The data—the telephone numbers and addresses—were “facts” which were not original because they had no “author.” Also, the selection and arrangement of the facts did not meet the originality requirement because the decision to order the entries alphabetically by name did not reflect the “minimal spark” of creativity needed.


As a practical matter, this originality standard prevents copyright from applying to complete databases – i.e. those that list all instances of a particular phenomenon – that are arranged in an unoriginal manner, such as alphabetically or by numeric value. However, courts have held that incomplete databases that reflect original selection and arrangement of data, such as a guide to the “best” restaurants in a city, are copyrightable in their selection and arrangement. Such a copyright would prohibit another from copying and posting such a guide on the Internet without permission. However, because the copyright would be limited to that particular selection and arrangement of restaurants, a user could use such a database as a reference for creating a different selection and arrangement of restaurants without violating the copyright owner’s copyright.


Copyright is also limited by the merger doctrine, which appears in many database disputes. If there is only a small set of practical choices for expressing an idea, the law holds that the idea and expression merge, and the result is that there is no legal liability for using the expression.

Under these principles, metadata is copyrightable only if it reflects an author’s original expression. For example, a collection of simple bibliographic metadata with fields named “author,” “title,” and “date of publication” would not be sufficiently original to be copyrightable. More complex selections and arrangements may cross the line of originality. Finally, to the extent that software is used in a database, the software is protectable as a “literary work.” A discussion of copyright in executable code is beyond the scope of this entry.


b. Fixation


A work must also be “fixed” in any medium permitting the work to be perceived, reproduced, or otherwise communicated for a period of more than a transitory duration. The structure and arrangement of a database may be fixed any time that it is written down or implemented. For works created after January 1, 1978 in the United States, exclusive rights under copyright shower down upon the creator at the moment of fixation.


2. The Duration of Copyright


Under international treaties, copyright must last for at least the life of the author plus 50 years. Some countries, including the United States, have extended the length to the life of the author plus 70 years. Under U.S. law, if a work was made as a “work made for hire,” such as a work created by an employee within the scope of employment, the copyright lasts for 120 years from creation or 95 years from the date of publication, whichever expires first; an unpublished work is therefore protected for 120 years from creation.
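To make the arithmetic concrete, here is a minimal sketch of the U.S. terms described above. The function name and parameters are hypothetical, chosen only for illustration. Taking the earlier of the two work-made-for-hire dates when a work has been published is an assumption about how the two periods interact, and the sketch ignores details such as older works governed by different rules.

```python
from typing import Optional


def us_copyright_expiry_year(
    *,
    work_made_for_hire: bool,
    author_death_year: Optional[int] = None,
    creation_year: Optional[int] = None,
    publication_year: Optional[int] = None,
) -> int:
    """Rough estimate of the year a U.S. copyright term ends.

    Illustrative only: a simplified model of the terms in the text,
    not legal advice.
    """
    if not work_made_for_hire:
        # Ordinary works: life of the author plus 70 years.
        if author_death_year is None:
            raise ValueError("author_death_year is required for non-hire works")
        return author_death_year + 70

    # Works made for hire: measured from creation and publication.
    if creation_year is None:
        raise ValueError("creation_year is required for works made for hire")
    from_creation = creation_year + 120
    if publication_year is None:
        # Unpublished: 120 years from creation.
        return from_creation
    # Published: assumed to be the earlier of the two periods.
    return min(publication_year + 95, from_creation)


if __name__ == "__main__":
    print(us_copyright_expiry_year(work_made_for_hire=False, author_death_year=2009))
    print(us_copyright_expiry_year(work_made_for_hire=True, creation_year=2009,
                                   publication_year=2010))
```

For example, a database structure created by an employee in 2009 and published in 2010 would, under these assumptions, stay under copyright through 2105.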


3. Ownership and Transfer of Copyright


Copyright is owned initially by the author of the work. If the work is jointly produced by two or more authors, such as a copyrightable database compiled by two or more scholars, each has a legal interest in the copyright. When a work is produced by an employee, ownership differs by country. In the United States, the employer is treated as the author under the “work made for hire” doctrine and the employee has no rights in the resulting work. Elsewhere, the employee is treated as the author and retains certain moral rights in the work while the employer receives the economic rights in the work. Copyrights may be licensed or transferred. A non-exclusive license, or permission, may be granted orally or even by implication. A transfer or an exclusive license must be done in writing and signed by the copyright owner. Outside of the United States, some or all of the author’s moral rights cannot be transferred or terminated by agreement. The law on this issue varies by jurisdiction.


4. The Copyright Owner’s Rights


The rights of a copyright owner are similar throughout the world although the terminology differs as do the limitations and exceptions to these rights.


a. Reproduction


As the word “copyright” implies, the owner controls the right to reproduce the work in copies. The reproduction right covers both exact duplicates of a work and works that are “substantially similar” to the copyrighted work when it can be shown that the alleged copyist had access to the copyrighted work. In the United States, some courts have extended this right to cover even a temporary copy of a copyrighted work stored in a computer’s random access memory (“RAM”).

b. Public Distribution, Performance, Display or Communication

The United States divides the rights to express the work to the public into rights to distribute copies, display a copy, or publicly perform the work. In other parts of the world, these are subsumed within a right to communicate the work to the public.


Within the United States, courts have given the distribution right a broad reading. Some courts, including the appeals court in the Napster case, have held that a download of a file from a server connected to the internet is both a reproduction by the person requesting the file and a distribution by the owner of the machine that sends the file. The right of public performance applies whenever the copyrighted work can be listened to or watched by members of the public at large or a subset of the public larger than a family unit or circle of friends. Similarly, the display right covers works that can be viewed at home over a computer network as long as the work is accessible to the public at large or a subset of the public.


c. Right of Adaptation, Modification or Right to Prepare Derivative Works


A separate copyright arises with respect to modifications or adaptations of a copyrighted work so long as these modifications or adaptations are themselves original. This separate copyright applies only to these changes. The copyright owner has the right to control such adaptations unless a statutory provision, such as fair use, applies.


5. Theories of Secondary Liability


Those who build or operate databases also have to be aware that copyright law holds liable certain parties that enable or assist others in infringing copyright. In the United States, these theories are known as contributory infringement or vicarious infringement.


a. Contributory Infringement


Contributory copyright infringement requires proof that a third party intended to assist a copyright infringer in that activity. This intent can be shown when one supplies a means of infringement with the intent to induce another to infringe or with knowledge that the recipient will infringe. This principle is limited by the so-called Sony doctrine, by which one who supplies a service or technology that enables infringement, such as a VCR or photocopier, will be deemed not to have knowledge of infringement or intent to induce infringement so long as the service or technology is capable of substantial non-infringing uses.

Two examples illustrate the operation of this rule. In A&M Records, Inc. v. Napster, Inc., the court of appeals held that peer-to-peer file sharing is infringing but that Napster’s database system for connecting users for peer-to-peer file transfers was capable of substantial non-infringing uses and so it was entitled to rely on the Sony doctrine. (Napster was held liable on other grounds.) In contrast, in MGM Studios, Inc. v. Grokster, Ltd., the Supreme Court held that Grokster was liable for inducing users to infringe by specifically advertising its database service as a substitute for Napster’s.


b. Vicarious Liability for Copyright Infringement


Vicarious liability in the United States will apply whenever (1) one has control or supervisory power over the direct infringer’s infringing conduct and (2) one receives a direct financial benefit from the infringing conduct. In the Napster case, the court held that Napster had control over its users because it could refuse them access to the Napster server and, pursuant to the Terms of Service Agreements entered into with users, could terminate access if infringing conduct was discovered. Other courts have required a greater showing of actual control over the infringing conduct.

Similarly, a direct financial benefit is not limited to a share of the infringer’s profits. The Napster court held that Napster received a direct financial benefit from infringing file trading because users’ ability to obtain infringing audio files drew them to use Napster’s database. Additionally, Napster could potentially receive a financial benefit from having attracted a larger user base to the service.


6. Limitations and Exceptions


Copyright’s limitations and exceptions vary by jurisdiction. In the United States, the broad “fair use” provision is a fact-specific balancing test that permits certain uses of copyrighted works without permission. Fair use is accompanied by some specific statutory limitations that cover, for example, certain classroom uses and certain uses by libraries. The factors to consider for fair use are: (1) the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes; (2) the nature of the copyrighted work; (3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and (4) the effect of the use upon the potential market for or value of the copyrighted work. The fact that a work is unpublished shall not itself bar a finding of fair use if such finding is made upon consideration of all the above factors.


In countries whose copyright law follows that of the United Kingdom, a more limited “fair dealing” provision enumerates specific exceptions to copyright. In Europe, Japan, and elsewhere, the limitations and exceptions are specified legislatively and cover some private copying and some research or educational uses.


7. Remedies and Penalties


In general, a copyright owner can seek an injunction against one who is either a direct or secondary infringer of copyright. The monetary consequences of infringement differ by jurisdiction. In the United States, the copyright owner may choose between actual or statutory damages. Actual damages cover the copyright owner’s lost profits as well as a right to the infringer’s profits derived from infringement. The range for statutory damages is $750 to $30,000 per copyrighted work infringed. If infringement is found to have been willful, the upper limit of the range increases to $150,000 per work. The amount of statutory damages in a specific case is determined by the jury. There is a safe harbor from statutory damages for non-profit educational institutions if an employee reproduces a copyrighted work with a good faith belief that such reproduction is a fair use.
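Because statutory damages are assessed per work infringed, the exposure scales with the number of works copied. The following is a minimal sketch of that arithmetic using the figures given above; the function name is hypothetical, and the sketch ignores the educational safe harbor and any other adjustments a court might make.

```python
from typing import Tuple

# Per-work figures as stated in the text above.
MIN_PER_WORK = 750
MAX_PER_WORK = 30_000
MAX_PER_WORK_WILLFUL = 150_000


def statutory_damages_range(works_infringed: int, willful: bool = False) -> Tuple[int, int]:
    """Return the (low, high) statutory damages range in dollars.

    Illustrative only: multiplies the per-work range by the number
    of copyrighted works infringed.
    """
    if works_infringed < 0:
        raise ValueError("works_infringed must be non-negative")
    ceiling = MAX_PER_WORK_WILLFUL if willful else MAX_PER_WORK
    return works_infringed * MIN_PER_WORK, works_infringed * ceiling


if __name__ == "__main__":
    print(statutory_damages_range(1))
    print(statutory_damages_range(10, willful=True))
```

On these numbers, willfully copying ten separately copyrighted photographs out of a database would put the range at $7,500 to $1,500,000.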


A separate safe harbor scheme applies to online service providers when their database consists of information stored at the direction of their users. An example of such a database would be YouTube’s video sharing database. The service provider is immune from monetary liability unless the provider has knowledge of infringement or has control over the infringer and receives a direct financial benefit from infringement. The safe harbor is contingent on a number of requirements, including that the provider have a copyright policy that terminates the accounts of repeat infringers, that the provider comply with a notice-and-takedown procedure, and that the provider have an agent designated to receive notices of copyright infringement.
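The structure of this safe harbor can be read as a checklist: three requirements that must all be met, plus two disqualifying conditions. The sketch below encodes only the conditions named in the paragraph above. The class and field names are hypothetical, and the real statutory test has more requirements and nuance than a set of booleans can capture.

```python
from dataclasses import dataclass


@dataclass
class Provider:
    """Hypothetical summary of facts about an online service provider."""
    has_repeat_infringer_policy: bool
    follows_notice_and_takedown: bool
    has_designated_agent: bool
    knows_of_infringement: bool
    controls_infringer: bool
    receives_direct_financial_benefit: bool


def safe_harbor_applies(p: Provider) -> bool:
    """Simplified reading of the user-storage safe harbor described above."""
    # All listed requirements must be satisfied.
    meets_requirements = (
        p.has_repeat_infringer_policy
        and p.follows_notice_and_takedown
        and p.has_designated_agent
    )
    # Either knowledge, or control combined with a direct financial
    # benefit, takes the provider outside the safe harbor.
    disqualified = p.knows_of_infringement or (
        p.controls_infringer and p.receives_direct_financial_benefit
    )
    return meets_requirements and not disqualified


if __name__ == "__main__":
    compliant = Provider(True, True, True, False, False, False)
    print(safe_harbor_applies(compliant))
```

Note that control over the infringer alone does not disqualify the provider in this reading; it must be paired with a direct financial benefit.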


Case Examples


In cases arising after the Feist decision, the courts have faithfully applied the core holding that facts are in the public domain and free from copyright even when substantial investments are made to gather such facts. There has been more variation in the characterization of some kinds of data as facts and in application of the modicum-of-creativity standard to the selections and arrangements in database structures.


On the question of when data is copyrightable, a court of appeals found copyrightable expression in the “Red Book” listing of used car valuations. The defendant had copied these valuations into its database, asserting that it was merely copying unprotected factual information. The court disagreed, likening the valuations to expressive opinions and finding a modicum of originality in these. In addition, the selection and arrangement of the data, which included a division of the market into geographic regions, mileage adjustments in 5,000-mile increments, and a selection of optional features for inclusion, entitled the plaintiff to a thin copyright in the database structure.


Subsequently, the same court found that the prices for futures contracts traded on the New York Mercantile Exchange (NYMEX) probably were not expressive data even though a committee makes some judgments in the setting of these prices. The court concluded that even if such price data were expressive, the merger doctrine applied because there was no other practicable way of expressing the idea other than through a numerical value and a rival was free to copy price data from NYMEX’s database without copyright liability.


Finally, where data are comprised of arbitrary numbers used as codes, the courts have split. One court of appeals has held that an automobile parts manufacturer owns no copyright in its parts numbers, which are generated by application of a numbering system that the company created. In contrast, another court of appeals has held that the American Dental Association owns a copyright in its codes for dental procedures.


On the question of copyright in database structures, a court of appeals found that the structure of a yellow pages directory that included a listing of Chinese restaurants was entitled to a “thin” copyright, but that copyright was not infringed by a rival database that included 1,500 of the listings because the rival had not copied the plaintiff’s data structure. Similarly, a different court of appeals acknowledged that although a yellow pages directory was copyrightable as a compilation, a rival did not violate that copyright by copying the name, address, telephone number, business type, and unit of advertisement purchased for each listing in the original publisher’s directory. Finally, a database of real estate tax assessments that arranged the data collected by the assessor into 456 fields grouped into 34 categories was sufficiently original to be copyrightable.

Copyright and Linking

Periodically, I am asked to explain some feature of copyright law. When I do this in an email, I'm going to make it a practice of also posting the explanation here in case it's of use to others.

I was asked about what the copyright issues are with hyperlinks on the web. So, in US law, generally there is no copyright issue with linking because the link causes the person clicking on it to load a copy of the web site, but the person who posts the link is not making a copy, or displaying a copy, or distributing a copy so there's no copyright issue for the person posting the link. (And therefore, there's generally no legal theory that a site can use to stop someone from linking to their site, even if it's a so-called "deep" link or an in-line link). See Perfect 10 v. Amazon, Inc., 487 F.3d 701 (9th Cir. 2007).

The one exception is if the target site has material that infringes copyright on it. In that case, even though the person linking to the site is not directly infringing, they could be liable on the theory of indirect infringement - helping someone else to infringe copyright.

The one law that specifically deals with this is Section 512(d) of the Copyright Act, which creates a "safe harbor" for search engines and others who link to "online locations" with copyright infringing materials. As long as the search engine removes the link after receiving notice of the infringing materials, the search engine does not owe the copyright owner any money.

For more information, see the Chilling Effects site.

YouTube Tests Creative Commons Licenses

Very exciting news, as reported by Eric Steuer on the CC Blog:

Eric Steuer, February 12th, 2009

YouTube just made an incredibly exciting announcement: it’s testing an option that gives video owners the ability to allow downloads and share their work under Creative Commons licenses. The test is being launched with a handful of partners, including Stanford, Duke, UC Berkeley, UCLA, and UCTV.

We are always looking for ways to make it easier for you to find, watch, and share videos. Many of you have told us that you wanted to take your favorite videos offline. So we’ve started working with a few partners who want their videos shared universally and even enjoyed away from an Internet connection.

Many video creators on YouTube want their work to be seen far and wide. They don’t mind sharing their work, provided that they get the proper credit. Using Creative Commons licenses, we’re giving our partners and community more choices to make that happen. Creative Commons licenses permit people to reuse downloaded content under certain conditions.

Visit YouTube’s blog for information. And if you’re a partner who wants to participate, fill out the YouTube Downloads - Partner Interest form.

Wednesday, February 04, 2009

Renewed Attack on Open Access in Congress

As Peter Suber reports, yesterday Rep. John Conyers (D-MI) re-introduced the Fair Copyright in Research Works Act. This year it's H.R. 801 (last year it was H.R. 6845), and co-sponsored by Steve Cohen (D-TN), Trent Franks (R-AZ), Darrell Issa (R-CA), and Robert Wexler (D-FL).

The bill language has not changed. Neither has the fact that there is no reasonable basis in law or in fact to support this legislation. The NIH Public Access Policy is working. Although publishers have made vague assertions, claims that there are legal problems with the NIH policy have been discredited. Similarly, there is no evidence that the policy - with its allowance of an unduly long 12-month delay - has harmed scholarly communication in the biomedical sciences.

Indeed, it's really time to turn this conversation around. The United States' economy needs more than increased consumer spending to recover. We need to innovate, and innovation in basic research happens quicker and in more diverse directions in an open, networked environment. In a word, research should be linkable.

Wanna see? Do you have breast cancer or is there a woman in your life who does? Want to know more about the statistical risks? Thanks to the NIH Public Access Policy, I can simply suggest that you click here because your tax dollars supported the study.

Now that's just using the freedom to link to help quickly point you to an article or scientific letter you might want to read. But the real power of linkable science is that scientists would be able to use their computers to study the network of links to find otherwise hidden patterns in the research and to find otherwise hidden linkages between results in related but distinct fields of research or even in different disciplines. It's the power to process links that has made Google the leading search engine for the web. So why can't web technologies do for scientists what they do for web searchers looking to buy electronics or shoes? Because scientific information other than NIH funded research articles is not generally linkable!

So the path to linkable science and the innovations that will follow from processing the links is to release journal articles and associated data from the paywalls that surround them - either immediately through supply-side funded journals or after a short delay for subscription-based journals.

So, Chairman Conyers, with all due respect, the policy question is not whether Congress should act to deny scientists and taxpayers access to research funded by NIH, but rather, why should NIH-funded research articles be the only articles reporting federally-funded research that scientists and taxpayers like me can link to?