In an opinion by Judge Leval, the United States Court of Appeals for the Second Circuit has upheld Google's program of digitizing and copying the full text of books. Expanding on its decision last year in Authors Guild, Inc. v. HathiTrust, 755 F.3d 87 (2d Cir. 2014), the court held that Google's program was highly transformative and unlikely to substitute for any of the original works and, thus, was a fair use under the Copyright Act.

Background

In HathiTrust, the Second Circuit held that it was fair use for research universities to digitize full copies of millions of works to enable users to determine the number of times a particular word appears or to provide full replacement copies of the work to persons with disabilities. In doing so, the court held that HathiTrust's text-searchable database is a "quintessentially transformative" use in that "it does something more than repackage or republish the original copyrighted work" by "add[ing] something new, with a further purpose or different character, altering the first with new expression, meaning or message." Users do not see any of the work's actual text or images, and authors do not "write with the purpose of enabling text searches of their books." The court further explained, in rejecting the district court's analysis, that "a use does not become transformative by making an 'invaluable contribution to the progress of science and cultivation of the arts.'" "Added value or utility is not the test: a transformative work is one that serves a new and different function from the original work and is not a substitute for it."

Unlike the program at issue in HathiTrust, the Google Library Project involved more than just counting words. In agreements with several of the world's largest research libraries, Google created digital scans of over 20 million books that the libraries chose from their collections, extracted machine-readable text, and indexed the extracted text. The "vast majority" of the books Google digitized were non-fiction and "most" are out-of-print works. Google keeps the digitized books on its own servers and makes each book's information available to the submitting library, which agrees to use the digital copies only for non-infringing uses.

The Google Project enables public users to "enter search words or terms of their own choice, receiving in response a list of all books in the database in which those terms appear, as well as the number of times the term appears in each book." In addition, a "brief description" of each book is provided, together with "some rudimentary additional information, including a list of the words and terms that appear with most frequency in the book," and "sometimes" the response provides links to buy the book online and identifies libraries where the book can be found. As the court noted, "this identifying information instantaneously supplied would otherwise not be obtainable in lifetimes of searching."

The Google Project also displays a maximum of three snippets in each book containing the searched-for term. Each snippet "is a horizontal segment comprising ordinarily an eighth of a page," which for many book formats includes three lines of text. Searching for a term multiple times "will reveal the same three snippets" and "does not allow a searcher to increase the number of snippets revealed." Although searchers "can view more than three snippets of a book by entering additional searches for different terms," Google "makes permanently unavailable for snippet view one snippet on each page and one complete page out of every ten—a process Google calls 'blacklisting.'"

In addition, Google's search engine "makes possible new forms of research, known as 'text mining' and 'data mining.'" These provide the "frequency of word and phrase usage over centuries" and permit "users to discern fluctuations of interest in a particular subject over time and space by showing increases and decreases in the frequency of reference and usage in different periods and different linguistic regions." They also allow researchers "to comb over the tens of millions of books Google has scanned in order to examine 'word frequencies, syntactic patterns, and thematic markers' and to derive information on how nomenclature, linguistic usage, and literary style have changed over time."

The Court's Fair Use Analysis

With respect to the first of the four nonexclusive factors to be considered in determining whether a particular use of a copyrighted work is fair use (17 U.S.C. § 107)—the "purpose and character" of the defendant's use—the court emphasized that the more transformative a use or purpose is, the smaller the chance that the use "will serve as a substitute for the original or its plausible derivatives." Applying HathiTrust's definition of transformativeness—that a transformative use "communicates something new and different from the original or expands its utility"—the court held that Google's copying, digitization, and snippet display of scanned books serve a transformative purpose and strongly favor fair use because the project "augments public knowledge by making available information about Plaintiffs' books." (Emphasis by the court.) Critically, the court also held that the snippets displayed by Google were designed to show only enough context for searchers to evaluate whether to acquire the work. Lastly, as it had in prior cases, the court held that the fact that Google was a for-profit entity did not weigh heavily on the first factor because of the highly transformative nature of Google's use.

Turning to the second factor, the "nature of the copyrighted work," the court stated that the expression in factual works, like that in works of fiction, is entitled to protection and that, as in HathiTrust, the second factor should not be viewed in isolation; viewed together with the transformative purpose of the use, it favored fair use. This is so, "not because Plaintiffs' works are factual, but because the secondary use transformatively provides valuable information about the original, rather than replicating protected expression in a manner that provides a meaningful substitute for the original."

With respect to the third factor, "the amount and substantiality of the portion used in relation to the copyrighted work as a whole," the court held that Google satisfied the test with respect to the "search function" because, as in HathiTrust, it was "reasonably necessary" to "make use of the entirety of the works in order to enable the full-text search function." "If Google copied less than the totality of the originals, its search function could not advise searchers reliably whether their searched term appears in a book (or how many times)." As for Google's provision of "snippet views," what matters "is not so much 'the amount and substantiality of the portion used' in making a copy, but rather the amount and substantiality of what is thereby made accessible to a public for which it may serve as a competing substitute." (Emphasis by the court.) Here, the court held, the "fragmentary and scattered nature of the snippets revealed, even after a determined, assiduous, time-consuming search, results in a revelation that is not 'substantial,' even if it includes an aggregate 16% of the text of the book." The court did caution, however, that "[i]f snippet view could be used to reveal a coherent block amounting to 16% of a book, that would raise a very different question beyond the scope of our inquiry."

Finally, the court stated that the fourth factor—"the effect of the use upon the potential market for or value of the copyrighted work"—"focuses on whether the copy brings to the marketplace a competing substitute for the original, or its derivative, so as to deprive the rights holder of significant revenues because of the likelihood that potential purchasers may opt to acquire the copy in preference to the original." With respect to the search function, the court noted HathiTrust's finding that the fourth factor supported a finding of fair use "because the ability to search the text of the book to determine whether it includes selected words 'does not serve as a substitute for the books that are being searched.'" The court also found that "at least as snippet view is presently constructed," it, too, does not act as a competing substitute for the original works. "Snippet view, at best and after a large commitment of manpower, produces discontinuous, tiny fragments, amounting in the aggregate to no more than 16% of a book. This does not threaten the rights holders with any significant harm to the value of their copyrights or diminish their harvest of copyright revenue." Although the court "recognize[d] that the snippet function can cause some loss of sales," it held that it was insufficient to serve as a meaningful substitute for the copyrighted work:

There are surely instances in which a searcher's need for access to a text will be satisfied by the snippet view, resulting in either the loss of a sale to that searcher, or reduction of demand on libraries for that title, which might have resulted in libraries purchasing additional copies. But the possibility, or even the probability or certainty, of some loss of sales does not suffice to make the copy an effectively competing substitute that would tilt the weighty fourth factor in favor of the rights holder in the original. There must be a meaningful or significant effect "upon the potential market for or value of the copyrighted work." 17 U.S.C. § 107(4).

The court further noted that because an author's copyright "does not extend to the facts communicated by his book," but only to the manner of expression, if a database user takes only facts, not expression, then even this sort of lost sale "would not change the taking of an unprotected fact into a copyright infringement."

Next, the court disposed of plaintiffs' remaining arguments regarding harm, holding that there is no "derivative right in the application of search and snippet view functions to their works":

Google safeguards from public view the digitized copies it makes and allows access only to the extent of permitting the public to search for the very limited information accessible through the search function and snippet view. The program does not allow access in any substantial way to a book's expressive content. Nothing in the statutory definition of a derivative work, or of the logic that underlies it, suggests that the author of an original work enjoys an exclusive derivative right to supply information about that work of the sort communicated by Google's search functions.

The court also rejected plaintiffs' argument that "there exist, or would have existed, paid licensing markets in digitized works" because, unlike the Copyright Clearance Center licensing regime for making photocopies of journal articles, the Google Project permits users to obtain only "limited data about the contents of the book, without allowing any substantial reading of its text." Nor were licensing programs for telephone ringtones on point, since "the snippet function does not provide searchers with any meaningful experience of the expressive content of the book," as opposed to ringtones, which "are selected precisely because they play the most famous, beloved passages of the particular piece."

In addition, the court rejected plaintiffs' argument that "Google's storage of its digitized copies of Plaintiffs' books exposes them to the risk that hackers might gain access and make the books widely available, thus destroying the value of their copyrights." Recognizing that "this claim has a reasonable theoretical basis," the court held that it was "not supported by the evidence." Google used the same secure servers it uses for its own corporate information, and plaintiffs failed to establish that these servers were not sufficiently secure against inadvertent or malicious release of files.

Finally, the court rejected plaintiffs' argument of contributory infringement as based on "nothing more than a speculative possibility" that a participating library may, in breach of its agreement with Google, use its digital copy in an infringing manner, or may fail to "maintain security over its digital copy with the consequence that the book may become freely available as a result of the incursions of hackers."

Conclusion

The court's opinion in Google adds further clarity to the meaning of "transformative" purpose. Judge Leval's insight that fair use protects the collection and use of the information in a copyrighted work, in addition to the traditionally understood transformation of the work itself (e.g., using quotes in a book review), is a major contribution to the law of fair use. The next battleground may be over the extent to which businesses may rely on fair use to protect the creation of in-house digitized databases of copyrighted works from their libraries or that they otherwise possess, including limited searches of such databases and the display of snippets from them.

The case is Authors Guild v. Google, Inc., 804 F.3d 202 (2d Cir. 2015).