From ancient origins in the ill-fated Library of Alexandria through the Middle Ages and into modern copyright regimes, societies have long sought to preserve and catalog human knowledge and make it publicly accessible. For much of history, however, these goals have been elusive due to the cost of assembling and storing works, the impermanence of paper and ink, and the inherent limitations on access to physical copies.
Google’s bold announcement in December 2004 that it intended to scan, digitize, and make universally searchable the collections of leading libraries brought the timeless aspirations of enlightened societies within reach and marked the beginning of a new era for scholars, authors, and other users of recorded knowledge. For public domain works, users would be able to search, retrieve, and download the full documents. For works still under copyright protection, Google would provide Boolean search capability.
Just a decade ago, the cost and time required to digitize and render searchable even 10 percent of the vast stock of written human knowledge was thought to be prohibitive. Yet Google has committed to making extensive collections of some of the world’s leading libraries available within less than a decade and without any public expenditure.
Within months of this announcement, some publishers and the Authors Guild filed copyright infringement class actions against Google, casting a cloud of uncertainty over the “Google Books” initiative. After several years of pre-trial skirmishing, the parties reached a class action settlement of enormous scope in October 2008.
The agreement would authorize Google to continue to scan in-copyright books into its search database and enable users to search the full content of scanned books, subject to an opt-out provision. Google would display up to 20 percent of in-copyright books that are no longer commercially available, subject to various exceptions. The settlement would establish a Book Rights Registry (BRR), analogous to ASCAP and other collective rights organizations, which would manage the division of advertising revenue from Google Books, with 63 percent going to rightsholders (publishers and authors) and 37 percent remaining with Google.
The settlement also provides for expanded access to in-copyright books that are not commercially available, subject to printing, copying, and digital watermarking restrictions and subscription fees and printing charges for some users and institutions. The settlement would provide additional rights and responsibilities for four categories of partner libraries, but would be non-exclusive, thereby allowing libraries to pursue other digitization projects.
The agreement would dramatically expand effective access to human knowledge – on a scale comparable to the introduction of the printing press and the World Wide Web. The fears that this project would bring about widespread piracy of copyrighted works have largely abated. Judge Denny Chin, sitting in the Southern District of New York, is scheduled to hold a hearing on the settlement in less than a month’s time. So, it follows that the court should approve the settlement, perhaps subject to some adjustments. Well ... not so fast.
Dozens of interested parties have now submitted comments raising due process (principally representativeness of the class and notice), competition, copyright, international treaty compliance, and privacy concerns about the proposed settlement. While many of the concerns could potentially be ironed out, I believe that the settlement raises much more fundamental questions – questions relating to institutional choice and the larger policy balances underlying copyright law – which caution against judicial resolution of this matter.
The problem is two-fold: (1) No jurist – not even the world’s foremost copyright authority – has the breadth of knowledge, range of perspective, and policy-making authority to set the ground rules for one of the most important knowledge distribution platforms in history; and (2) the default structure for the settlement – the Copyright Act – was crafted without any recognition of the technological advances that make Google Books possible. Congress should reexamine and amend that framework before market and economic interests become entrenched.
My advice to Judge Chin is to declare a “cooling off” period (initially a year) to allow Congress, the Library of Congress, and the Executive Branch to undertake systematic study and review of how copyright law should be adapted to address such a fundamental advance in the technology for providing access to knowledge. The government should immediately assemble a study panel that is broadly representative of the interested parties and the public at large.
The Commission on New Technological Uses of Copyrighted Works (CONTU) provides a useful model. As Congress was completing substantial revision of the Copyright Act in the mid-1970s, the challenges of computer software and photocopying technologies emerged. CONTU was created to provide the president and Congress with recommendations concerning those changes in copyright law “to assure public access to copyrighted works used in conjunction with computer and machine duplication systems and to respect the rights of owners of copyrights in such works, while considering the concerns of the general public and the consumer.”
An analogous commission is needed to confront and prepare recommendations for digital preservation of and online access to published works. The proposed Google Books settlement provides some valuable potential ingredients for a new regime, but it works around rather than confronts several weaknesses of copyright law. It also privatizes some essential functions that might better be served by governmental entities. Congress should take this opportunity to address the orphan-works problem as well as to reinvigorate copyright law’s larger purposes, which include collecting, preserving, cataloging, and making accessible human knowledge in addition to the granting of exclusive rights.
As I suggested in an article before the Google Books settlement was reached, Congress should confront the potential opportunities and risks of digital technology preemptively and directly to strike the appropriate balance between protection of works of authorship on the one hand and accessibility and preservation on the other. See Peter S. Menell, Knowledge Access and Preservation Policy in the Digital Age, 44 Hous. L. Rev. 1013 (2007), available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=999801.
By focusing on the economic, social, and cultural benefits of building a comprehensive publicly searchable database of literary and artistic works, Congress can effectuate the overarching purposes of “promoting progress” and preserving human knowledge without sacrificing the beneficial economic incentives afforded by copyright law. A carefully crafted safe harbor with safeguards to prevent piracy of in-copyright works would fuel markets for copyrighted works (and search within those works) while making accessible the vast stock of knowledge to current scholars and authors. It would also preserve the largest possible record for future generations.
A “cooling off” period risks delaying some of the benefits of the Google Books project and the proposed settlement. But it should be recognized that resolving this issue in the courts could easily take years of appeals. Furthermore, there is already substantial progress on developing orphan-works legislation.
More fundamentally, we need to be mindful that decisions of this magnitude should be made through democratic institutions, not private settlements purporting to represent broad classes. Moreover, the impacts of the resolution of this controversy will be felt for decades or centuries.
This is not to say that Google does not deserve tremendous praise for its ingenuity and public spiritedness in pursuing this project. Google has undoubtedly hastened the day when the vast stock of the most valuable human knowledge is broadly accessible. Google’s efforts deserve commendation, but to allow Google effective monopoly control over the world’s most comprehensive knowledge repository would be out of proportion to those efforts. Such control would pose undue threats to long-term competition, innovation, and preservation of and access to knowledge.