Prof. Jane C. Ginsburg, Columbia University School of Law*
May 30, 2016

Photographers, graphic designers, and photo agencies have expressed dismay over the systematic stripping of copyright management information (CMI), including author identification, from images uploaded to social media platforms.  Sometimes downstream users remove author identification, often unwittingly, as detailed in this recent blogpost by Denis Nazarov, https://blog.mediachain.io/the-gif-that-fell-to-earth-eae706c72f1f#.nkqaas1qe, showing how a graphic design initially posted with full author-identifying information went viral, but in the process lost all trace of the originating author.  Metadata-stripping not only deprives authors of the credit that can lead to fame and (often modest) fortune; by eliminating copyright information, it also complicates users’ efforts to clear rights in images and can lead to the works’ “orphanage.”

The role of social media platforms not only as hosts of CMI-removed copies, but also as actors that themselves remove authorship-identifying information (and CMI more generally) and make available data-stripped versions of works of authorship, especially photographs, deserves particular consideration.  For digital photographs, CMI metadata embedded in the files identifies, among other things, ownership, copyright, and contact information, as well as information about the contents of the photo.  Some metadata is embedded automatically upon the creation of a digital photo, and metadata can also be added in the post-production process, for example, when a photographer uploads to an image site, such as Getty Images.1  The International Press Telecommunications Council (IPTC) has conducted studies over the last four years assessing the extent to which various websites remove or modify photo files’ metadata.  IPTC metadata can include a wide range of information about the photo’s creation, including: creator, creator’s job title, contact information (address, phone number, e-mail address, and website), date created, credit line, instructions, source, copyright notice, and rights usage terms (among others).2
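
To make concrete what this embedded information looks like, the following minimal sketch (in Python, using the Pillow imaging library’s IPTC reader) pulls a few rights-relevant IPTC IIM fields out of an image file.  The file name is hypothetical, and the handful of fields shown is only a subset of the IPTC fields listed above.

    # Minimal sketch: read rights-relevant IPTC IIM fields embedded in a JPEG.
    # Assumes the Pillow library; "photo.jpg" is a hypothetical file name.
    from PIL import Image, IptcImagePlugin

    # IPTC IIM (record, dataset) numbers for a few rights-relevant fields.
    IPTC_FIELDS = {
        (2, 80):  "Creator (By-line)",
        (2, 110): "Credit Line",
        (2, 116): "Copyright Notice",
        (2, 120): "Caption / Description",
    }

    def read_cmi(path):
        """Return the rights-relevant IPTC fields embedded in the image, if any."""
        with Image.open(path) as img:
            iptc = IptcImagePlugin.getiptcinfo(img) or {}
        cmi = {}
        for key, label in IPTC_FIELDS.items():
            value = iptc.get(key)
            if value is None:
                continue
            # Values are raw bytes; repeatable fields come back as lists of bytes.
            if isinstance(value, list):
                value = b"; ".join(value)
            cmi[label] = value.decode("utf-8", errors="replace")
        return cmi

    if __name__ == "__main__":
        for label, value in read_cmi("photo.jpg").items():
            print(f"{label}: {value}")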

The IPTC study assessed various websites by uploading a photo with metadata and then ascertaining (among other things): whether the embedded metadata fields were shown by the web user interface; if so, whether the data displayed included the most relevant metadata fields (the “4 C’s”: caption, creator, copyright notice, and credit line); and whether an image downloaded through the website’s user interface (such as via a download button) retained the same information.  The websites tested included Facebook, Instagram, Flickr, Tumblr, Twitter, Pinterest, LinkedIn, Google Photos, Behance.net, and others.  Of the sites tested, only Behance.net included and displayed all of the rights-relevant fields and preserved that information for saved or downloaded images.  Several sites did not display metadata at all, and none but Behance displayed the “4 C’s.”3
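
The kind of per-site check the IPTC testing performs can be approximated in a few lines of code: read the fields embedded in the photo as originally uploaded and compare them with whatever remains in a copy re-downloaded from the platform.  The sketch below (again Python with the Pillow library; both file names are hypothetical) reports, for each of the “4 C’s,” whether the field was preserved, modified, or removed.

    # Sketch of a "4 C's" survival check: compare the IPTC fields of the original
    # upload with those of a copy re-downloaded from a platform.
    # Assumes the Pillow library; both file names are hypothetical.
    from PIL import Image, IptcImagePlugin

    FOUR_CS = {
        (2, 120): "Caption",
        (2, 80):  "Creator",
        (2, 116): "Copyright Notice",
        (2, 110): "Credit Line",
    }

    def iptc_fields(path):
        """Return the raw IPTC data embedded in the file (empty dict if none)."""
        with Image.open(path) as img:
            return IptcImagePlugin.getiptcinfo(img) or {}

    original = iptc_fields("original_upload.jpg")
    downloaded = iptc_fields("downloaded_copy.jpg")

    for key, label in FOUR_CS.items():
        if key not in original:
            status = "not present in the original"
        elif original[key] == downloaded.get(key):
            status = "preserved"
        elif key in downloaded:
            status = "modified"
        else:
            status = "removed"
        print(f"{label}: {status}")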

If the social media platforms are themselves stripping metadata when users post the images, or if the programs they make available to other users to download the images remove the data, would the platforms be violating Section 1202 of the U.S. Copyright Act, on the protection of Copyright Management Information?  Assuming the metadata qualify as CMI, do the platforms intentionally remove that CMI, having reasonable grounds to know that the removal will induce, enable, facilitate, or conceal copyright infringement?  Caselaw under Section 1202 indicates that actual or constructive knowledge of facilitation (etc.) may be inferred when the person or entity removing CMI invites or expects downstream recommunication of the work.4

The next question would be whether the platform’s CMI-removal is “intentional.”  It should not matter that the removal is automated and indiscriminate; setting the default to eliminate embedded metadata, assuming this is a desired result and not merely an unanticipated by-product of some other function, represents a choice by the platform.  Nor is the overbreadth of the information-removal a mere unanticipated by-product.  Suppose the platform chooses to remove metadata in order to reduce file size, and thus speed up the communication of the content.  The metadata may include not only authorship and copyright information, but also non-CMI categories of information such as: camera, GPS location, exposure time, ISO, aperture value, brightness value, shutter speed value, light source, scene capture type, flash, and white balance.  Or, in order to protect user privacy, suppose the platform removes metadata regarding location information, such as the GPS coordinates of a house, school, or place of work depicted in the photo.  The presence of lawfully removable non-CMI data such as the elements posited above should not entitle the platform or website to bootstrap the removal of the author-identifying information onto the removal of that other data.  Platforms can in fact design their systems to eliminate lawfully removable non-CMI data while keeping the author-identifying information and other CMI, as the sketch below illustrates.  The platform’s intent need not be manifested as to individual works; it can also be exercised through systems design.
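
That this kind of selective handling is technically feasible follows from the way the data are stored: in a JPEG file, the EXIF block that carries GPS coordinates and camera settings sits in a segment separate from the one that holds the IPTC rights fields.  The sketch below (Python, using the piexif library; file names are hypothetical) empties the GPS and camera-setting portions of the EXIF data while leaving the copyright-relevant EXIF tags and the IPTC segment untouched.  It is an illustration of the design point, not a description of any platform’s actual system.

    # Sketch: strip privacy- and bandwidth-relevant EXIF data (GPS, camera
    # settings) from a JPEG while preserving author-identifying information.
    # Assumes the piexif library; file names are hypothetical.
    import shutil
    import piexif

    def strip_non_cmi_metadata(src_path, dst_path):
        """Copy a JPEG, dropping GPS and camera EXIF but keeping rights data."""
        exif_dict = piexif.load(src_path)

        # Remove GPS coordinates (the privacy-motivated removal posited above).
        exif_dict["GPS"] = {}
        # Remove camera/exposure data (exposure time, ISO, aperture, etc.).
        exif_dict["Exif"] = {}
        exif_dict["Interop"] = {}
        # The "0th" IFD is left alone: it can carry the EXIF Artist and
        # Copyright tags, which may qualify as CMI.

        shutil.copyfile(src_path, dst_path)
        # piexif.insert rewrites only the EXIF (APP1) segment of the copy;
        # the separate IPTC segment (APP13) is not modified.
        piexif.insert(piexif.dump(exif_dict), dst_path)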

Where the platform does not remove the data from copies residing on its website, but makes available to its users download programs that strip the data from the downloaded content, one may initially ask whether the person or entity removing the data is the platform or the user.  Does the user “make” the copy and remove the data in the process, or does the platform, as part of its distribution of the copy, remove the data?5  The user may not know, much less intend, that her downloaded copy has been deprived of CMI.  The platform, however, through its systems design choices, has effectively imposed CMI-removal, and might be directly or contributorily liable for Section 1202 violations.6

But would the platform nonetheless avoid Section 1202 liability on the ground that, as a host service provider, it enjoys immunity under Section 512(c) of the Copyright Act?  At first blush, Section 512(c) would not apply, because a Section 1202 violation is not quite the same thing as an “infringement of copyright” from which Section 512 relieves service providers of liability.7   Section 501 defines “an infringer of copyright” as “Anyone who violates any of the exclusive rights of the copyright owner as provided by sections 106 through 122 or of the author as provided in section 106A(a), or who imports copies or phonorecords into the United States in violation of section 602.”8  Section 1203, by contrast, sets out civil remedies for “a violation of section 1201 or 1202.”  While we have seen that Section 1202 violations are linked to copyright infringement, in that the knowing removal or alteration of CMI must also be done with actual or constructive knowledge that it will facilitate infringement, the prohibited conduct is not itself infringing, nor does it require that infringement in fact have occurred.  Under this reading, then, a host service provider finds no shelter under Section 512 for direct or contributory violation of Section 1202.

Even stretching Section 512 to cover the infringement-related conduct addressed in Section 1202, the next question would be whether the platform meets the threshold requirements set out in Section 512(i) to qualify for the immunity.  The provision makes “accommodation of technology” a “condition of eligibility” and states that “the service provider must accommodate and not interfere with ‘standard technical measures.’”9  It defines “standard technical measures” as

technical measures that: (1) are “used by copyright owners to identify or protect copyrighted works”; (2) “have been developed pursuant to a broad consensus of copyright owners and service providers in an open, fair, voluntary, multi-industry standards process”; (3) “are available to any person on reasonable and nondiscriminatory terms”; and (4) “do not impose substantial costs on service providers or substantial burdens on their systems or networks.”

If metadata such as IPTC information fits the statutory criteria, then platforms that remove it are not accommodating “standard technical measures” but are instead “interfering with” them, and therefore would be disqualified from claiming safe harbor protection under Section 512(c).  As for whether metadata regarding copyright information does constitute a “standard technical measure,” the Southern District of California in Gardner v. Cafepress Inc.10 found that summary judgment could not be granted to the defendant with respect to the second element (plaintiff’s metadata appeared consistent with the other statutory elements, and defendant did not seek summary judgment on that ground):

at a minimum, Plaintiff has offered sufficient evidence to create a dispute of material fact as to whether CafePress’s deletion of metadata when a photo is uploaded constitutes the failure to accommodate and/or interference with “standard technical measures.”  From a logical perspective, metadata appears to be an easy and economical way to attach copyright information to an image.  Thus, a sub-issue is whether this use of metadata has been “developed pursuant to a broad consensus of copyright owners and service providers.”  Accordingly, the Court cannot conclude, as a matter of law, that CafePress has satisfied the prerequisites of § 512(i).11

To date, there appears to be no further judicial assessment of whether author-identifying metadata constitutes a standard technical measure.  But the statutory language does not encourage sanguine expectations.  Because service providers’ participation in the development of the standard could disqualify them from immunity should they later fail to accommodate the resulting technical measure, service providers have every incentive to abstain from participating.  Their abstention defeats the development of a standard that meets the statutory requirements, and therefore leaves non-accommodating service providers’ statutory shelter undisturbed.

If CMI metadata is not yet a standard technical measure, then the metadata-removing platforms may qualify to invoke the safe harbor of Section 512(c), but they next must demonstrate that their activities are consonant with those the statute immunizes.  The principal issue would be whether metadata-stripping comes within the scope of “infringement of copyright by reason of the storage at the direction of a user of material that resides on a system or network controlled or operated by or for the service provider.”12  Data-stripping is not “storage”; it alters – at the instance of the host – the file the user directed to be stored on the host’s server.  Courts have interpreted “by reason of the storage” to encompass a broad range of activities additional to mere storage, for example reasoning that the immunity must also cover the communication of the stored material at the request of other users, otherwise the safe harbor would be ineffective.13

More broadly still, the Ninth Circuit has indicated that “a service provider may be exempt from infringement liability for activities that it otherwise could not have undertaken ‘but for’ the storage of the infringing material at the direction of one of its users.”14  Had the users not uploaded the files to the platform, the service provider could not have removed their metadata.  But such a “but for” construction risks bootstrapping into the immunity a good deal of conduct well in excess of the storage and communication of the user-posted content.  As the Gardner court observed, “This interpretation does not, however, give a service provider free rein to undertake directly infringing activities merely because it allows users to upload content at will.”15  By the same token, removal of CMI metadata, albeit automated, and perhaps undertaken to enhance the communication speed of the user-posted files or to protect user privacy, is nonetheless an activity the host engages in at its own initiative, one that is independent of the user’s “direction” to store and make available the posted content, and that initiative may in turn violate Section 1202.

Thus, if author-identifying and other copyright-relevant metadata constitutes statutorily protected CMI, and the platforms intentionally remove or alter it, having reasonable grounds to know that these acts will facilitate infringement by downstream users, then the platforms may be liable under Section 1202, and they will not enjoy immunity under Section 512(c), either because that provision does not apply to violations that are not “infringement of copyright,” or because metadata-stripping exceeds the immunity accorded for storage and recommunication of user-posted content.