Introduction
In recent legal developments, The New York Times has taken decisive action against OpenAI and Microsoft, filing a complaint in the Southern District of New York on December 27, 2023. The crux of the matter is the alleged use of the Times’s copyrighted works in the development of generative artificial intelligence (AI) products, specifically Microsoft’s Copilot (formerly Bing Chat) and OpenAI’s ChatGPT. This groundbreaking case, The New York Times Co. v. Microsoft Corp. et al., Case No. 1:23-cv-11195, prompts a pivotal question:
When a tech company utilizes copyrighted material, including news articles, investigations, and opinion pieces, to train an AI chatbot capable of engaging in conversations on a myriad of topics, does it constitute plagiarism or an infringement of intellectual property rights?
Case background
Issues explained
First and foremost, echoing recent AI copyright lawsuits, The Times contends that its rights were violated through the “scraping” of its articles.
This involves the digital scanning and replication of content, which was subsequently included in the extensive datasets used to train GPT-4 and other AI models—a practice commonly referred to as the “input” side of the alleged infringement.
Secondly, The Times’s lawsuit points to instances where OpenAI’s GPT-4 language model, versions of which power ChatGPT and Bing, seemingly produced detailed summaries of paywalled articles, such as Wirecutter product reviews, or even entire sections of specific Times articles. In essence, The Times argues that the alleged copyright violation extends beyond the input phase to the “output” generated by these AI tools.
Navigating the IPR dilemma
- Section 106 Exclusive Rights: Section 106 of the Copyright Act of 1976 grants copyright owners exclusive rights to reproduce their copyrighted material, prepare derivative works from it, and distribute it. It also covers the rights to publicly perform and display works, with specific provisions for various types of creative works and sound recordings. The New York Times, as the copyright owner, holds these Section 106 rights in its articles.
- Section 501(a) Copyright Infringement Definition: Section 501(a) of the Copyright Act of 1976 defines an infringer as anyone who violates the exclusive rights of a copyright owner or author, as outlined in Section 106. If OpenAI and Microsoft used The New York Times’ material without permission, as alleged, they could be considered infringers under Section 501(a) for violating the exclusive rights granted by Section 106.
- Section 506(a) Criminal Infringement Criteria: Section 506(a) of the Copyright Act of 1976 outlines criteria for criminal infringement, requiring proof of a valid copyright, wilful infringement, and infringement for commercial advantage or private financial gain, with the infringer aware of that commercial purpose. The actions of OpenAI and Microsoft could be deemed criminal infringement under Section 506(a) only if the government established each of these elements.
- Section 107 Fair Use Provision: Section 107 of the Copyright Act of 1976 provides for fair use of copyrighted works for purposes such as criticism, comment, news reporting, teaching, scholarship, or research, exempting such use from copyright infringement. In the case of OpenAI and Microsoft, their defense may hinge on Section 107’s fair use provision, asserting that their use falls under fair use as it involves training AI models for innovative purposes, aligning potentially with research and development. However, the determination of fair use involves a careful consideration of factors such as the purpose and character of the use, the nature of the work, the amount used, and the impact on the market.
Fair use defence
Section 107 names the following purposes as examples of fair use:
- criticism
- comment
- news reporting
- teaching
- scholarship, or
- research.
The determination of fair use involves a careful consideration of four statutory factors:
- the purpose and character of the use
- the nature of the copyrighted work
- the amount and substantiality of the portion used, and
- the effect of the use on the potential market
Recent copyright case law has placed increased emphasis on the transformative nature of the secondary use under the first factor, with many courts recognizing technological innovations as sufficiently transformative to warrant fair use protection.
The latter three factors appear to favour The Times in this case, as The Times argues that its works are highly creative, OpenAI uses the entirety of The Times’s works, and there is a claimed impact on revenue. However, the pivotal point revolves around the first factor — the purpose and character of OpenAI’s use — and whether it is deemed “transformative.”
In the landmark Feist Publications case, the U.S. Supreme Court held that information devoid of a minimum level of original creativity is not eligible for copyright protection. In essence, copyrights safeguard creativity, not the process before or after creation. OpenAI’s argument centers on the new and distinct purpose of using articles for training and developing a language model, contrasting it with the simple act of reading or subscribing to news.
Way Ahead
Author:
Kosha Doshi, Final Year Student at Symbiosis Law School, Pune, and Legal Intern, Data Privacy and Digital Law, at EU Digital Partners.
Kosha is also a co-author of “Facial Recognition at CrossRoads: Policy Perspectives on Disruption and Innovation,” at the Closing the Gap 2023 | Emerging and Disruptive Technologies: Regional Perspectives Conference in the Hague, Netherlands.