A summary of mainstream reporting, plus the facts and perspectives it leaves out. A more honest account of each story.
Photo: Martin Brown | Public domain | Wikimedia Commons

Scott Turow And Major Publishers Sue Meta Over Alleged Llama AI Copyright Infringement

Scott Turow and major publishers sued Meta on Tuesday, May 5, 2026, in federal court in New York, accusing it of copyright infringement over the company's Llama AI.

The complaint says Meta trained Llama on copyrighted books and journal articles copied from pirate sites LibGen and Anna's Archive, and it alleges Mark Zuckerberg personally authorized the practice. Plaintiffs propose a class covering "all legal or beneficial owners of registered copyrights" for books with ISBNs or articles with DOIs or ISSNs, and they seek statutory damages, a permanent injunction, and orders to destroy infringing training copies. Meta, through public affairs director Nkechi Nneji, said courts have found training AI on copyrighted material can be fair use and that the company will "fight this lawsuit aggressively."

The episode traces back to Meta's February 2023 release of Llama and reporting in August 2023 that the model was trained in part on Books3, a collection of more than 190,000 pirated books. Authors sued Meta in 2023, and unsealed documents in January 2025 showed the company stopped pursuing book licenses to lean on a fair use strategy. A federal judge ruled in June 2025 that training on copyrighted books could qualify as fair use, though related disputes over distribution and other claims have continued.

The complaint names specific works it says were used without permission, including Turow's Presumed Innocent, N.K. Jemisin's The Fifth Season, and other fiction and academic titles, and it seeks broad relief tied to those claims. Turow and Authors Guild CEO Mary Rasenberger called the alleged conduct "the most flagrant copyright breach in history," saying AI's future has been built "with stolen words."

The lawsuit against Meta by Scott Turow and major publishers underscores a growing concern among creators regarding the use of copyrighted materials in AI training. As highlighted by @nyike, Turow's distress reflects a broader sentiment that the unauthorized use of authors' works not only undermines their financial stability but also threatens the integrity of the publishing industry. This sentiment is echoed by @auths_alliance, which emphasizes the implications of the class-action lawsuit for copyright enforcement in the digital age.

The legal landscape surrounding AI and copyright is evolving, with courts previously ruling that training AI models on copyrighted materials can fall under fair use, as seen in the Kadrey et al. v. Meta ruling. However, as @codervibe__ warns, this lawsuit serves as a crucial reminder for AI developers about the legal risks associated with training data. The ongoing tension between technological advancement and copyright compliance raises fundamental questions about the future of content creation and the rights of authors in an increasingly AI-driven world.

AI and Copyright | Technology & Antitrust/Regulation | AI and Copyright Law | Technology and Courts | Publishing Industry

📊 Relevant Data

In a similar lawsuit filed in 2023 (Kadrey et al. v. Meta Platforms Inc.), a court ruled in June 2025 that training the Llama model on copyrighted books constitutes fair use, dismissing some claims while allowing others related to distribution to proceed.

An update on AI copyright cases in 2026 — Norton Rose Fulbright

Meta's Llama 3 model was pretrained on over 15 trillion tokens collected from publicly available sources.

Introducing Meta Llama 3: The most capable openly available LLM to date — Meta AI

In Bartz et al. v. Anthropic, a court ruled in June 2025 that training on copyrighted books constitutes fair use but storing pirated copies does not, leading to a $1.5 billion settlement including an estimated $3,000 per work payout.

An update on AI copyright cases in 2026 — Norton Rose Fulbright

The LibGen database contains over 7.5 million pirated books.

The LibGen Data Set - What we are doing — Pan Macmillan

📌 Key Facts

  • The complaint was filed Tuesday, May 5, 2026, in the U.S. District Court for the Southern District of New York.
  • The suit alleges Meta used copyrighted books and journal articles copied from pirate sites LibGen and Anna's Archive to train its Llama language model, and that Mark Zuckerberg personally authorized the practice.
  • A Meta employee quoted in the complaint said, “If we license once single book, we won't be able to lean into the fair use strategy,” and the filing says Meta stopped pursuing licenses in April 2023.
  • The plaintiffs propose a class defined as "all legal or beneficial owners of registered copyrights" covering any book with an ISBN or journal article with a DOI or ISSN, a definition that would greatly expand potential class size.
  • The complaint lists specific allegedly infringed works, including Scott Turow's 'Presumed Innocent,' Douglas Preston's 'Impact,' Peter Brown's 'The Wild Robot,' N.K. Jemisin's 'The Fifth Season,' and Lemony Snicket's 'Who Could That Be at This Hour?,' along with research and academic titles.
  • The plaintiffs seek statutory damages, a permanent injunction barring Meta from further use of the works, and an order to destroy all infringing copies used in training.
  • Meta, through public affairs director Nkechi Nneji, responded that courts have found training AI on copyrighted material can qualify as fair use and said the company will “fight this lawsuit aggressively.”
  • Scott Turow and Authors Guild CEO Mary Rasenberger characterized the alleged conduct as "the most flagrant copyright breach in history," saying AI's future has been built "with stolen words."

📰 Source Timeline (2)

Follow how coverage of this story developed over time

May 05, 2026
10:00 PM
Meta trained its AI on copyrighted work, new lawsuit alleges
https://www.facebook.com/CBSMoneyWatch/
9:13 PM
Scott Turow's latest real-life legal thriller: Suing Meta for copyright infringement
NPR by Chloe Veltman
New information:
  • The NPR article confirms the complaint was filed Tuesday, May 5, 2026, in the U.S. District Court for the Southern District of New York.
  • The suit explicitly alleges Meta used copyrighted books and journal articles copied from pirate sites LibGen and Anna's Archive to train various iterations of its Llama language model, with Mark Zuckerberg's alleged personal authorization.
  • The complaint quotes a Meta employee as saying, "If we license once single book, we won't be able to lean into the fair use strategy," in explaining why Meta allegedly stopped pursuing licenses in April 2023.
  • The plaintiffs frame the potential class as "all legal or beneficial owners of registered copyrights" for any book with an ISBN or journal article with a DOI or ISSN, greatly expanding potential class size.
  • The complaint lists specific allegedly infringed works, including Turow's "Presumed Innocent," Douglas Preston's "Impact," Peter Brown's "The Wild Robot," N.K. Jemisin's "The Fifth Season," and Lemony Snicket's "Who Could That Be at This Hour?," in addition to research and academic titles.
  • The plaintiffs seek statutory damages, a permanent injunction barring Meta from further use of the works, and an order to destroy all infringing copies used in training.
  • Meta, via public affairs director Nkechi Nneji, responds that courts have found training AI on copyrighted material can qualify as fair use and says the company will "fight this lawsuit aggressively."
  • Scott Turow and Authors Guild CEO Mary Rasenberger provide new public characterizations of the case, calling the alleged conduct "the most flagrant copyright breach in history" and saying AI's future is being built "with stolen words."