AI Copyright Lawsuits

From UBC Wiki
According to Reuters Legal, in August 2025 Perplexity AI failed to convince a judge to dismiss a lawsuit over its alleged misuse of articles to train its AI...

Introduction

US copyright law says the term "author" ... excludes non-humans. However, see this ruling on a graphic novel, where the selection and arrangement of images IS copyrightable, but not the individual images (made with generative AI).

The aim here is to assist library and information professionals in understanding the basic issues of AI copyright infringement and/or legal actions. Check with your scholarly communications and/or copyright librarian for more contextual, accurate information.

AI copyright lawsuits primarily involve claims of infringement arising from the use of copyrighted works as AI training data, along with allegations that AI-generated content infringes existing copyrights. Landmark cases include Authors Guild v. OpenAI, New York Times v. Microsoft and OpenAI, RIAA v. Suno AI, and the 2025 Anthropic settlement, in which Anthropic agreed to pay $1.5 billion to authors for pirating books to train its AI. The central issue is whether AI's use of copyrighted material falls under fair use (US) or fair dealing (Canada), doctrines that allow unauthorized use of copyrighted works under certain conditions.

In Canada, the Copyright Act is intended to be technologically neutral, but Canadian courts will need to interpret and apply copyright law in relation to the training and use of AI products. Millions of copyrighted works are used to train generative AI tools, so there are copyright implications for users and academic librarians. Still, some academics who use AI feel it enhances teaching, learning, and research, and believe it will make education more accessible for learners with diverse needs (UNESCO, 2024). The truth is that AI brings both benefits and risks in academia, so it is important to consider them critically (UNESCO, 2024), especially copyright infringement and intellectual property issues.

For a US perspective of litigation, see DAIL – the Database of AI Litigation.

Government Consultations and Future Reforms in Canada

  • 2025 "What We Heard" Report (Feb. 2025): AI industry wants flexibility; creators demand opt-outs and royalties; no proposals yet. Roundtables continue.
  • Broader Context: Ties into Canada's AI strategy ($2.4B investment in 2024) and Voluntary Code of Conduct (2023). Bill C-27 (AIDA) regulates AI risks but not copyright directly. Reforms may align with U.S./EU models, potentially via 2026 amendments.
  • 2023 Consultation (Copyright in the Age of Generative AI): sought input on text and data mining (TDM), authorship, and liability; closed Dec. 2023 with ~1,000 responses.
  • 2021 Consultation: Deemed AI issues "premature"; focused on general framework.

Copyright Developments Related to AI

Recent court decisions related to the use of copyrighted materials in AI training data are changing the way we view fair dealing in Canada and fair use in the United States. LLM training on copyrighted texts is now seen by at least one court as "fair" use, a major gain for AI developers. However, other courts disagree, and the use of pirated sources to train AI models remains problematic. There is still no final precedent stating that all AI training is or is not fair, and legal viewpoints are still evolving. In May 2025, the US Copyright Office said, “These groundbreaking technologies should benefit both the innovators who design them and the creators whose content fuels them, as well as the general public.”

Queen's University. Copyright Considerations for the 4 Stages of Generative AI: Training, Input, Analysis, and Output, 2025

Librarians are asking AI developers about their use of copyrighted materials to train their models. Further, which AI-generated works get copyright protection? Several copyright cases have been launched against AI companies. The Canadian and US governments have received considerable pressure from the public and industry to address the myriad legal and ethical issues related to AI. Many groups, including the Association of Research Libraries, advocate for the benefits of artificial intelligence in promoting creativity and innovation. Brigitte Vézina (2023) from Creative Commons said “copyright is just one lens through which to consider AI.”

2025

  • A consortium of 14 publishing companies, including The Atlantic, Condé Nast, and Forbes, filed suit against Cohere for copyright infringement, accusing it of using their content for training without permission and of displaying substantial portions, or entire articles, of their content without permission.
  • Perplexity hit by copyright suit. Encyclopedia Britannica and Merriam-Webster are suing Perplexity AI, claiming it copied their material without permission for its "answer engine," which performs searches for users and summarizes its findings. The lawsuit further alleges that Perplexity hurt the reference companies' revenue by redirecting users to those summaries, rather than to their content. Perplexity — which just reportedly completed a funding round that raised its value to $20 billion — is facing a similar suit from Dow Jones and the New York Post.
  • Anthropic has agreed to pay at least $1.5 billion (about $3,000 for each book) to a group of authors and publishers in the largest copyright settlement in U.S. history. The settlement marks a turning point in the clash between AI companies and content owners; it could alter how training data is sourced and inspire similar lawsuits and new licensing deals.
  • The settlement of the Anthropic class action lawsuit has significant implications for librarians in the context of copyright and the use of AI in handling copyrighted materials. US authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson alleged that Anthropic illegally downloaded seven million books from pirate websites such as LibGen and PiLiMi to train its AI chatbot, Claude, infringing on authors’ copyrights. The case raises critical issues for librarians.
  • While AI offers transformative potential for library services, the risks of copyright infringement and financial liability are significant. Librarians must consider the broader implications of AI companies’ practices, which could reshape the publishing industry and affect library budgets for digital content. Librarians should consult legal experts when implementing AI tools that process copyrighted materials. They should ensure that AI training data used in library systems is sourced legally through partnerships with publishers or licensed databases, and avoid retaining unauthorized copies of works.
  • On August 7, 2025, the Library Copyright Alliance (LCA), with the Authors Alliance, Electronic Frontier Foundation, and Public Knowledge, filed a motion for leave to appeal the class certification order issued in the lawsuit Anthropic is defending over the use of books for AI training.
  • In July 2025, Vancouver-based author J.B. MacKinnon filed four proposed class action lawsuits against major technology companies, alleging they used copyrighted works by Canadian writers to train artificial intelligence systems.
  • In June 2025, a US court ruled in favour of Anthropic in a major copyright case, deciding that training AI models constitutes fair use under US law. However, the ruling was more nuanced than a blanket endorsement. While the judge found AI training was fair use, Anthropic faces trial for allegedly using pirated books. The minimum statutory penalty for this type of copyright infringement is $750 per book. With 7 million titles in Anthropic’s pirated library, the company could face billions in potential damages.
  • In May 2025, rulings favoured Meta and other AI companies in cases where training data came from unauthorized piracy websites, and the US Copyright Office signalled plans to weigh in on broader AI policy. This came after the US administration’s firing of Shira Perlmutter shortly after the Copyright Office published a report on how copyright law, and the fair use doctrine, should apply to the use of copyrighted works in training generative AI models. Perlmutter has filed a lawsuit arguing that her dismissal was unconstitutional and violated the separation of powers.
  • In January 2025, the US Copyright Office provided guidance on works containing AI-generated material and reported that more than 1,000 registered works contain some AI-generated material. This highlights how copyright has become the central arena in which courts, infrastructure providers, and regulators are negotiating the future of cultural production in the age of generative AI.

2022-2024

  • In 2023 and 2024, OpenAI faced multiple lawsuits for alleged copyright infringement against authors and media companies whose work was used to train some of OpenAI's products. In November 2023, OpenAI's board removed Sam Altman as CEO, citing a lack of confidence in him, but reinstated him five days later following a reconstruction of the board. Throughout 2024, roughly half of AI safety researchers left OpenAI, citing the company's prominent role in an industry-wide problem.
  • A November 2022 class action lawsuit against Microsoft, GitHub and OpenAI alleged that GitHub Copilot, an AI-powered code editing tool trained on public GitHub repositories, violated the copyright of the repositories' authors, noting that the tool was able to generate source code which matched its training data verbatim, without providing attribution.
  • In January 2023 three US artists—Sarah Andersen, Kelly McKernan, and Karla Ortiz—filed a class action copyright infringement lawsuit against Stability AI, Midjourney, and DeviantArt, claiming these companies have infringed the rights of millions of artists by training AI tools on five billion images scraped from the web without the consent of the original artists.
  • In January 2023, Stability AI was sued by Getty Images for using 12 million images in their training data without purchasing a license. Getty filed another suit against Stability AI in the US in February 2023 alleging copyright infringement for use of Getty's images in the training of Stable Diffusion, and said the model infringes Getty's trademark by generating images with Getty's watermark.
  • In July 2023, authors Paul Tremblay and Mona Awad filed a lawsuit in a San Francisco court against OpenAI, alleging that its ChatGPT language model had been trained on their copyrighted books without permission, citing ChatGPT's "very accurate" summaries of their works as evidence.
  • The Authors Guild, on behalf of 17 authors, filed a copyright infringement complaint against OpenAI in September 2023, claiming "the company illegally copied the copyrighted works of authors" in training ChatGPT.
  • The New York Times sued Microsoft and OpenAI in December 2023, claiming their engines were trained on NYT articles and fair use claims made by AI companies were invalid since generated information around news stories directly impacts the newspaper's commercial opportunities.
  • The Recording Industry Association of America (RIAA) and major music labels sued Suno AI and Udio, AI models that take text to create songs with lyrics and backing music, alleging both AI models were trained without consent from labels.
  • In September 2024, a German court dismissed a German photographer's lawsuit against the non-profit organization LAION for the unauthorized reproduction of his copyrighted work while creating a dataset for AI training. The plaintiff has filed an appeal against the decision.
  • Several Canadian news agencies under News Media Canada sued OpenAI in November 2024 for copyright violations related to the use of their news articles to train ChatGPT. They are seeking damages of up to CDN$20,000 per news article used for training.

Presentation

Note: Attorney Jonathan Band, counsel for the Library Copyright Alliance, and Cliff Lynch, Executive Director of the Coalition for Networked Information (CNI). The video is for informational purposes only. Any claims should be tested for accuracy and verified.

Disclaimer

  • Note: Please use your critical reading skills while reading entries. No warranties, implied or actual, are granted for any legal or AI information obtained while using these pages. Check with your librarian for more contextual, accurate information.