According to Reuters Legal, in August 2025 Perplexity AI failed to convince a judge to dismiss a lawsuit over its alleged misuse of articles to train its AI...
Caveat: In libraries, our focus regarding AI is generally on developing a basic understanding of it, recognizing the ethical dilemmas its usage raises, examining how tools and technologies align or conflict with our core values, and adopting responsible practices. Other goals for library and information professionals include: a) developing AI literacy skills; b) helping users understand AI and the ethical challenges it can raise; and c) promoting responsible, ethical uses (and even non-use) of AI. For some, copyright infringement precludes the use of any tools produced by generative AI companies. These positions should be debated and ultimately respected on both sides.
The aim here is to assist library and information professionals in understanding the basic issues surrounding AI copyright infringement and related legal actions. Check with your scholarly communications and/or copyright librarian for more contextual, accurate information.
AI copyright lawsuits primarily involve claims of infringement arising from the use of copyrighted works as AI training data, along with allegations that AI-generated content infringes existing copyrights. Landmark cases include Authors Guild v. OpenAI, New York Times v. Microsoft and OpenAI, RIAA v. Suno AI, and the 2025 Anthropic settlement, in which Anthropic agreed to pay $1.5 billion to authors for pirating books to train its AI. The central issue is whether AI's use of copyrighted material falls under fair use, or fair dealing in Canada, a doctrine allowing unauthorized use of copyrighted works under certain conditions.
Speaking of Canada, the Copyright Act is intended to be technologically neutral, but Canadian courts will need to interpret and apply copyright law in relation to the training and use of AI products. Millions of copyrighted works are used to train generative AI tools, so there are copyright implications for users and academic librarians. Still, some academics using AI feel it enhances teaching, learning, and research, and believe it will make education more accessible for learners with diverse needs (UNESCO, 2024). The truth is that AI brings both benefits and risks in academia, so it is important to consider them critically (UNESCO, 2024), especially copyright infringement and intellectual property issues.
Government Consultations and Future Reforms in Canada
2021 Consultation: Deemed AI issues "premature"; focused on general framework.
2025 "What We Heard" Report (Feb. 2025): AI industry wants flexibility; creators demand opt-outs and royalties; no proposals yet. Roundtables continue.
Broader Context: Ties into Canada's AI strategy ($2.4B investment in 2024) and Voluntary Code of Conduct (2023). Bill C-27 (AIDA) regulates AI risks but not copyright directly. Reforms may align with U.S./EU models, potentially via 2026 amendments.
Copyright Developments Related to AI
Recent court decisions related to the use of copyrighted materials in AI training data are changing the way we view fair dealing in Canada and fair use in the United States. LLM training using copyrighted texts is now seen by at least one court as "fair" use, a major gain for AI developers. However, other courts disagree, and the use of pirated sources to train AI models remains problematic. There is still no final precedent establishing that all AI training is or is not fair, and legal viewpoints are still evolving. In May 2025, the US Copyright Office said, “These groundbreaking technologies should benefit both the innovators who design them and the creators whose content fuels them, as well as the general public.”
Librarians are asking AI developers about their use of copyrighted materials to train their models. They are also asking which AI-generated works receive copyright protection. Several copyright cases have been launched against AI companies. The Canadian and US governments have received considerable pressure from the public and industries to address the myriad legal and ethical issues related to AI. Many groups, including the Association of Research Libraries, advocate for the benefits of artificial intelligence in promoting creativity and innovation. Brigitte Vézina (2023) of Creative Commons said “copyright is just one lens through which to consider AI.”
Fourteen publishers are suing Cohere (November 18, 2025) for allegedly stealing their work to train AI that produced fake articles. Cohere says the lawsuit is “frivolous,” but whether or not the case moves forward, it underscores how urgently Canada needs real AI oversight.
The consortium of 14 publishing companies, including The Atlantic, Condé Nast, and Forbes, filed suit for copyright infringement, accusing Cohere of using their content for training without permission and of reproducing substantial portions, or entire articles, of their content.
Perplexity hit by copyright suit. Encyclopedia Britannica and Merriam-Webster are suing Perplexity AI, claiming it copied their material without permission for its "answer engine," which performs searches for users and summarizes the findings. The lawsuit further alleges that Perplexity hurt the reference companies' revenue by redirecting users to those summaries rather than to the companies' content. Perplexity (which reportedly just completed a funding round that raised its valuation to $20 billion) is facing a similar suit from Dow Jones and the New York Post.
... settlement of the Anthropic class action lawsuit has significant implications for librarians, particularly regarding copyright and the use of AI with copyrighted materials. US authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson alleged that Anthropic illegally downloaded seven million books from pirate websites such as LibGen and PiLiMi to train its AI chatbot, Claude, infringing on authors’ copyrights. The case raises critical issues for librarians.
...While AI offers transformative potential for library services, the risks of copyright infringement and financial liability are significant. Librarians must consider the broader implications of AI companies’ practices, which could reshape the publishing industry and affect library budgets for digital content. Librarians should consult legal experts when implementing AI tools that process copyrighted materials, should ensure that AI training data used in library systems is sourced legally through partnerships with publishers or licensed databases, and should avoid retaining unauthorized copies of works.
On August 7, 2025, the Library Copyright Alliance (LCA), with the Authors Alliance, Electronic Frontier Foundation, and Public Knowledge, filed a motion for leave to appeal the class certification order issued in the lawsuit against Anthropic over the use of books for AI training.
In June 2025, a US court ruled in favour of Anthropic in a major copyright case, deciding that training AI models constitutes fair use under US law. However, the ruling was more nuanced than a blanket endorsement. While the judge found AI training was fair use, Anthropic faces trial for allegedly using pirated books. The minimum statutory penalty for this type of copyright infringement is $750 per book. With 7 million titles in Anthropic’s pirated library, the company could face billions in potential damages.
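As a back-of-the-envelope illustration of the exposure described above, the statutory minimum can simply be multiplied across the alleged library. This is a hypothetical sketch: it assumes the $750 minimum would apply to every one of the seven million titles, which a court would not necessarily find.

```python
# Hypothetical minimum statutory damages exposure.
# Assumption: the $750-per-work US statutory minimum (17 U.S.C. § 504(c))
# applies to every title alleged to be in the pirated library.
MIN_DAMAGES_PER_WORK = 750       # US statutory minimum per infringed work
ALLEGED_PIRATED_TITLES = 7_000_000  # figure alleged in the Anthropic case

exposure = MIN_DAMAGES_PER_WORK * ALLEGED_PIRATED_TITLES
print(f"Minimum exposure: ${exposure:,}")  # → Minimum exposure: $5,250,000,000
```

Even at the statutory floor, the arithmetic yields over $5 billion, which is why commentators describe the potential damages as existential for AI companies.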
In May 2025, rulings favoured Meta and other AI companies even where training data came from unauthorized piracy websites, and the US Copyright Office signalled plans to weigh in on broader AI policy. This came after the US administration’s firing of Shira Perlmutter shortly after the Copyright Office published a report on how copyright law, and the fair use doctrine, should apply to the use of copyrighted works in training generative AI models. Perlmutter has filed a lawsuit arguing her dismissal was unconstitutional and violated the separation of powers.
In January 2025, the US Copyright Office provided guidance on works containing AI-generated material and reported on roughly 1,000 registered works containing some AI-generated material. This highlights how copyright has become the central arena where courts, infrastructure providers, and regulators are negotiating the future of cultural production in the age of generative AI.
2022-2024
In 2023 and 2024, OpenAI faced multiple lawsuits for alleged copyright infringement from authors and media companies whose work was used to train some of OpenAI's products. In November 2023, OpenAI's board removed Sam Altman as CEO, citing a lack of confidence in him, but reinstated him five days later following a restructuring of the board. Throughout 2024, roughly half of OpenAI's AI safety researchers left the company, citing its prominent role in an industry-wide problem.
A November 2022 class action lawsuit against Microsoft, GitHub and OpenAI alleged that GitHub Copilot, an AI-powered code editing tool trained on public GitHub repositories, violated the copyright of the repositories' authors, noting that the tool was able to generate source code which matched its training data verbatim, without providing attribution.
In January 2023 three US artists—Sarah Andersen, Kelly McKernan, and Karla Ortiz—filed a class action copyright infringement lawsuit against Stability AI, Midjourney, and DeviantArt, claiming these companies have infringed the rights of millions of artists by training AI tools on five billion images scraped from the web without the consent of the original artists.
In January 2023, Stability AI was sued by Getty Images for using 12 million images in their training data without purchasing a license. Getty filed another suit against Stability AI in the US in February 2023 alleging copyright infringement for use of Getty's images in the training of Stable Diffusion, and said the model infringes Getty's trademark by generating images with Getty's watermark.
In July 2023, authors Paul Tremblay and Mona Awad filed a lawsuit in a San Francisco court against OpenAI, alleging that its ChatGPT language model had been trained on their copyrighted books without permission, citing ChatGPT's "very accurate" summaries of their works as evidence.
The Authors Guild, on behalf of 17 authors, filed a copyright infringement complaint against OpenAI in September 2023, claiming "the company illegally copied the copyrighted works of authors" in training ChatGPT.
The New York Times sued Microsoft and OpenAI in December 2023, claiming their models were trained on NYT articles and that the fair use claims made by AI companies were invalid, since generated content based on news stories directly impacts the newspaper's commercial opportunities.
In June 2024, the Recording Industry Association of America (RIAA) and major music labels sued Suno AI and Udio, AI tools that generate songs with lyrics and backing music from text prompts, alleging that both models were trained without consent from the labels.
In September 2024, a German court dismissed a German photographer's lawsuit against the non-profit organization LAION for unauthorized reproduction of his copyrighted work while creating a dataset for AI training. The plaintiff has filed an appeal against the decision.
Note: Attorney Jonathan Band is counsel for the Library Copyright Alliance, and Cliff Lynch is Executive Director of the Coalition for Networked Information (CNI). The video is for informational purposes only; any claims should be tested for accuracy and verified.
Note: Please use your critical reading skills while reading entries. No warranties, express or implied, are granted for any health, medical, or AI information obtained while using these pages. Check with your librarian for more contextual, accurate information.