AI Copyright Issues in 2026: Ethics & Best Practices

The definitive 2026 guide to AI and copyright: training data, generative outputs, fair use, TDM exceptions, major lawsuits, and licensing best practices.

Misar Team·Jun 26, 2025·5 min read

Quick Answer

AI copyright law in 2026 is being shaped by dozens of pending lawsuits worldwide. Training on copyrighted works without a licence may be permitted under specific exceptions (US fair use, the EU text and data mining exceptions, UK Section 29A), but outputs that are substantially similar to training works can still infringe.

  • Training is NOT automatically infringement — and it is NOT automatically fair use
  • Generative outputs can infringe if substantially similar to protected works
  • The US Copyright Office holds that purely AI-generated works lack human authorship and therefore cannot be registered

Copyright questions in AI span three stages:

  • Input (training data collection and use)
  • Model (can a trained model itself infringe?)
  • Output (is a generated work a derivative?)

Key authorities are the US Copyright Office (Copyright and Artificial Intelligence report: Part 1 July 2024, Part 2 January 2025, Part 3 pre-publication version May 2025), the UK IPO, the EU Copyright Directive Articles 3 and 4 (TDM exceptions), and Article 30-4 of Japan's Copyright Act.

Key Details / Requirements

Major Pending Lawsuits (Selected)

| Case | Plaintiffs | Defendants | Filed | Core Issue |
|---|---|---|---|---|
| New York Times v. OpenAI & Microsoft | NYT | OpenAI, Microsoft | Dec 2023 | Training and verbatim memorisation |
| Andersen v. Stability AI | Artists | Stability AI | 2023 | Training on artworks |
| Getty Images v. Stability AI (US + UK) | Getty | Stability AI | 2023 | Training on Getty library |
| Authors Guild v. OpenAI | Authors | OpenAI | 2023 | Novels in training data |
| Concord Music v. Anthropic | Publishers | Anthropic | 2023 | Song lyrics |
| Bartz v. Anthropic | Authors | Anthropic | 2024 | Books in training (settled September 2025 for USD 1.5B) |

Global TDM and Fair-Use Regimes

| Jurisdiction | Rule | Opt-Out Allowed? |
|---|---|---|
| USA | Fair use (17 USC 107) | N/A |
| EU | Copyright Directive Art. 3 (research) and Art. 4 (commercial) | Yes for Art. 4 via machine-readable opt-out |
| UK | Sec 29A CDPA (non-commercial TDM only) | N/A |
| Japan | Art. 30-4 Copyright Act (non-enjoyment exception) | No |
| Singapore | Computational Data Analysis (Sec 244 Copyright Act 2021) | No |

Real-World Examples / Case Studies

Bartz v. Anthropic (2025): The first major AI training settlement, a USD 1.5 billion class-action settlement over books used in training. Judge Alsup had earlier ruled that training on lawfully acquired copies was transformative fair use, but allowed the claims over pirated copies to proceed.

New York Times v. OpenAI (ongoing): The federal complaint alleges that GPT-4 reproduces Times articles verbatim and competes with the Times' own business.

Getty Images v. Stability AI (UK): The High Court trial concluded in 2025 with a partial win for Getty on trademark grounds.

US Copyright Office, Zarya of the Dawn (2023): For this comic authored by Kris Kashtanova, the text and arrangement were protected, but the Midjourney-generated images were denied registration.

What This Means for AI Teams

In 2026, AI teams should:

  • License training data whenever practical (Getty, Shutterstock, Reuters have all signed licensing deals)
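  • Implement training-data provenance records (these support the copyright-compliance policy required by EU AI Act Art. 53(1)(c) and the public training-content summary required by Art. 53(1)(d))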
  • Respect robots.txt signals and TDM opt-outs (EU Copyright Directive)
  • Add output filters for memorisation and near-duplicate generation
  • Indemnify customers against third-party copyright claims (as Adobe, Microsoft, Google, OpenAI now do for enterprise customers)
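
As a rough illustration of the output-filter item above, the following Python sketch flags a generated text whose word n-grams overlap heavily with a known reference text. The function names, the 5-word window, and the 0.5 threshold are all hypothetical choices for illustration. N-gram overlap is a simple engineering heuristic, not the legal test for substantial similarity.

```python
def shingles(text: str, n: int = 5) -> set[tuple[str, ...]]:
    """Return the set of lower-cased word n-grams in the text."""
    words = text.lower().split()
    # range() is empty when the text has fewer than n words
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_ratio(output: str, reference: str, n: int = 5) -> float:
    """Fraction of the output's n-grams that also appear in the reference."""
    out = shingles(output, n)
    if not out:
        return 0.0
    return len(out & shingles(reference, n)) / len(out)


def is_near_duplicate(output: str, reference: str,
                      threshold: float = 0.5, n: int = 5) -> bool:
    """Flag the output when its overlap ratio meets the (assumed) threshold."""
    return overlap_ratio(output, reference, n) >= threshold
```

In practice a filter like this would run against an index of protected or high-risk texts rather than a single reference, and the threshold would need tuning on real data.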

Compliance Checklist

  • Publish a training-data sources document
  • Honour machine-readable opt-outs (robots.txt, TDM Reservation Protocol, C2PA)
  • License copyrighted datasets where feasible
  • Build memorisation tests into evaluation pipelines
  • Offer customer IP indemnification where commercially appropriate
  • For deployers: record prompts and outputs to demonstrate non-infringement
  • Track ongoing cases and US Copyright Office guidance
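
A training-data sources document is easier to publish if provenance is captured at ingestion time. The sketch below shows one possible shape for such a record in Python. The field names and the `make_record` and `to_jsonl` helpers are hypothetical, and no regulation prescribes this schema.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ProvenanceRecord:
    source_url: str        # where the item was obtained
    sha256: str            # content hash, so the exact bytes can be identified later
    licence: str           # e.g. "CC-BY-4.0", "licensed", "unknown" (assumed labels)
    opt_out_checked: bool  # whether machine-readable opt-outs were checked
    retrieved_at: str      # ISO 8601 timestamp supplied by the caller


def make_record(content: bytes, source_url: str, licence: str,
                opt_out_checked: bool, retrieved_at: str) -> ProvenanceRecord:
    """Build a record, hashing the raw content with SHA-256."""
    return ProvenanceRecord(
        source_url=source_url,
        sha256=hashlib.sha256(content).hexdigest(),
        licence=licence,
        opt_out_checked=opt_out_checked,
        retrieved_at=retrieved_at,
    )


def to_jsonl(record: ProvenanceRecord) -> str:
    """Serialise one record as a single JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)
```

Appending one such line per ingested item gives an audit trail that can later be summarised by source and licence.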

FAQs

Q: Is AI training automatically fair use?

No — each case is fact-specific. Several US courts have found training on lawfully acquired data transformative, but not all.

Q: Can AI-generated images be copyrighted?

Only if a human author contributes sufficient creative expression. Pure AI output is not protectable under US Copyright Office policy.

Q: What is the TDM opt-out?

EU Copyright Directive Article 4 allows commercial TDM unless rightholders reserve their rights in a machine-readable form.
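
One machine-readable form is an HTTP response header. The Python sketch below reads the `tdm-reservation` and `tdm-policy` headers described in the TDM Reservation Protocol (TDMRep), a W3C Community Group report. The function names are hypothetical, and the header semantics assumed here (a value of `1` means rights are reserved) should be checked against the current specification before relying on them.

```python
from typing import Mapping, Optional


def _normalise(headers: Mapping[str, str]) -> dict:
    """Lower-case header names, since HTTP header names are case-insensitive."""
    return {name.lower(): value.strip() for name, value in headers.items()}


def tdm_rights_reserved(headers: Mapping[str, str]) -> bool:
    """True when the response carries `tdm-reservation: 1` (assumed semantics)."""
    return _normalise(headers).get("tdm-reservation") == "1"


def tdm_policy_url(headers: Mapping[str, str]) -> Optional[str]:
    """Return the URL from the `tdm-policy` header if present, else None."""
    return _normalise(headers).get("tdm-policy")
```

A missing header does not prove that no reservation exists, because rightholders may express it through other channels such as site terms or metadata.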

Q: Does robots.txt count as a TDM opt-out?

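It is widely treated as one, but the question is not conclusively settled. Tools such as Cloudflare's AI Crawl Control and the W3C Community Group's TDM Reservation Protocol build on or complement robots.txt, yet no single mechanism has been formally standardised as the legal opt-out.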
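
A crawler can check a robots.txt signal before fetching a page. The Python sketch below uses the standard library's `urllib.robotparser` on robots.txt text that has already been downloaded. The `crawler_allowed` helper, the `ExampleAIBot` user agent, and the example rules are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks one AI crawler and allows everyone else
EXAMPLE_ROBOTS = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def crawler_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With the example rules, `crawler_allowed(EXAMPLE_ROBOTS, "ExampleAIBot", "https://example.com/articles/1")` is False, while the same call for any other user agent is True.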

Q: Are lyrics protected differently?

Partly. A musical composition has a copyright separate from any sound recording of it. Lyrics are protected as part of the musical work in the US and as literary works in the UK.

Q: What penalties apply?

US: statutory damages of up to USD 150,000 per work for wilful infringement. EU: remedies vary by member state.

Q: Can I train on public-domain data only?

Yes — but quality and coverage are usually insufficient for state-of-the-art models.

Conclusion

AI copyright is among the most unsettled areas of AI law. Teams that document provenance, license data, and indemnify customers will be best placed to weather the lawsuits.

Audit your training data with Misar AI's copyright provenance toolkit.
