In partnership with

Hey there! 👋

Welcome back to SavvyMonk, your one-stop shop for AI and tech news that actually matters.

A team of researchers just published a paper that could change the outcome of every major AI copyright lawsuit currently working through the courts. It's called Alignment Whack-a-Mole, and its findings are as damaging as the name suggests.

Let's get into it.

The Tech newsletter for Engineers who want to stay ahead

Tech moves fast. Still playing catch-up?

That's exactly why 200K+ engineers working at Google, Meta, and Apple read The Code twice a week.

Here's what you get:

  • Curated tech news that shapes your career - Filtered from thousands of sources so you know what's coming 6 months early.

  • Practical resources you can use immediately - Real tutorials and tools that solve actual engineering problems.

  • Research papers and insights decoded - We break down complex tech so you understand what matters.

All delivered twice a week in just 2 short emails.

TODAY'S DEEP DIVE

For the past two years, companies like OpenAI and Google have been fighting a wave of copyright lawsuits from authors, news publishers, and music rights holders. The numbers are striking. A single consolidated case against OpenAI in New York bundles together 16 separate lawsuits. Across all U.S. federal courts, over 70 such cases are pending right now.

And yet no AI company has lost a major fair use ruling. Not one.

The reason comes down to a single legal argument they've all been making. Yes, they admit, copyrighted books and articles were used during training. But the models don't store that content, they say. The text was used to teach the model patterns, like grammar, reasoning, and style, and then it was gone. On top of that, they point to safety systems built to stop the model from ever reproducing protected text word-for-word.

One of those systems is called RLHF (Reinforcement Learning from Human Feedback). Think of it like training a dog. Human reviewers rate thousands of model responses, and the model is retrained to produce more of what scored well and less of what didn't. Over time, it learns to refuse certain outputs entirely. Reproducing a copyrighted book is one of them.
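To make the dog-training analogy concrete, here's a toy sketch of the core RLHF idea in Python, with made-up responses and numbers: shift the model's output probabilities toward responses human reviewers rated highly. Real RLHF trains a separate reward model and runs policy optimization over billions of parameters; this sketch only illustrates the key point, which is that the suppression changes output probabilities, not what's stored in the weights.

```python
import math

# Toy model: a probability distribution over two candidate responses.
# Responses, probabilities, and rewards below are illustrative assumptions.
responses = {
    "summarizes the plot in its own words": 0.5,
    "reproduces the book verbatim":         0.5,
}
human_reward = {
    "summarizes the plot in its own words": 1.0,   # reviewers approve
    "reproduces the book verbatim":        -1.0,   # reviewers penalize
}

def rlhf_step(probs, rewards, lr=1.0):
    """Shift probability mass toward higher-reward responses
    (an exponentiated-reward update, policy-gradient-like in spirit)."""
    new = {r: p * math.exp(lr * rewards[r]) for r, p in probs.items()}
    total = sum(new.values())
    return {r: v / total for r, v in new.items()}

for _ in range(5):
    responses = rlhf_step(responses, human_reward)

# After a few updates the verbatim response is strongly suppressed,
# but note: nothing was deleted from the model's "memory" --
# only the probability of surfacing it changed.
```

That last comment is the whole story of the paper: the update makes the bad output rare, not gone.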

Courts have found this argument plausible enough to keep the cases in limbo. No decisive ruling. No clear winner.

A paper published in March 2026 argues that this defense is technically false.

What the Researchers Actually Did

The paper, titled Alignment Whack-a-Mole, comes from researchers at Stony Brook University, Carnegie Mellon University, and Columbia Law School. That last affiliation matters. Jane C. Ginsburg, one of the co-authors, is a leading copyright law scholar. This wasn't built to be a pure machine learning paper. It was built to have legal consequences.

Jane Ginsburg, American lawyer | Samuel Sánchez

The experiment itself is almost boring in its simplicity. The team took three production-grade AI models, GPT-4o, Gemini 2.5 Pro, and DeepSeek-V3.1, and fine-tuned them on a single task: expanding plot summaries into full prose text. This is the kind of task a writing assistant or a story-drafting tool might do every day. It isn't a jailbreak. It isn't a clever adversarial attack. It's a commercially ordinary operation.

After fine-tuning, they prompted the models using only semantic descriptions of book scenes. No actual book text was included in the prompts. And yet, across all three models, the output was verbatim copyrighted text at a scale that would be hard to explain away. The fine-tuned models reproduced between 85 and 90 percent of held-out copyrighted books by word coverage, in some cases producing single continuous passages exceeding 460 words, word for word.
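A word-coverage metric like the one the paper reports can be sketched roughly as follows. This is a simplified stand-in, not the authors' exact procedure: count a book word as "covered" if it sits inside a sufficiently long verbatim run of words that also appears in the model's output.

```python
def word_coverage(book: str, output: str, min_run: int = 8) -> float:
    """Fraction of the book's words that fall inside verbatim runs of at
    least `min_run` consecutive words also present in the model output.
    Simplified illustration; the paper's exact metric may differ."""
    book_words = book.split()
    out_words = output.split()

    # Collect every min_run-word sequence that appears in the output.
    out_ngrams = set()
    for i in range(len(out_words) - min_run + 1):
        out_ngrams.add(tuple(out_words[i:i + min_run]))

    # Mark book positions covered by a matching verbatim run.
    covered = [False] * len(book_words)
    for i in range(len(book_words) - min_run + 1):
        if tuple(book_words[i:i + min_run]) in out_ngrams:
            for j in range(i, i + min_run):
                covered[j] = True
    return sum(covered) / max(len(covered), 1)
```

By this kind of measure, paraphrase scores near zero, while the 460-plus-word verbatim passages the researchers observed drive coverage sharply upward.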

The models weren't learning the text during fine-tuning. The text was already in the weights from pretraining. The fine-tuning task simply turned off the filter that had been hiding it.

The Murakami Finding Is the One That Stings

The cross-author generalization result is where the paper gets genuinely unsettling. The researchers fine-tuned GPT-4o exclusively on the novels of Haruki Murakami.

Kafka on the Shore - a novel by Haruki Murakami

They then tested whether the model would reproduce text from completely unrelated authors in different genres. It did. Fine-tuning on Murakami's work unlocked verbatim recall of copyrighted books from over 30 other authors.

This rules out the argument that a model reproduces an author's text because it was specifically fine-tuned on that author's content. The text from those 30-plus writers was already embedded in the model's weights from the original pretraining. The fine-tuning on Murakami simply lowered the threshold that was suppressing it.

The researchers also tested fine-tuning on Virginia Woolf novels that are in the public domain. The cross-author leakage appeared there too. When they fine-tuned on purely synthetic AI-generated text, the effect nearly disappeared. The conclusion is hard to contest: fine-tuning on any real author's work, even a public domain one, reactivates memorized content from pretraining across the board.

All Three Models Memorized the Same Books in the Same Places

Perhaps the most legally significant finding in the paper is this one. The researchers measured which portions of each book each model memorized, and then compared the results across all three providers.

GPT-4o, Gemini 2.5 Pro, and DeepSeek-V3.1 memorized the same books in the same regions, with a correlation of r ≥ 0.90. That's not a quirk of any one company's training choices. It points to an industry-wide pattern, almost certainly reflecting shared use of the same pirated book datasets during pretraining.
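For readers curious what that r ≥ 0.90 figure means operationally: score each region of a book (say, per chapter) for how heavily each model memorized it, then compute the Pearson correlation between the two models' score vectors. A minimal sketch, using hypothetical per-chapter scores rather than the paper's data:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two models' per-region memorization scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-chapter coverage scores for two models on the same book.
# High correlation means they memorized the same chapters heavily and the
# same chapters lightly -- the paper's signature of shared training data.
model_a = [0.90, 0.10, 0.80, 0.20, 0.70]
model_b = [0.85, 0.15, 0.90, 0.10, 0.75]
```

If the two models had memorized independently, these vectors would be roughly uncorrelated; a correlation near 1.0 is what you'd expect if both saw the same source texts.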

The paper's authors note that nearly every frontier LLM is believed to have been trained on content from LibGen, a well-known pirated book repository that has been cited in multiple lawsuits. The r ≥ 0.90 correlation in memorization across three different companies' models, trained independently, is consistent with all three having seen the same underlying data.

Why This Paper Is Different From Prior Research

Researchers have demonstrated AI memorization before. But prior work mostly focused on extracting memorized content from baseline, unaligned models, or on using aggressive, clearly adversarial techniques. The legal response from AI companies has always been: yes, a raw model might regurgitate training data, but our aligned, production-grade systems have safeguards in place to prevent that.

The Alignment Whack-a-Mole paper directly targets that rebuttal. It tests the aligned, production-grade versions of GPT-4o, Gemini 2.5 Pro, and DeepSeek-V3.1. The fine-tuning task used is one that legitimate businesses build products around today. The researchers frame the core conclusion plainly: alignment is not preventing memorization. It's managing when memorization becomes visible. And fine-tuning changes that visibility threshold in a way that any developer with API access could replicate.

The paper's title explains itself. You can swat verbatim text reproduction at the surface with RLHF and system prompts, but the memorized text stays underneath. Fine-tune the model on a benign task and you move the mole to a different hole.

The Bottom Line

AI companies have built their entire legal defense on the claim that alignment and safety guardrails are a reliable barrier between what their models learned and what they can output. This paper is a technically rigorous demonstration that the barrier isn't reliable.

It was written with a copyright scholar as a co-author. It is sitting in courts' peripheral vision at exactly the moment when major fair use rulings are expected in 2026. The researchers themselves put it plainly: as long as copyrighted works remain in pretraining data and models remain fine-tunable, the pathway from memorization to extraction stays open. That's not a problem that better output filters can solve.

AI PROMPT OF THE DAY

Category: Legal Research and Analysis

"I'm trying to understand how a recent technical paper might affect an ongoing legal case. The paper is [Paper Title] and the case involves [Company] being sued for [Claim]. Summarize the key technical findings of the paper in plain English, then explain how a plaintiff's lawyer might use those findings to challenge the defendant's existing legal defense. Keep it accessible to a non-lawyer."

ONE LAST THING

The Alignment Whack-a-Mole paper is still a preprint under review. It hasn't been through formal peer review yet, and AI companies will contest its framing hard. But the technical demonstration is concrete, reproducible, and built with legal consequences in mind. The more interesting question isn't whether this changes the law immediately. It's whether the lawyers already in discovery on 70-plus copyright cases are reading it this week. Hit reply, I read every response.

See you in the next one.

— Vivek

P.S. If you know someone following the AI copyright battles, they'll want to read this one. They can subscribe at https://savvymonk.beehiiv.com/
