New York Times Sues Perplexity AI for Copyright Infringement and ‘Trademark Tarnishment’


TL;DR

  • The gist: The New York Times has sued Perplexity AI for copyright infringement and trademark dilution, targeting its “answer engine” business model.
  • Key details: The complaint reveals Perplexity spent $48 million on cloud services in 2024 but paid $0 for NYT content, despite a $20 billion valuation.
  • Why it matters: This lawsuit challenges the legality of Retrieval-Augmented Generation (RAG) and seeks to hold AI liable for “hallucinations” that damage brand reputation.
  • Context: The case joins a wave of litigation from publishers like Dow Jones and Reddit, escalating the conflict over fair use and AI search.

The New York Times has sued Perplexity AI, alleging the startup’s “answer engine” illegally copies millions of articles to create a substitute product that siphons traffic and revenue.

Filed in the U.S. District Court for the Southern District of New York, the federal complaint targets the core mechanics of Retrieval-Augmented Generation (RAG). This technique retrieves real-time data to ground AI responses. The Times argues this practice constitutes “massive copyright infringement” at both the input and output stages.

Beyond copyright claims, the lawsuit introduces a novel legal theory: trademark dilution by “tarnishment.” The publisher alleges Perplexity’s AI “hallucinations” (fabricated text falsely attributed to the newspaper) damage its reputation for accuracy.


The ‘Answer Engine’ on Trial: Attacking the RAG Model

Central to the complaint is the argument that Perplexity’s business model is fundamentally parasitic, built on fetching publishers’ content in real time and answering user queries in their place.

The Times outlines a two-pronged theory of copyright violation that targets both how the system gathers content (the “input” stage) and what it produces (the “output” stage). At the input stage, the lawsuit alleges that Perplexity’s web crawlers, specifically identified as ‘PerplexityBot’ and ‘Perplexity-User’, illegally harvest vast amounts of data.

This involves scraping content not just from the Times’ own domain but also from third-party platforms to construct an “AI-First” search index. This index serves as the real-time knowledge base that feeds the Large Language Models (LLMs) whenever a user asks a question.
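The retrieve-then-generate pipeline described above can be pictured in a few lines. The sketch below is a toy illustration only: a keyword-overlap retriever stands in for Perplexity’s proprietary search index, the document texts are invented placeholders, and the final LLM call is omitted.

```python
# Toy RAG sketch: retrieve the most relevant indexed documents,
# then assemble a grounded prompt for a language model.
# (Illustrative only; not Perplexity's actual implementation.)

def retrieve(query: str, index: dict[str, str], k: int = 2) -> list[str]:
    """Rank indexed documents by naive word overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(
        index.items(),
        key=lambda item: len(q_words & set(item[1].lower().split())),
        reverse=True,
    )
    return [doc_id for doc_id, _ in scored[:k]]

def build_prompt(query: str, index: dict[str, str]) -> str:
    """Ground the model's answer in retrieved text (the 'input' stage)."""
    context = "\n".join(index[doc_id] for doc_id in retrieve(query, index))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical mini-index standing in for the "AI-First" search index:
index = {
    "nyt-1": "times sues perplexity over copyright and trademark claims",
    "misc": "unrelated sports recap from the weekend",
}
print(build_prompt("perplexity copyright lawsuit", index))
```

The legal question raised by the complaint sits in exactly this loop: the “context” string is built from copied publisher text, and the generated answer is what reaches the user instead of the original page.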

The infringement allegedly extends to the “output” stage, where the AI generates its responses. The Times asserts that these summaries are often identical or substantially similar to the original reporting, rather than transformative new works.

The filing claims that to power these tools, Perplexity has effectively copied, distributed, and displayed millions of protected works, ranging from written articles to multimedia assets like videos and podcasts, without permission.

This dual-stage theory attempts to close a loophole often invoked by AI companies, which argue that ingesting data for training is transformative and that outputs are newly generated rather than copied. By targeting the retrieval mechanism itself, The Times is attacking the “answer engine” concept directly.

Unlike traditional search engines that drive traffic via links, the complaint alleges Perplexity explicitly markets itself as a replacement for the source. Citing Perplexity’s “Skip the Links” marketing campaign, the filing presents it as evidence of intent to disintermediate publishers.

The lawsuit frames this marketing strategy not as innovation, but as a deliberate attempt to bypass content creators:

“Perplexity provides commercial products to its own users that substitute for The Times, without permission or remuneration and, in fact, over The Times’s express and repeated objections.”

This substitution effect is quantifiable: referral data suggests AI answer engines send 95.7% less traffic to publishers than traditional search engines do. Such a business model, the lawsuit contends, destroys the economic incentive for the original journalism it relies upon.

Perplexity CEO Aravind Srinivas has previously described the company’s goal as providing direct answers rather than a list of websites. Defending the model, Srinivas argues that direct answers are a necessary evolution of search technology:

“The principle in Perplexity is you’re not supposed to say anything that you don’t retrieve, which is even more powerful than RAG because RAG just says, ‘Okay, use this additional context and write an answer.’”

Trademark Tarnishment: Liability for AI Hallucinations

In a significant expansion of legal strategy against AI, The Times is suing for trademark dilution by “tarnishment.” The publisher argues that when Perplexity attributes false information to the newspaper, it damages the brand’s reputation for accuracy.

This claim moves beyond intellectual property theft to address the reputational harm caused by generative errors.

The filing details a specific form of reputational harm rooted in the Lanham Act, arguing that Perplexity’s misuse of the Times’s brand goes beyond simple theft. The core of this argument is that the AI engine frequently generates “hallucinations” – fabricated text that never appeared in the newspaper – while simultaneously displaying the Times’s famous trademarks next to the falsehoods.

This false attribution, the publisher claims, tricks users into believing the errors are the result of the Times’s reporting.

Furthermore, the complaint alleges that even when the AI retrieves actual content, it often produces misleadingly incomplete summaries. By displaying these distorted versions alongside the Times’s logo without disclosing the omissions, Perplexity allegedly passes off inferior reproductions as the high-quality journalism associated with the brand.

Specifically, the complaint cites an instance where Perplexity allegedly fabricated a Wirecutter review recommending a product that had been recalled for safety reasons. Such errors are not merely technical glitches but, according to the lawsuit, actionable legal harms.

The lawsuit argues that these fabrications do more than misinform; they deceive users about the source of the information. By placing the Times’s trademarks alongside AI-generated “hallucinations” or heavily truncated summaries, Perplexity allegedly creates a “false designation of origin.”

The filing contends that this practice tricks readers into believing the newspaper explicitly endorsed, sponsored, or created the content, effectively trading on the Times’s credibility while transferring the blame for the AI’s errors onto the publisher’s reputation.

This legal theory attempts to hold AI companies liable for the quality of their output, not just the provenance of their training data. It leverages the Lanham Act to argue that false attribution is a form of brand damage.

Srinivas has previously claimed that citing sources is a mechanism to reduce such errors. When pressed on the issue of accuracy, he noted, “I suppose you’re saying you want to really stick to the truth that is represented by the human-written text on the internet? Correct.”

However, the lawsuit argues that citations alone do not absolve the platform of liability for generating false content.

The Economics of Extraction: $48M for Cloud, Zero for Content

In a rare disclosure of private financial data, the complaint reveals a significant disparity in Perplexity’s spending priorities. The filing paints a picture of a company with substantial operating costs but no content budget, relying entirely on the “free” labor of journalists:

“Indeed, according to a news report, in 2024 Perplexity spent $48M on cloud services, paid $19M for talent, and paid $8M to Anthropic and OpenAI to use their models yet paid The Times nothing for using Times Content to power its products.”

Highlighting this imbalance, the filing notes that while Perplexity pays model providers like Anthropic and OpenAI, it pays nothing to the publishers who provide the factual grounding. This economic structure is central to the lawsuit’s argument that Perplexity is “free-riding” on the investment of newsrooms.

The publisher characterizes this business model as an exploitative extraction of value:

“Perplexity’s latest valuation at $20 billion and success at raising funds of nearly $1.5 billion are indicative of the potentially massive illegal transfer of economic value from original content creators like The Times to Perplexity.”

With a valuation soaring to $20 billion and nearly $1.5 billion in venture capital funding, Perplexity embodies, according to the complaint, a substantial transfer of value from content creators to tech platforms. The Times contrasts this valuation with the financial reality of journalism, arguing that such appropriation is not innovation but theft.

Technical Warfare: Stealth Crawlers and Hard Blocks

Detailing a technical cat-and-mouse game, the lawsuit describes the struggle between the publisher’s security teams and the AI startup’s bots. The Times implemented a “hard block” of Perplexity’s declared crawler in November 2024, followed by a block of its user agent in July 2025.
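For context, a block of a declared crawler is conventionally signaled in a site’s robots.txt file, with server-level “hard blocks” as the enforcement backstop. An illustrative snippet (not The Times’s actual file) denying the two bots named in the complaint might look like:

```
User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /
```

robots.txt is advisory only; it works when the crawler chooses to honor it, which is why the complaint emphasizes the subsequent hard blocks and the alleged evasion of them.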

Despite these measures, the complaint alleges The Times logged over 175,000 access attempts from Perplexity in August 2025 alone. A spokesperson for the newspaper condemned the company’s disregard for explicit access denials:

“While we believe in the ethical and responsible use and development of AI, we firmly object to Perplexity’s unlicensed use of our content to develop and promote their products.”

The filing accuses Perplexity of using “stealth crawling” tactics, such as spoofing Google Chrome user agents and rotating IP addresses to evade detection. These allegations corroborate earlier findings by web security firm Cloudflare, which identified evidence of stealth crawling in mid-2025.
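The evasion the filing describes is easy to picture. A minimal sketch (the filter below is hypothetical, not The Times’s actual infrastructure) shows why matching declared crawler names in the User-Agent header fails against a spoofed browser string:

```python
# Hypothetical server-side filter: block requests whose User-Agent
# header declares one of the crawler names cited in the complaint.
BLOCKED_AGENTS = ("PerplexityBot", "Perplexity-User")

def is_blocked(user_agent: str) -> bool:
    """Return True if the request declares a blocked crawler name."""
    return any(bot in user_agent for bot in BLOCKED_AGENTS)

# A crawler that identifies itself is caught:
assert is_blocked("Mozilla/5.0 (compatible; PerplexityBot/1.0)")
# A "stealth" request spoofing a Chrome browser string sails through:
assert not is_blocked("Mozilla/5.0 (Windows NT 10.0) Chrome/126.0 Safari/537.36")
```

This is why detection efforts like Cloudflare’s rely on behavioral and network-level signals rather than the self-reported header alone.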

Cloudflare says it has already blocked 416 billion AI bot visits, highlighting the scale of the automated scraping problem facing the open web.

A Pattern of Conflict: From Amazon to Dow Jones

Marking the latest in a series of high-profile legal challenges, this lawsuit signals a coordinated industry crackdown on Perplexity. It follows similar copyright suits filed by News Corp (Dow Jones) and a lawsuit from Reddit, both alleging unauthorized scraping.

Perplexity’s communications team has framed the lawsuit as a historical inevitability for disruptive technologies. Jesse Dwyer, Head of Communications at Perplexity AI, stated:

“Publishers have been suing new tech companies for a hundred years, starting with radio, TV, the internet, social media and now AI. Fortunately, it’s never worked, or we’d all be talking about this by telegraph.”

However, the pressure is mounting from multiple fronts.

Amazon has also threatened legal action, sending a cease-and-desist letter over Perplexity’s “Comet” shopping agent. In a notable contrast, while suing Perplexity, The Times has licensed its content to Amazon for that company’s AI platforms.

Attempting to mitigate these conflicts, Perplexity launched a publisher revenue-sharing model, but major outlets have largely rejected it in favor of litigation or direct licensing with larger tech giants.

While Perplexity secured a partnership with Getty Images, the broader publishing industry remains hostile. The Times previously sued Microsoft and OpenAI in late 2023, making it one of the most prominent litigants in the battle over AI and copyright.


