TL;DR
- Launch: Cursor launched Composer 2, a code-only model built for multi-file edits, refactoring, and long-running coding tasks inside its editor.
- Benchmarks: Cursor says Composer 2 posted competitive coding benchmark scores against Claude Opus 4.6, though GPT-5.4 still led some tests.
- Pricing: Cursor priced Composer 2 below rival models, positioning near-frontier coding performance as a cheaper option for heavy developer workloads.
- Strategy: The launch helps Cursor reduce dependence on OpenAI and Anthropic while protecting margins for a business with over 1 million daily users.
Cursor’s earlier Composer effort hinted that the company wanted tighter control of its AI stack. This week it pushed further with Composer 2, a code-only model that Cursor says can challenge OpenAI and Anthropic on coding tasks at a fraction of the cost. The release puts an in-house model at the center of Cursor’s editor strategy.
Just as notably, Cursor is leaning into specialization. Cofounder and research lead Aman Sanger says Composer 2 “won’t help you do your taxes” and “won’t be able to write poems.” In other words, Cursor is betting focus, not breadth, will help it compete more directly with larger rivals.
How Composer 2 Is Different
Composer 2 is now available inside the Cursor editor. The model is designed for multi-file edits, code generation, refactoring, and long task chains that can run across hundreds of actions. Rather than launch another broad frontier model, Cursor is trying to deepen the AI assistance developers already use inside its editor.
That product framing carries into the training story. The model was trained only on code data. It offers a 200,000-token context window and is optimized to handle CLI coding tasks.
That matters because Composer 2 looks less like a standalone platform play and more like a deeply integrated product layer. Cursor’s earlier web app expansion already showed it was moving beyond a classic IDE surface. Composer 2 extends that push by giving the company a model it can tune for its own environment instead of only routing users to OpenAI and Anthropic systems.
Training for Long-Horizon Coding Work
Under the hood, Cursor says Composer 2’s quality gains come from stronger continued pretraining followed by reinforcement learning on long-horizon coding work. The goal is to keep the model effective on larger jobs that span files, tools, and repeated edits. That is where many coding assistants still lose context or drift.
Composer 2 uses a training technique Cursor calls self-summarization. “We trained Composer for long-horizon tasks through a reinforcement learning process called self-summarization,” Cursor wrote, adding that the method lets the model learn from trajectories longer than its maximum context window.
In practice, the technique is meant to ease the usual context bottleneck. When a training run exceeds the context limit, the model compacts its history into a working summary and continues from there; Cursor says this cuts compaction errors by 50% versus older methods.
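Cursor has not published implementation details beyond its blog post, but the core idea of handling trajectories longer than a context window by periodically compacting history into a working summary can be sketched in a few lines. Everything below (the naive word-count token proxy, the first-word summarizer, the tiny limit) is a hypothetical illustration of context compaction in general, not Cursor’s actual RL pipeline:

```python
# Illustrative sketch of context compaction for long-horizon trajectories.
# All names and the toy summarizer are hypothetical; Cursor's actual
# self-summarization training method is not public.

CONTEXT_LIMIT = 20  # stand-in for a real token budget (e.g. 200,000 tokens)

def token_count(messages):
    """Crude proxy: one 'token' per whitespace-separated word."""
    return sum(len(m.split()) for m in messages)

def summarize(messages):
    """Naive compaction: keep only the first word of each step.
    A real system would have the model write its own working summary."""
    return "SUMMARY: " + " ".join(m.split()[0] for m in messages)

def run_trajectory(steps, limit=CONTEXT_LIMIT):
    """Replay a long trajectory, compacting whenever the working
    context exceeds the limit. Returns the final working context."""
    context = []
    for step in steps:
        context.append(step)
        if token_count(context) > limit:
            # Compact everything so far into one summary message,
            # freeing room for the trajectory to continue.
            context = [summarize(context)]
    return context

# A 12-step trajectory far larger than the toy context budget.
trajectory = [f"edit file_{i} apply patch and run tests" for i in range(12)]
final = run_trajectory(trajectory)
print(len(final), final[-1].startswith("SUMMARY:"))  # prints: 1 True
```

The point of the sketch is the shape of the loop, not the summarizer: because the working context is repeatedly collapsed, the effective trajectory length is unbounded even though the context never exceeds the budget, which is the property Cursor’s description attributes to self-summarization.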
Benchmarks and Pricing Set Up the Challenge
On Cursor’s own CursorBench benchmark, Composer 2 scored 61.3, up from 44.2 for Composer 1.5 and 38.0 for Composer 1. In comparison, Claude Opus 4.6 scored 58.2 and GPT-5.4 Thinking scored 63.9. Cursor is therefore presenting Composer 2 as competitive with top coding models, not as the clear leader in every test.
Cursor also says in the same benchmark post that Composer 2 scored 61.7 on Terminal-Bench 2.0, with Claude Opus 4.6 at 58.0 in the Claude Code harness and GPT-5.4 at 75.1.
On SWE-bench Multilingual, Cursor says the model reached 73.7, behind Claude Opus 4.6 at 77.8 but ahead of older Composer releases. Those results put Composer 2 into the same conversation as Claude Opus 4.6, even if GPT-5.4 still leads on some measures.
| Model | CursorBench | Terminal-Bench 2.0 | SWE-bench Multilingual |
|---|---|---|---|
| Composer 2 | 61.3 | 61.7 | 73.7 |
| Composer 1.5 | 44.2 | 47.9 | 65.9 |
| Composer 1 | 38.0 | 40.0 | 56.9 |
Cursor says that its Terminal-Bench measurements used the official Harbor framework with default settings averaged across five runs. Methodology aside, the larger message is commercial as much as technical: Composer 2 appears close enough to top-tier systems that pricing may matter as much as leaderboard position.
According to Cursor, the standard Composer 2 model costs $0.50 per million input tokens and $2.50 per million output tokens, while Composer 2 Fast, the default tier, costs $1.50 and $7.50 respectively. Cursor puts the pricing gap at about 86% versus Composer 1.5. That is the sharper claim behind the benchmark table: near-frontier coding performance may be available at materially lower cost.
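Taking Cursor’s quoted rates at face value, per-workload cost is simple arithmetic. The monthly token volumes below are made up purely for illustration, not measured usage figures:

```python
# Back-of-the-envelope cost check using the per-million-token rates
# quoted above. The 30M-input / 5M-output monthly workload is a
# hypothetical example, not a real usage figure.

def monthly_cost(input_tokens, output_tokens, in_rate, out_rate):
    """Cost in dollars; rates are dollars per million tokens."""
    return (input_tokens / 1e6) * in_rate + (output_tokens / 1e6) * out_rate

workload = (30_000_000, 5_000_000)  # input, output tokens per month

standard = monthly_cost(*workload, in_rate=0.50, out_rate=2.50)
fast = monthly_cost(*workload, in_rate=1.50, out_rate=7.50)

print(f"standard: ${standard:.2f}")  # 30 * 0.50 + 5 * 2.50 = $27.50
print(f"fast:     ${fast:.2f}")      # 30 * 1.50 + 5 * 7.50 = $82.50
```

At these rates the same workload costs three times as much on the Fast tier, which is why the standard-tier pricing is where the cost argument against rival models rests.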
Why Cursor Needs This Bet to Work
The strategy looks more significant when paired with Cursor’s scale. Cursor now has 1 million daily users and around 50,000 enterprise customers. At that size, even small inference savings can have a meaningful financial impact.
Cost pressure sits at the center of the move. Cursor says a Claude Code subscription priced at $200 a month can translate into roughly $5,000 in compute costs. Cursor also says consumer subscriptions run at negative margins while enterprise contracts support the business.
Seen that way, a cheaper in-house coding model is more than a branding exercise. It could help protect gross margins while reducing dependence on suppliers that also compete with Cursor. That business logic helps explain why the company is pushing so hard to own more of its stack.
The fundraising backdrop raises the stakes further. A June 2025 funding round valued Anysphere at about $9.9 billion. Valuation talks later pointed to a $29.3 billion valuation last November and discussions around a roughly $50 billion round. That progression helps explain why investors may want proof that Cursor can control more of the stack instead of merely reselling access to models built elsewhere.
Composer 2 is being launched by a company already operating at meaningful scale rather than testing a side project.
That dynamic also sharpens the competitive tension. OpenAI is part of Cursor’s cap table through an investor tie, even as Cursor tries to compete more directly with OpenAI and Anthropic in coding tools. A specialized internal model gives Cursor another lever if supplier pricing, rate limits, or product direction become less favorable.
Where Composer 2 Fits in Cursor’s Broader Shift
Cursor shipped its first AI coding product, a coding assistant, in 2023.
Composer 2 is the third Composer release in five months and the first to use continued pretraining rather than only reinforcement learning on top of an existing base model. Given that cadence, Cursor looks increasingly like a company moving from model wrapper toward model builder.
Distribution remains part of the plan. Composer 2 is currently available as an early alpha inside the main Cursor AI code editor. No separate platform rollout is currently planned.
That choice keeps the product tied closely to Cursor’s own workflow. It also suggests the company sees control of the environment as nearly as important as control of the base model. In that sense, Composer 2 is both a model launch and a distribution strategy.
Even so, Cursor is not cutting itself off from outside providers. Cursor still integrates multiple AI models, including those from OpenAI and Anthropic. Owning a differentiated in-house option could therefore give it more leverage if outside model pricing or product priorities shift.
Looking ahead, Cursor’s launch may end up looking less like the main event than the opening move. The harder test is whether a company still integrated with OpenAI and Anthropic can use specialization, tighter product integration and lower costs to keep developers inside its ecosystem as the market grows more crowded.

