Anthropic launches Claude Opus 4.5, claiming it’s the world’s best model for coding


Today, Anthropic announced Claude Opus 4.5, its newest frontier model focused on coding, agents, and computer use. The company claims that Opus 4.5 is the best coding model in the world, citing the SWE-bench Verified benchmark, and that it performs meaningfully better than Sonnet 4.5 on other real-world tasks such as deep research and working with slides and spreadsheets.

[Image: Claude Opus 4.5 benchmark table]

As the table above shows, Opus 4.5 scores a record 80.9% on SWE-bench Verified, beating the recently released Gemini 3.0 and GPT-5.1-Codex-Max models. Anthropic also gives its engineering candidates a difficult take-home exam, which it uses as an internal benchmark for new models. The company highlighted that Claude Opus 4.5 scored higher than any human candidate ever has within the prescribed 2-hour time limit.

Opus 4.5 is available now in all Claude apps, via the API, and on all three major cloud platforms (Azure, GCP, and AWS). Anthropic has also reduced its Claude API pricing: the new frontier-class model costs $5 per million input tokens and $25 per million output tokens, making Opus-level models accessible to more users.
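To make the pricing concrete, here is a minimal sketch of the arithmetic. It assumes "$5/$25" means $5 per million input tokens and $25 per million output tokens, and it ignores any discounts or surcharges the article does not mention. The function name and the sample token counts are hypothetical, chosen only for illustration.

```python
# Prices quoted in the article, in USD per million tokens.
# Assumption: the first figure is for input tokens, the second for output tokens.
INPUT_PRICE_PER_MTOK = 5.00
OUTPUT_PRICE_PER_MTOK = 25.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the listed per-million-token prices."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MTOK
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MTOK
    return input_cost + output_cost


# Hypothetical request: 20,000 input tokens and 4,000 output tokens.
print(f"${estimate_cost_usd(20_000, 4_000):.2f}")
```

Because output tokens cost five times as much as input tokens at these rates, a model that produces fewer output tokens for the same task lowers the bill more than the headline prices alone suggest.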

In addition to improving performance, Anthropic has made the model more efficient. Claude Opus 4.5 uses dramatically fewer tokens than its predecessors, including Opus 4.1, to achieve the same or better results. In practice, the model does less backtracking, less redundant exploration, and less verbose reasoning. For example, at Medium reasoning effort, Opus 4.5 can match Sonnet 4.5’s best SWE-bench Verified score while using 76% fewer output tokens. At High reasoning effort, Opus 4.5 beats Sonnet 4.5 by 4.3 percentage points while using 48% fewer tokens.

Following in OpenAI’s footsteps, the Claude API now supports a reasoning effort parameter that lets developers trade off speed against thinking capability. Finally, with Opus 4.5, Claude Code can build more accurate plans and execute them more thoroughly. It can also ask clarifying questions upfront and then build an editable plan.md file before executing.




