Anthropic has introduced the discharge of Claude Sonnet 4.5, which it claims is the “finest coding mannequin on the earth” and the “strongest mannequin for constructing complicated brokers.”
It achieves a 77.2% on the SWE-bench for software program engineering, in comparison with 74.5% for Claude Opus 4.1 and 72.7% for Claude Sonnet 4. For exterior comparability, GPT-5 Codex scored at 74.5%, GPT-5 scored 72.8%, and Gemini 2.5 Professional scored 67.2%.
Moreover, it leads within the OSWorld benchmark, which checks AI fashions on real-world laptop duties. It scored 61.4% on that benchmark, beating out Claude Sonnet 4, which scored 42.2%.
“Sonnet 4.5 can produce near-instant responses or prolonged, step-by-step considering that’s made seen to the consumer,” Anthropic says.
In response to Anthropic, Claude Sonnet 4.5 additionally reveals higher domain-specific information and reasoning within the fields of finance, legislation, and drugs.
This mannequin performs higher on security and alignment evaluations, the corporate claims. It reveals a discount in behaviors corresponding to sycophancy, deception, power-seeking, and the tendency to encourage delusional considering, in addition to displaying progress on having the ability to defend towards immediate injection assaults.
The pricing for Claude Sonnet 4.5 is identical as Claude Sonnet 4’s pricing: $3 per million enter tokens and $15 per million output tokens.
Alongside the launch of Claude Sonnet 4.5, Anthropic additionally introduced updates throughout a number of of its merchandise. Claude Code now has checkpoints that enable builders to avoid wasting their progress and roll again to earlier variations. The Claude API acquired a brand new context enhancing function and reminiscence software that permits brokers to run longer and deal with extra complicated duties. Moreover, all Claude apps now have entry to code execution and file creation.
The corporate can be releasing the Claude Agent SDK, which builders can use to construct their very own brokers utilizing the identical infrastructure Anthropic makes use of to energy Claude Code.
“We constructed Claude Code as a result of the software we needed didn’t exist but. The Agent SDK offers you an identical basis to construct one thing simply as succesful for no matter downside you’re fixing,” Anthropic wrote in a weblog submit.
