AI5 views6 min read

Anthropic Releases Claude Sonnet 4.5 AI Model

Anthropic has released Claude Sonnet 4.5, a new AI model with enhanced coding, computer use, and reasoning capabilities, available now at the same price.

Kevin Bryant
By
Kevin Bryant

Kevin Bryant is a technology correspondent for Neurozzio, focusing on the intersection of software development, artificial intelligence, and business strategy. He reports on developer tools, APIs, and the companies building foundational AI infrastructure.

Author Profile
Anthropic Releases Claude Sonnet 4.5 AI Model

Anthropic has announced the release of Claude Sonnet 4.5, a new artificial intelligence model with advanced capabilities in coding, computer operation, and complex reasoning. The model is now accessible to developers through the Claude API and is accompanied by a suite of updates to the company's developer tools and applications.

Key Takeaways

  • Anthropic has launched Claude Sonnet 4.5, a new AI model available immediately via API at the same price as its predecessor.
  • The model demonstrates significant performance improvements in coding, computer use, and reasoning benchmarks.
  • New developer tools, including the Claude Agent SDK, have been released to allow others to build complex AI agents.
  • Enhanced safety protocols, under AI Safety Level 3, are in place, with improved alignment and reduced harmful behaviors.

Model Capabilities and Performance

Claude Sonnet 4.5 has shown notable improvements across several industry benchmarks. The company states it is a state-of-the-art model on the SWE-bench Verified evaluation, a test that measures real-world software coding abilities. In practical applications, the model has reportedly maintained focus on complex, multi-step tasks for over 30 hours.

The new model also leads in computer operation tasks. On the OSWorld benchmark, which evaluates an AI's ability to perform real-world tasks on a computer, Sonnet 4.5 achieved a score of 61.4%. This represents a significant increase from the 42.2% score held by the previous model, Claude Sonnet 4, just four months prior.

Benchmark Advancement

Claude Sonnet 4.5's score of 61.4% on the OSWorld benchmark is a nearly 20 percentage point increase over its predecessor, highlighting rapid advancements in the model's ability to interact with computer operating systems to complete tasks.

In addition to coding and computer use, Anthropic reports that experts in specialized fields such as finance, law, medicine, and STEM have observed improved domain-specific knowledge and reasoning in Sonnet 4.5 compared to earlier versions.

New Tools for Developers and Users

The release of Claude Sonnet 4.5 is accompanied by several upgrades to Anthropic's product ecosystem. These updates are designed to enhance functionality for both developers and general users.

Updates to Claude Products

For users of Claude Code, Anthropic has introduced a "checkpoints" feature. This allows users to save their progress and revert to a previous state if needed. The platform also features a refreshed terminal interface and a new native extension for the popular code editor, VS Code.

The Claude API has been updated with a new context editing feature and a memory tool. These additions are intended to enable AI agents to run for longer periods and manage more complex tasks. For users of the Claude web and mobile apps, new capabilities include direct code execution and the creation of files like spreadsheets, presentations, and documents within a conversation.

What is the Claude Agent SDK?

Anthropic is also releasing the Claude Agent SDK, which provides developers with the same foundational infrastructure the company uses for its own products like Claude Code. This toolkit is designed to help developers build their own sophisticated AI agents capable of handling complex, long-running tasks across various domains, not just coding.

Early feedback from industry partners has been positive. Michael Truell, CEO of Cursor, noted, "We're seeing state-of-the-art coding performance from Claude Sonnet 4.5, with significant improvements on longer horizon tasks."

"Claude Sonnet 4.5 amplifies GitHub Copilot's core strengths. Our initial evals show significant improvements in multi-step reasoning and code comprehension—enabling Copilot's agentic experiences to handle complex, codebase-spanning tasks better," said Mario Rodriguez, Chief Product Officer at GitHub.

Focus on Safety and Alignment

Anthropic describes Claude Sonnet 4.5 as its "most aligned frontier model yet." The company has implemented extensive safety training to reduce undesirable behaviors such as deception, power-seeking, and providing sycophantic responses. The model also features improved defenses against prompt injection attacks, a significant security concern for agentic AI systems.

The model operates under the company's AI Safety Level 3 (ASL-3) framework. This includes safety filters, known as classifiers, that are designed to detect and block potentially dangerous inputs and outputs, with a specific focus on risks related to chemical, biological, radiological, and nuclear (CBRN) materials.

Anthropic acknowledges that these classifiers may occasionally flag harmless content incorrectly, an issue known as a false positive. To mitigate this, users whose conversations are interrupted can easily continue their task with Claude Sonnet 4, a model considered to have a lower CBRN risk. The company reports it has reduced these false positives by a factor of ten since they were first introduced.

Availability and Pricing

Claude Sonnet 4.5 is available globally for developers starting today. It can be accessed through the API using the `claude-sonnet-4-5` model identifier. The pricing structure remains unchanged from the previous version, Claude Sonnet 4, at $3 per million input tokens and $15 per million output tokens.

The associated updates to Claude Code and the Claude Developer Platform are available to all respective users. The new in-app features for code execution and file creation are accessible to users on all paid plans.

Limited Research Preview

Alongside the main release, Anthropic is offering a temporary research preview called "Imagine with Claude." This five-day experiment, available to Max subscribers, demonstrates the model's ability to generate software in real time based on user requests, without pre-written code.

The company recommends that all users upgrade to Claude Sonnet 4.5, positioning it as a direct replacement for previous models that offers significantly improved performance for the same cost.