ZDNET’s key takeaways
- Anthropic launched Claude Opus 4.1.
- The model outperforms its predecessor on agentic tasks, coding, and reasoning.
- It is available to paid Claude users and through Claude Code, the API, Amazon Bedrock, and Google Cloud’s Vertex AI.
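For developers, access goes through Anthropic's Messages API. Below is a minimal sketch of how a request payload for the new model might be assembled; the model identifier `claude-opus-4-1` is an assumption here, so check Anthropic's documentation for the exact ID before use.

```python
import json

def build_opus_request(prompt: str, max_tokens: int = 1024) -> str:
    """Build a JSON payload for Anthropic's Messages API.

    The model ID "claude-opus-4-1" is an assumption for illustration;
    consult Anthropic's model list for the current identifier.
    """
    payload = {
        "model": "claude-opus-4-1",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return json.dumps(payload)

# The resulting payload would be POSTed to the Messages endpoint
# (https://api.anthropic.com/v1/messages) with an API key header.
req = build_opus_request("Summarize this changelog.")
```

The same payload shape works unchanged when the model is served through Amazon Bedrock or Vertex AI wrappers, though each platform has its own authentication and endpoint conventions.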
In May, Anthropic released Claude Opus 4, which the company dubbed its most powerful model yet and the best coding model in the world. Only three months later, Anthropic is upping the ante further by launching the highly anticipated Claude Opus 4.1, which now takes its predecessor’s crown as Anthropic’s most advanced model.
The Opus family comprises the company’s most advanced, intelligent AI models, geared toward tackling complex problems. Accordingly, Claude Opus 4.1, released on Tuesday, excels at those tasks and even one-ups its predecessor on agentic tasks, real-world coding, and reasoning, according to Anthropic.
The model also arrives as the industry expects OpenAI to launch GPT-5 soon.
Also: OpenAI could launch GPT-5 any minute now – what to expect
How does Claude Opus 4.1 perform?
One of the most impressive showings from Claude Opus 4 was its performance on SWE-bench Verified, a human-filtered subset of SWE-bench, a benchmark that evaluates LLMs’ ability to solve real-world software engineering tasks sourced from GitHub. That result backed the claim that Opus 4 was the “best coding model in the world,” and according to Anthropic, Opus 4.1 scores even higher.
Claude Opus 4.1 also beat its predecessors across the benchmark board, including MMMLU, which tests multilingual capabilities; AIME 2025, which tests rigor on high school math competition questions; GPQA, which measures performance on graduate-level reasoning prompts; and more. When pitted against competing reasoning models, including OpenAI o3 and Gemini 2.5 Pro, it outperforms them on various benchmarks, including SWE-bench Verified.