Anthropic's agentic coding and research model, scoring 74.5% on SWE-bench Verified, with a 200K-token context window and extended thinking up to 64K tokens.
Anthropic's frontier Claude Opus 4.5, built for complex coding, research, and long-document reasoning with optional extended thinking.
Anthropic's flagship reasoning model with a 1M-token context window (beta), adaptive thinking via /effort controls, and top benchmark results on Humanity's Last Exam, Terminal-Bench 2.0, BrowseComp, and GDPval-AA. Optimised for complex agentic and long-context workloads.
Anthropic's iterative frontier flagship with a one-million-token context window, graduate-level reasoning (GPQA 0.914), enhanced vision, and optional extended thinking — built for complex long-horizon tasks.
Anthropic's most capable flagship model as of May 2026, built for complex agentic workflows, graduate-level reasoning, and long-context analysis — with a one-million-token context window and opt-in extended thinking.
Each axis is the mean score across the family’s variants that have been scored on that dimension. Per-axis sample size is shown next to each label — the family currently aggregates up to 5 variants per axis.
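The per-axis averaging described above can be sketched as follows. This is a minimal illustration, not the catalogue's actual implementation: the function name `axis_means`, the dictionary layout, and the axis names and scores in the example are all hypothetical. It assumes an unscored axis is either missing or `None` and is skipped rather than counted as zero.

```python
from statistics import mean

def axis_means(variants):
    """Return {axis: (mean score, sample size)} over the variants scored on each axis."""
    by_axis = {}
    for scores in variants:
        for axis, value in scores.items():
            if value is not None:  # unscored axes are skipped, not treated as zero
                by_axis.setdefault(axis, []).append(value)
    return {axis: (mean(vals), len(vals)) for axis, vals in by_axis.items()}

# Hypothetical family of three variants; scores are made up for illustration.
example = [
    {"coding": 0.8, "reasoning": 0.9},
    {"coding": 0.6, "reasoning": None},
    {"reasoning": 0.7},
]
result = axis_means(example)
```

The second element of each tuple is the sample size that would be displayed next to the axis label.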
Values are aggregated across the family’s variants. If any variant supports a capability, the family resolves to Supported; if support is flag-driven, the family resolves to Optional; the family renders as Not supported only when every variant explicitly denies the capability. 15 of 15 capabilities have variant data so far.
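The resolution rule above can be sketched as a small function. This is an illustrative sketch under stated assumptions, not the catalogue's own code: the names `Support` and `resolve_family` are hypothetical, Supported is assumed to take precedence over Optional, and a family with missing variant data and no supporting variant is assumed to resolve to `None` (no value), a case the text does not specify.

```python
from enum import Enum
from typing import Iterable, Optional

class Support(Enum):
    SUPPORTED = "Supported"
    OPTIONAL = "Optional"          # flag-driven support
    NOT_SUPPORTED = "Not supported"

def resolve_family(variant_states: Iterable[Optional[Support]]) -> Optional[Support]:
    """Resolve one capability for a family; None in the input means 'no data' for that variant."""
    states = list(variant_states)
    if Support.SUPPORTED in states:
        return Support.SUPPORTED       # any supporting variant wins
    if Support.OPTIONAL in states:
        return Support.OPTIONAL        # assumption: Optional ranks below Supported
    if states and all(s is Support.NOT_SUPPORTED for s in states):
        return Support.NOT_SUPPORTED   # only when every variant explicitly denies
    return None                        # assumption: undetermined when data is missing

mixed = resolve_family([Support.NOT_SUPPORTED, Support.OPTIONAL, Support.SUPPORTED])
flagged = resolve_family([Support.NOT_SUPPORTED, Support.OPTIONAL])
denied = resolve_family([Support.NOT_SUPPORTED, Support.NOT_SUPPORTED])
partial = resolve_family([Support.NOT_SUPPORTED, None])
```

Here `mixed` resolves to Supported, `flagged` to Optional, `denied` to Not supported, and `partial` stays undetermined because one variant has no data.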