Model Catalogue — Stav AI

Anthropic · claude-fable

Claude Fable 5

Anthropic's frontier Mythos-class model with always-on adaptive reasoning and a one-million-token context window, built for deep code and research analysis rather than real-time chat.

€10.00/€50.00/M in·out

Anthropic · claude-opus-4

Claude Opus 4.8

Anthropic's most capable flagship model as of May 2026, built for complex agentic workflows, graduate-level reasoning, and long-context analysis — with a one-million-token context window and opt-in extended thinking.

€13.80/€69.00/M in·out

Google · gemini

Gemini 3.5 Flash

Google DeepMind's frontier-tier multimodal Flash model with a 1M-token context window, optional extended thinking, and native support for text, image, video, audio, and file inputs — built for complex interactive workloads.

€1.50/€9.00/M in·out

Google · gemini

Gemini 3.1 Flash Lite

Google DeepMind's cost-optimised multimodal language model with a 1M-token context window, real-time latency, and optional extended thinking — built for high-throughput agentic and instruction-following workloads.

€0.25/€1.50/M in·out

OpenAI · openai-gpt-5

gpt-5.5

OpenAI's frontier agentic instruct model with a 1M+ token context window, built for complex multi-step reasoning, long-document analysis, and autonomous tool orchestration.

€5.00/€30.00/M in·out

Anthropic · claude-opus-4

Claude Opus 4.7

Anthropic's iterative frontier flagship with a one-million-token context window, graduate-level reasoning (GPQA 0.914), enhanced vision, and optional extended thinking — built for complex long-horizon tasks.

€13.80/€69.00/M in·out

OpenAI · openai-gpt-5

gpt-5.4-mini

OpenAI's efficient GPT-5-generation model with a 400K context window, tool calling, structured output, and optional reasoning traces — optimised for high-throughput agentic and coding workloads.

€0.75/€4.50/M in·out

Mistral Ai · mistral-small

mistral-small-2603

Mistral AI's 119B open-weights MoE model unifying instruct, on-demand reasoning, vision, and coding in a single Apache 2.0 checkpoint, with a 262K-token context window and interactive-class latency.

€0.09/€0.28/M in·out

OpenAI · openai-gpt-5

gpt-5.4

OpenAI's frontier chain-of-thought reasoning model, combining GPT-5 and Codex capabilities in a single system with a 1M-token context window and native vision and document input.

€2.50/€15.00/M in·out

Google · gemini

Gemini 3.1 Pro Preview

Google's frontier reasoning model with a 1M-token context window, dynamic chain-of-thought thinking, and full multimodal input — built for complex, long-horizon inference tasks.

€2.00/€12.00/M in·out

Anthropic · claude-sonnet-4

Claude Sonnet 4.6

Anthropic's hybrid reasoning Sonnet model — frontier-grade GPQA scores, API-controlled extended thinking, and a 1M-token context window in beta, delivered at interactive speeds.

€2.76/€13.80/M in·out

Anthropic · claude-opus-4

Claude Opus 4.6

Anthropic's flagship reasoning model with a 1M-token context window (beta), adaptive thinking via /effort controls, and top benchmark results on Humanity's Last Exam, Terminal-Bench 2.0, BrowseComp, and GDPval-AA. Optimised for complex agentic and long-context workloads.

€13.80/€69.00/M in·out

OpenAI · openai-gpt-5

gpt-5.3-codex

OpenAI's frontier agentic coding model combining top-tier software-engineering performance with broad professional reasoning, a 400K-token context window, and native support for reasoning traces and tool orchestration.

€1.75/€14.00/M in·out

Google · gemini

Gemini 3 Flash Preview

Google's frontier-tier Flash model with a 1M-token context, native multimodal inputs, and opt-in extended thinking — built for fast agentic workflows that need near-Pro reasoning.

€0.50/€3.00/M in·out

Mistral Ai · mistral-large

mistral-large-2512

Mistral AI's December 2025 open-weight flagship — a sparse MoE with 675B total / 41B active parameters, 256k context, native vision, and Apache 2.0 licence — built for complex instruction-following, agentic workflows, and multilingual deployments.

€1.84/€5.52/M in·out

Anthropic · claude-opus-4

Claude Opus 4.5

Anthropic's frontier Claude Opus 4.5, built for complex coding, research, and long-document reasoning with optional extended thinking.

€5.00/€25.00/M in·out

Anthropic · claude-haiku-4

Claude Haiku 4.5

Anthropic's fastest Claude 4 model — built for high-throughput agentic and coding workloads, with a 200K-token context window, extended thinking support, vision, and strong safety benchmarks.

€0.74/€3.68/M in·out

Anthropic · claude-sonnet-4

Claude Sonnet 4.5

Anthropic's instruct-tuned Claude Sonnet 4.5 for agentic coding, computer use, and long-context reasoning — now on a 200K context window and heading toward retirement.

€2.76/€13.80/M in·out

OpenAI · openai-gpt-5

gpt-5-codex

OpenAI's first-generation GPT-5 coding variant, built for agentic workflows. Offers a 400,000-token context window, configurable reasoning effort, and full tool-calling support via the Responses API — designed for developers who need sustained, multi-step code reasoning at frontier quality.

€1.25/€10.00/M in·out

Mistral Ai · magistral

magistral-medium-2509

Magistral Medium 2509 is Mistral AI's enterprise reasoning model, featuring always-on chain-of-thought Inference, vision input, and a 128k context window — available exclusively via the La Plateforme API.

€2.00/€5.00/M in·out

Mistral Ai · mistral-medium

mistral-medium-2508

Mistral Medium 3.1 is a multimodal Mixture-of-Experts model from Mistral AI with a 128K context window, combining text and image understanding with function calling, OCR, and document Q&A in an efficient mid-tier interactive package.

€0.40/€2.00/M in·out

OpenAI · openai-gpt-5

gpt-5-mini

OpenAI's compact GPT-5 variant — multimodal input, 400K context, optional extended thinking, and tool calling — tuned for interactive, medium-complexity workloads at a lighter weight than full GPT-5.

€0.25/€2.00/M in·out

OpenAI · openai-gpt-5

gpt-5-nano-2025-08-07

OpenAI's fastest GPT-5 tier — optimised for high-volume classification, summarisation, and short-completion tasks with a 400K context window, vision and file input, and optional reasoning traces.

€0.05/€0.40/M in·out

Anthropic · claude-opus-4

Claude Opus 4.1

Anthropic's agentic coding and research model with 74.5% SWE-bench Verified, 200K context, and extended thinking up to 64K tokens.

€13.80/€69.00/M in·out

Mistral Ai · codestral

codestral-2508

Mistral AI's closed-weight code specialist, built for fill-in-the-middle completion, code correction, and test generation with a 256K-token context window and interactive-latency API access.

€0.28/€0.83/M in·out

Google · gemini-2.5-flash

Gemini 2.5 Flash-Lite

Google's fastest Gemini 2.5 variant — a multimodal, 1M-context instruct model built for realtime, high-throughput workloads with optional reasoning mode.

€0.09/€0.37/M in·out

Google · gemini

Gemini 2.5 Pro

Google's flagship reasoning model with a one-million-token context window, adaptive chain-of-thought thinking, and native support for multimodal inputs, tool calling, web search grounding, and sandboxed code execution.

€1.15/€9.20/M in·out

Google · gemini

Gemini 2.5 Flash

Google DeepMind's efficiency-tier multimodal model with a 1M-token context window, optional chain-of-thought thinking, and native support for text, image, audio, video, and file inputs.

€0.28/€2.30/M in·out

OpenAI · gpt-4o-mini

gpt-4o-mini

OpenAI's compact multimodal model balancing capable instruction-following with realtime latency — the practical default for high-throughput text and vision workloads.

€0.14/€0.55/M in·out

Mistral Ai · mistral-nemo

open-mistral-nemo

A 12B open-weights instruction model co-developed by Mistral AI and NVIDIA, featuring a 131k-token context window, the efficient Tekken tokenizer, native tool calling, and multilingual support across nine languages — available under Apache 2.0 for unrestricted self-hosted inference.

€0.13/€0.13/M in·out

OpenAI · gpt-4o

gpt-4o

OpenAI's multimodal flagship with text and image input, 128K context, structured outputs, and interactive latency for complex instruction-following and vision tasks.

€2.30/€9.20/M in·out

OpenAI · gpt-4-turbo

gpt-4-turbo

OpenAI's GPT-4 Turbo is a high-quality instruction-tuned model with a 128K context window, image input support, and full tool-calling capabilities — well-suited for complex, long-context, and multimodal tasks served at interactive latency.

€9.20/€27.60/M in·out

OpenAI · gpt-3.5-turbo

gpt-3.5-turbo-instruct

OpenAI's instruction-tuned completion model for the legacy /v1/completions endpoint — a fast, mid-tier option for short-form text generation and prompt-completion workflows that don't require chat structure or tool calling.

€0.46/€1.38/M in·out

OpenAI · gpt-4

gpt-4

OpenAI's GPT-4 is a high-quality, instruction-tuned language model well-suited to complex professional tasks, tool-calling workflows, and safety-sensitive applications, served via a text-in/text-out API with an 8K context window.

€30.00/€60.00/M in·out

OpenAI · gpt-3.5-turbo

gpt-3.5-turbo

OpenAI's chat-optimised text model built for realtime, high-volume Inference, with a 16k-token context window, tool calling, and JSON mode at a mid-tier quality level.

€0.50/€1.50/M in·out

Every model worth running, side by side.

Claude Fable 5 — Anthropic's frontier Mythos-class model with always-on adaptive reasoning and a one-million-token context window, built for deep code and research analysis rather than real-time chat.