Loading the catalogue…
Loading the catalogue…
One OpenAI-compatible API sits in front of every model. Here’s what happens to each request — classified, checked against your policy, routed to the right model, and failed over if a provider is down.
Point the OpenAI SDK at api.stav.ai/v1. No new client, no rewrite.
Each request is scored for complexity and sensitivity — no prompt is stored.
Your sovereignty rules are enforced before dispatch, not after.
Sent to the cheapest capable model, with failover if a provider is down.
The router picks by cost, capability and your policy — so routine work runs on cheap sovereign models and only the hard requests reach the frontier.
Classification, extraction and summarisation go to fast sovereign models for fractions of a cent; frontier only when the task earns it.
Unless you escalate, requests stay on EU-hosted models under European jurisdiction — enforced before dispatch, not after.
Use model="auto" for zero-config routing, pin @team/sovereign for regulated workloads, or define your own rules — below.
Beyond `auto`, create named, addressable routers — `@team/sovereign`, `@team/coding` — each with rules you define. Your team’s default is what `model="auto"` resolves to.
sovereign-only, or permit frontier for specific cases@team/nameAlready have a contract with a provider? Attach your own key to any model in the catalogue. Matching requests route through your account — your pricing, your rate limits, your capacity — and the provider bills you directly. Stav still handles routing, failover, logging and compliance.
The provider bills you directly at your negotiated rate — Stav never marks up your tokens. A small platform fee covers routing and governance.
Point a router at byok_only to force your key, or let it fall back to Stav’s pooled capacity when your quota runs out.
BYOK traffic flows through the same logs, analytics and compliance trail — and inherits the provider’s sovereignty label.
When a provider is rate-limited or down, the gateway re-routes to the next capable model automatically.
A 429 or 5xx from one provider transparently retries on the next model that satisfies your policy.
Sovereign models run on dedicated EU GPUs; commercial models pool across providers for headroom.
One key, every model, full sovereignty. Start in minutes with the SDK you already use.