How the gateway works.

One OpenAI-compatible API sits in front of every model. Here’s what happens to each request — classified, checked against your policy, routed to the right model, and failed over if a provider is down.

Get API key Read the docs

Lifecycle

From request to response.

Step 1

One endpoint

Point the OpenAI SDK at api.stav.ai/v1. No new client, no rewrite.

Step 2

Classified

Each request is scored for complexity and sensitivity — no prompt is stored.

Step 3

Policy-checked

Your sovereignty rules are enforced before dispatch, not after.

Step 4

Routed & returned

Sent to the cheapest capable model, with failover if a provider is down.

Routing

Each request goes to the right model.

The router picks by cost, capability and your policy — so routine work runs on cheap sovereign models and only the hard requests reach the frontier.

Cheapest capable model

Classification, extraction and summarisation go to fast sovereign models for fractions of a cent; frontier only when the task earns it.

Sovereign by default

Unless you escalate, requests stay on EU-hosted models under European jurisdiction — enforced before dispatch, not after.

Named, addressable routers

Use model="auto" for zero-config routing, pin @team/sovereign for regulated workloads, or define your own rules — below.

Inference Routers

Build your own routing rules.

Beyond `auto`, create named, addressable routers — `@team/sovereign`, `@team/coding` — each with rules you define. Your team’s default is what `model="auto"` resolves to.

Allow-list the exact models a router may use
Set sovereign-only, or permit frontier for specific cases
Route by sensitivity, domain, cost or context length
Define the fallback order, then call it as @team/name

Read the routing docs

Provider BYOK

Or bring your own provider keys.

Already have a contract with a provider? Attach your own key to any model in the catalogue. Matching requests route through your account — your pricing, your rate limits, your capacity — and the provider bills you directly. Stav still handles routing, failover, logging and compliance.

Your account, your terms

The provider bills you directly at your negotiated rate — Stav never marks up your tokens. A small platform fee covers routing and governance.

Routers honour your keys

Point a router at byok_only to force your key, or let it fall back to Stav’s pooled capacity when your quota runs out.

Same governance, inherited sovereignty

BYOK traffic flows through the same logs, analytics and compliance trail — and inherits the provider’s sovereignty label.

Reliability

Failover you don’t manage.

When a provider is rate-limited or down, the gateway re-routes to the next capable model automatically.

Automatic failover

A 429 or 5xx from one provider transparently retries on the next model that satisfies your policy.

Pooled capacity

Sovereign models run on dedicated EU GPUs; commercial models pool across providers for headroom.

Build on a gateway that thinks in Europe.

One key, every model, full sovereignty. Start in minutes with the SDK you already use.

Get API key Read the docs