Routing Modes

Instead of hardcoding a model, set the model parameter to a routing mode and Arena selects the best AI model in real time using live leaderboard rankings — built on millions of human preference votes. Or pass a specific model name to bypass routing entirely.

Choose a routing mode

Auto Balanced

Best model for quality, speed, and cost. Good default for most use cases.

Best for: General-purpose applications, chatbots, and productivity tools.

curl https://api.preview.arena.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $GATEWAY_API_KEY" \
  -d '{
    "model": "auto",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'

Mode reference

Model valueStrategyDescription
"auto"BalancedBest model for quality, speed, and cost. Good default for most use cases.
"fast"Low LatencyPrefers faster models while maintaining a quality floor.
"claude-sonnet-4-6"DirectSends to a specific model — no routing. See all models

Session continuity

When X-Arena-Session-Mode is omitted, auto reuses the first resolved model for later turns in the same session.

Header valueBehavior
omittedFor auto, route the first turn and reuse that model for continued turns.
X-Arena-Session-Mode: per_turnRe-route each continued auto turn.
X-Arena-Session-Mode: stickyExplicit first-turn routing for the whole session.
X-Arena-Session-Mode: sticky_toolsReuse the first resolved model only while a tool loop is active.

Response headers

When using a routing mode, the gateway returns these headers to show which model was selected:

HeaderDescription
X-Arena-Routing-ModeThe routing mode used (auto, fast)
X-Arena-Resolved-ModelThe model that was actually selected to handle the request
X-Arena-Routing-TierRanking data tier used (live, live-cached, cached, static, session-pinned)
X-Arena-Trace-IDUnique request ID for debugging — include this in support requests
Arena API — Dashboard