Routing Modes

Instead of hardcoding a model, set the `model` parameter to a routing mode. Arena then selects a model in real time using live leaderboard rankings built on millions of human preference votes. To bypass routing entirely, pass a specific model name instead.

Choose a routing mode

Auto — Balanced

Selects the model with the best balance of quality, speed, and cost. A good default for most use cases.

Best for: General-purpose applications, chatbots, and productivity tools.

curl https://api.preview.arena.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $GATEWAY_API_KEY" \
  -d '{
    "model": "auto",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'

Mode reference

Model value	Strategy	Description
`"auto"`	Balanced	Selects the model with the best balance of quality, speed, and cost. A good default for most use cases.
`"fast"`	Low Latency	Prefers faster models while maintaining a quality floor.
`"claude-sonnet-4-6"`	Direct	Sends the request directly to the named model, with no routing. See all models
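
The table above can be exercised from Python with the same request shape as the curl example. The following is a minimal sketch using only the standard library; the helper names `is_routed` and `build_chat_request` are hypothetical, and the code assumes that any `model` value other than `auto` or `fast` is treated as a direct model name. It builds the request without sending it.

```python
import json
import os
import urllib.request

# Endpoint taken from the curl example above.
ARENA_URL = "https://api.preview.arena.ai/v1/chat/completions"

# Routing modes listed in the mode reference table. Assumption: any other
# value is a direct model name and bypasses routing.
ROUTING_MODES = {"auto", "fast"}


def is_routed(model: str) -> bool:
    """Return True if `model` is a routing mode rather than a model name."""
    return model in ROUTING_MODES


def build_chat_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        ARENA_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Switching strategy only requires changing the model string.
api_key = os.environ.get("GATEWAY_API_KEY", "")
request = build_chat_request("fast", "Hello!", api_key)
# To send it: urllib.request.urlopen(request)
```

Keeping the mode as a plain string means an application can move between `auto`, `fast`, and a pinned model through configuration alone.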

Session continuity

When the `X-Arena-Session-Mode` header is omitted, `auto` routes the first turn and reuses the resolved model for later turns in the same session. Set the header to change this behavior.

Header value	Behavior
(omitted)	For `auto`, route the first turn and reuse that model for continued turns.
`X-Arena-Session-Mode: per_turn`	Re-route each continued `auto` turn.
`X-Arena-Session-Mode: sticky`	Explicitly pin the model resolved on the first turn for the whole session.
`X-Arena-Session-Mode: sticky_tools`	Reuse the first resolved model only while a tool loop is active.
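
As an illustration of the table above, the sketch below assembles request headers and adds `X-Arena-Session-Mode` only when a mode is given, so leaving it out falls back to the default behavior. The function name `session_headers` is hypothetical. How the gateway associates turns with a session is not described in this section, so the sketch does not attempt to cover it.

```python
from typing import Dict, Optional

# Header values listed in the session continuity table.
SESSION_MODES = {"per_turn", "sticky", "sticky_tools"}


def session_headers(api_key: str, session_mode: Optional[str] = None) -> Dict[str, str]:
    """Return request headers, adding X-Arena-Session-Mode only when given."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    if session_mode is not None:
        # Reject typos locally instead of sending an unrecognized value.
        if session_mode not in SESSION_MODES:
            raise ValueError(f"unknown session mode: {session_mode!r}")
        headers["X-Arena-Session-Mode"] = session_mode
    return headers


# Default behavior: no session header is sent.
default_headers = session_headers("example-key")

# Ask the gateway to re-route each continued `auto` turn.
per_turn_headers = session_headers("example-key", "per_turn")
```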

Response headers

When using a routing mode, the gateway returns these headers to show which model was selected:

Header	Description
`X-Arena-Routing-Mode`	The routing mode used (auto, fast)
`X-Arena-Resolved-Model`	The model that was actually selected to handle the request
`X-Arena-Routing-Tier`	Ranking data tier used (live, live-cached, cached, static, session-pinned)
`X-Arena-Trace-ID`	Unique request ID for debugging. Include it in support requests.
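
One way to use these headers is to collect them into a small record for logging. The sketch below is illustrative only: `RoutingInfo` and `read_routing_info` are hypothetical names, and the sample header values are made up rather than taken from a real response. Fields are left as `None` when a header is absent, for example when a direct model name was used.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RoutingInfo:
    mode: Optional[str]            # X-Arena-Routing-Mode
    resolved_model: Optional[str]  # X-Arena-Resolved-Model
    tier: Optional[str]            # X-Arena-Routing-Tier
    trace_id: Optional[str]        # X-Arena-Trace-ID


def read_routing_info(headers: Mapping[str, str]) -> RoutingInfo:
    """Extract the routing headers from a response header mapping."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RoutingInfo(
        mode=lowered.get("x-arena-routing-mode"),
        resolved_model=lowered.get("x-arena-resolved-model"),
        tier=lowered.get("x-arena-routing-tier"),
        trace_id=lowered.get("x-arena-trace-id"),
    )


# Sample values are invented for illustration.
sample_headers = {
    "X-Arena-Routing-Mode": "auto",
    "X-Arena-Resolved-Model": "claude-sonnet-4-6",
    "X-Arena-Routing-Tier": "live",
    "X-Arena-Trace-ID": "example-trace-id",
}
info = read_routing_info(sample_headers)
print(f"{info.mode} -> {info.resolved_model} ({info.tier}), trace {info.trace_id}")
```

With `urllib.request`, the `headers` attribute of the object returned by `urlopen` can be passed to `read_routing_info` directly, since it supports `items()`.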