Send Chat Completion Request

Create an OpenAI-compatible chat completion.

POST /v1/chat/completions

Request Parameters

| Name | Type | Required | Description |
|---|---|---|---|
| `model` | string | Conditional | Primary model alias in `brand/model` format. Required unless `models` is provided. |
| `models` | string[] | Conditional | Fallback model list. Use when `model` is omitted, or combine with `model` for explicit fallback chains. |
| `messages` | array | Conditional | Conversation messages. Each item requires `role`; `content` may be a string or structured content parts. Required unless `input` is provided. |
| `input` | string \| array | Conditional | Responses API input. A string or an array of input items. When provided instead of `messages`, it is auto-translated to Chat Completions format internally. |
| `stream` | boolean | No | Whether to stream the response. Defaults to `false`. |
| `temperature` | number | No | Sampling temperature. Typical range: 0–2. |
| `max_tokens` | integer | No | Maximum output tokens. |
| `top_p` | number | No | Nucleus sampling. Typical range: 0–1. |
| `tools` | array | No | Tool definitions. |
| `tool_choice` | object \| string | No | Tool selection mode. |
| `response_format` | object | No | Structured response constraints. |
| `reasoning` | object | No | Reasoning controls. |
| `logprobs` | boolean | No | Include token log probabilities. |
| `top_logprobs` | integer | No | Number of top log probabilities to return per token. |
| `logit_bias` | object | No | Token bias map. |
| `max_completion_tokens` | integer | No | Maximum completion tokens. |
| `stream_options` | object | No | Streaming options. Set `stream_options.include_usage` to `true` to receive a separate final usage chunk. |
| `stop` | string \| string[] | No | Stop sequences. |
| `frequency_penalty` | number | No | Frequency penalty. |
| `presence_penalty` | number | No | Presence penalty. |
| `seed` | integer | No | Deterministic sampling seed. |
| `user` | string | No | End-user identifier for analytics and abuse controls. |

OpenRouter Routing Extensions

| Name | Type | Required | Description |
|---|---|---|---|
| `models` | string[] | No | Model fallback chain. The gateway composes a route and tries models in order. |
| `provider` | object | No | Provider preferences: `allow_fallbacks`, `order`, `only`, `ignore`. |
| `route` | string | No | Routing strategy: `fallback` or `sort`. |
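Putting the routing extensions together, a request body combining an explicit fallback chain with provider preferences might look like the following sketch. The field names come from the table above; the model and provider identifiers are only examples:

```python
# Illustrative request body using the routing extensions. Field names
# are from the table above; model/provider IDs are placeholders.
routed_request = {
    "model": "anthropic/claude-sonnet-4.5",
    "models": ["anthropic/claude-sonnet-4.5", "openai/gpt-4o"],
    "provider": {
        "allow_fallbacks": True,   # permit rerouting on provider failure
        "order": ["openai-api"],   # preferred provider ordering
    },
    "route": "fallback",           # try models in order until one succeeds
    "messages": [{"role": "user", "content": "Hi"}],
}
```

With `route: "fallback"`, the gateway tries each entry in `models` in order, subject to the `provider` preferences.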

Examples (Non-Streaming)

curl
curl -X POST https://api.therouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $THEROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4.5",
    "messages": [{"role": "user", "content": "Write a haiku about routing."}],
    "temperature": 0.7,
    "max_tokens": 180,
    "top_p": 1,
    "presence_penalty": 0,
    "frequency_penalty": 0,
    "stop": ["\n\n"],
    "seed": 42,
    "models": ["anthropic/claude-sonnet-4.5", "openai/gpt-4o"],
    "provider": {"allow_fallbacks": true, "order": ["openai-api"]},
    "route": "fallback",
    "stream": false
  }'

Examples (Streaming)

curl
curl --no-buffer -X POST https://api.therouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $THEROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Stream a short answer."}],
    "stream": true,
    "stream_options": {"include_usage": true}
  }'

Streaming Behavior

text
data: {"id":"chatcmpl_...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl_...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: {"id":"chatcmpl_...","object":"chat.completion.chunk","choices":[],"usage":{"prompt_tokens":10,"completion_tokens":4,"total_tokens":14}}

data: [DONE]

Usage is emitted as a separate final chunk with choices: [] only when stream_options.include_usage is true.
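Given the SSE framing shown above, a client can accumulate the `delta.content` fragments and capture the trailing usage chunk. A minimal parsing sketch (the helper name is illustrative, and only the chunk shapes documented on this page are assumed):

```python
import json

def parse_sse_stream(lines):
    """Accumulate delta content from chat.completion.chunk events.

    Returns (text, usage): the concatenated assistant text and the
    usage dict from the final chunk (None unless
    stream_options.include_usage was true on the request).
    """
    text_parts, usage = [], None
    for line in lines:
        if not line.startswith("data: "):
            continue
        data = line[len("data: "):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        if chunk.get("usage") is not None:
            usage = chunk["usage"]  # final usage chunk has choices: []
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                text_parts.append(content)
    return "".join(text_parts), usage

# The sample stream from above:
chunks = [
    'data: {"id":"chatcmpl_1","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    'data: {"id":"chatcmpl_1","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    'data: {"id":"chatcmpl_1","object":"chat.completion.chunk","choices":[],"usage":{"prompt_tokens":10,"completion_tokens":4,"total_tokens":14}}',
    "data: [DONE]",
]
text, usage = parse_sse_stream(chunks)
```

For the sample stream, `text` is `"Hello"` and `usage` carries the final token counts.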

Error Shape

json
{
  "error": {
    "message": "Missing required parameter: model",
    "type": "invalid_request_error",
    "code": 400,
    "param": "model"
  }
}

Error bodies do not include an error.metadata object. Request IDs are returned via the x-request-id response header.
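Since the request ID lives in a header rather than the error body, a client typically combines both when reporting failures. A minimal sketch, assuming only the error shape above (the helper name and the request-ID value are illustrative):

```python
import json

def parse_error(body: str, headers: dict) -> dict:
    """Extract the documented error fields plus the request trace ID.

    Header lookup is case-insensitive, since HTTP header casing varies.
    """
    err = json.loads(body).get("error", {})
    request_id = next(
        (v for k, v in headers.items() if k.lower() == "x-request-id"), None
    )
    return {
        "message": err.get("message"),
        "type": err.get("type"),
        "code": err.get("code"),
        "param": err.get("param"),
        "request_id": request_id,
    }

body = ('{"error": {"message": "Missing required parameter: model", '
        '"type": "invalid_request_error", "code": 400, "param": "model"}}')
# Placeholder request ID for illustration only.
info = parse_error(body, {"X-Request-Id": "00000000-0000-4000-8000-000000000000"})
```

The returned dict is convenient for logging: one record carries the user-facing message, the machine-readable type/code, and the trace ID for support requests.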

Response Headers

| Header | Description |
|---|---|
| `X-Cache` | Cache status: `HIT` or `MISS`. |
| `X-Cache-Model` | Upstream model used for a cache hit. |
| `x-request-id` | Request trace ID (UUIDv4). |
| `X-RateLimit-Limit` | Request quota for the current window. |
| `X-RateLimit-Remaining` | Remaining requests in the current window. |
| `X-RateLimit-Reset` | Unix epoch timestamp when the window resets. |
| `Retry-After` | Present on 429 responses. Value is in seconds. |
Model Requirement

Provide either model or models. Requests that include neither are rejected as invalid.