Reasoning

Control model reasoning depth and monitor token costs in complex tasks.

POST /v1/responses
Name                Type     Description
reasoning.effort    string   low | medium | high
reasoning.summary   string   none | auto to request reasoning summaries
max_output_tokens   integer  Cap on response size after reasoning

Reasoning Request

JSON
{
  "model": "openai/o3-mini",
  "input": "Plan a zero-downtime migration strategy for a payments database.",
  "reasoning": {
    "effort": "high",
    "summary": "auto"
  },
  "max_output_tokens": 600
}
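A request body like the one above can be assembled and validated before sending. The sketch below is illustrative, not part of the API: the `build_reasoning_request` helper and its effort validation are assumptions for this example; only the field names come from the table above.

```python
import json

def build_reasoning_request(prompt: str, effort: str = "medium",
                            summary: str = "auto",
                            max_output_tokens: int = 600) -> dict:
    """Assemble a /v1/responses payload with reasoning controls.

    Hypothetical helper: field names match the parameter table;
    the model slug is just an example.
    """
    if effort not in ("low", "medium", "high"):
        raise ValueError(f"invalid reasoning.effort: {effort!r}")
    return {
        "model": "openai/o3-mini",
        "input": prompt,
        "reasoning": {"effort": effort, "summary": summary},
        "max_output_tokens": max_output_tokens,
    }

payload = build_reasoning_request(
    "Plan a zero-downtime migration strategy for a payments database.",
    effort="high",
)
print(json.dumps(payload, indent=2))
```

POST the serialized payload to your endpoint with your usual HTTP client and auth header.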
reasoning-usage.json
{
  "usage": {
    "input_tokens": 35,
    "output_tokens": 412,
    "reasoning_tokens": 228,
    "total_tokens": 675
  }
}
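The usage block can be inspected programmatically to track reasoning overhead. The sketch below assumes, as in the sample above, that total_tokens = input + output + reasoning; whether a provider reports reasoning tokens inside or alongside output_tokens varies, so check this against your provider's accounting.

```python
# Usage numbers copied from the sample response above.
usage = {
    "input_tokens": 35,
    "output_tokens": 412,
    "reasoning_tokens": 228,
    "total_tokens": 675,
}

# Sanity check: in this sample, total = input + output + reasoning.
assert (usage["input_tokens"] + usage["output_tokens"]
        + usage["reasoning_tokens"]) == usage["total_tokens"]

# Share of the total budget consumed by hidden reasoning.
reasoning_share = usage["reasoning_tokens"] / usage["total_tokens"]
print(f"reasoning share: {reasoning_share:.1%}")
```

A share this high (about a third of all tokens) is a signal to try a lower effort setting for similar tasks.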
reasoning.summary Provider Support
The reasoning.summary parameter is accepted in requests but may not be passed through to all upstream providers. If the target model or provider does not support reasoning summaries, the parameter is silently ignored.
Reasoning Cost Tradeoff
Higher reasoning effort improves reliability on difficult tasks but increases latency and token usage. Start at medium and scale up only when needed.
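One way to apply that advice is an effort ladder: attempt the task at medium and retry at high only when the response comes back incomplete. A minimal sketch, where `send` is a hypothetical callable wrapping the /v1/responses request and the "incomplete" status check is an assumption about the response shape:

```python
EFFORT_LADDER = ["medium", "high"]  # start cheap, escalate only when needed

def solve_with_escalation(send, prompt: str) -> dict:
    """Retry at higher reasoning effort if the response is incomplete.

    `send(prompt, effort)` is a hypothetical callable returning a
    response dict with a "status" field.
    """
    for effort in EFFORT_LADDER:
        response = send(prompt, effort)
        if response.get("status") != "incomplete":
            return response
    return response  # best attempt, even if still incomplete

# Stub provider: pretend medium effort truncates and high succeeds.
def fake_send(prompt, effort):
    return {"status": "complete" if effort == "high" else "incomplete",
            "effort": effort}

result = solve_with_escalation(fake_send, "hard task")
print(result["effort"])  # → high
```

Against a real endpoint, pair this with max_output_tokens so an escalated retry cannot blow the response budget.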