Reasoning
Control model reasoning depth and monitor token costs in complex tasks.
POST /v1/responses
| Name | Type | Required | Description |
|---|---|---|---|
| `reasoning.effort` | string | No | One of `low`, `medium`, or `high` |
| `reasoning.summary` | string | No | `none` or `auto` to request reasoning summaries |
| `max_output_tokens` | integer | No | Cap on response size after reasoning |
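The parameters above can be combined in a single call. A minimal Python sketch using only the standard library; the base URL and the `API_KEY` environment variable are placeholders for illustration, not part of the API:

```python
import json
import os
import urllib.request

# Placeholder endpoint and credential; substitute your provider's values.
URL = "https://api.example.com/v1/responses"
API_KEY = os.environ.get("API_KEY", "sk-placeholder")

payload = {
    "model": "openai/o3-mini",
    "input": "Plan a zero-downtime migration strategy for a payments database.",
    "reasoning": {"effort": "high", "summary": "auto"},  # see parameter table
    "max_output_tokens": 600,  # cap on the response after reasoning
}

req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(req) would send the request; omitted here.
print(json.dumps(payload, indent=2))
```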
Reasoning Request

```json
{
  "model": "openai/o3-mini",
  "input": "Plan a zero-downtime migration strategy for a payments database.",
  "reasoning": {
    "effort": "high",
    "summary": "auto"
  },
  "max_output_tokens": 600
}
```

reasoning-usage.json
```json
{
  "usage": {
    "input_tokens": 35,
    "output_tokens": 412,
    "reasoning_tokens": 228,
    "total_tokens": 675
  }
}
```

reasoning.summary Provider Support
The `reasoning.summary` parameter is accepted in requests but may not be passed through to all upstream providers. If the target model or provider does not support reasoning summaries, the parameter is silently ignored.

Reasoning Cost Tradeoff
Higher reasoning effort improves reliability for difficult tasks but increases latency and token usage. Start at `medium` and scale up only when needed.
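As a sanity check on the usage example above: in that response, `total_tokens` is the sum of input, output, and reasoning tokens, so the share of the bill attributable to reasoning is easy to compute. (Whether a provider reports reasoning tokens as a separate line item or folds them into `output_tokens` may vary; the arithmetic below follows the example shown here.)

```python
usage = {
    "input_tokens": 35,
    "output_tokens": 412,
    "reasoning_tokens": 228,
    "total_tokens": 675,
}

# In this example, total = input + output + reasoning.
parts = usage["input_tokens"] + usage["output_tokens"] + usage["reasoning_tokens"]
assert parts == usage["total_tokens"]

# Fraction of the total spend attributable to reasoning alone.
reasoning_share = usage["reasoning_tokens"] / usage["total_tokens"]
print(f"reasoning share: {reasoning_share:.0%}")  # reasoning share: 34%
```

At `effort: "high"`, roughly a third of the tokens in this example are reasoning tokens, which is why the section recommends starting at `medium`.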