Reasoning Tokens

Use reasoning intentionally for quality and cost control

Reasoning-enabled models can improve answer quality for complex tasks, but they consume extra output tokens. Treat reasoning as a controllable budget, not a default.

Reference payload

Use this baseline request shape and adapt model, provider sort strategy, and token limits to your workload.

request.json
{
  "model": "anthropic/claude-sonnet-4.5",
  "reasoning": {
    "max_tokens": 2000,
    "exclude": false
  }
}
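The same shape can be built programmatically with a guard on the budget. A minimal sketch, assuming a hypothetical `buildReasoningRequest` helper (not part of any SDK) and an illustrative cap:

```typescript
interface ReasoningConfig {
  max_tokens: number;
  exclude: boolean;
}

interface RequestPayload {
  model: string;
  reasoning: ReasoningConfig;
}

const REASONING_CAP = 4000; // illustrative upper bound on reasoning spend

// Hypothetical helper: clamps the reasoning budget so a misconfigured
// caller cannot silently blow up output-token cost.
function buildReasoningRequest(
  model: string,
  reasoningBudget: number,
  excludeFromResponse = false
): RequestPayload {
  return {
    model,
    reasoning: {
      max_tokens: Math.min(Math.max(reasoningBudget, 0), REASONING_CAP),
      exclude: excludeFromResponse,
    },
  };
}

const req = buildReasoningRequest("anthropic/claude-sonnet-4.5", 2000);
```

Clamping at request-construction time keeps the budget decision in one place instead of scattering token limits across call sites.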

Configuration examples

TheRouter.ai keeps request semantics consistent across providers, so you can tune behavior without rewriting your app layer.

TypeScript
const payload = {
  model: "openai/o3",
  messages: [{ role: "user", content: "Solve this scheduling problem" }],
  // effort accepts "low" | "medium" | "high"; set exclude to true to
  // apply the reasoning budget without returning the trace in the response.
  reasoning: { effort: "high", exclude: false },
};
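To treat effort as a knob rather than a hardcoded default, it can be derived per request. A sketch, assuming a caller-supplied complexity score in [0, 1]; the `effortForTask` helper and its thresholds are illustrative, not an API:

```typescript
type Effort = "low" | "medium" | "high";

// Hypothetical mapping from a rough task-complexity score to an effort
// level; tune the thresholds against your own traffic.
function effortForTask(complexity: number): Effort {
  if (complexity < 0.3) return "low";
  if (complexity < 0.7) return "medium";
  return "high";
}

const schedulingPayload = {
  model: "openai/o3",
  messages: [{ role: "user", content: "Solve this scheduling problem" }],
  reasoning: { effort: effortForTask(0.9), exclude: false },
};
```

This keeps cheap requests cheap by default and reserves high effort for the tasks that earn it.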

Production note

Operate with guardrails
Higher reasoning budgets can reduce hallucinations on hard tasks, but they also significantly increase output-token spend.

Use the activity feed and usage exports to validate that these settings improve reliability and cost in your real traffic mix.
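When auditing usage exports, one quick guardrail is to track reasoning's share of total output tokens. A sketch assuming a simplified export row; `UsageRecord` and its field names are assumptions, so map them to your actual export schema:

```typescript
// Assumed shape of one row in a usage export (illustrative only).
interface UsageRecord {
  completion_tokens: number;
  reasoning_tokens: number;
}

// Fraction of output tokens spent on reasoning across a batch of records.
function reasoningShare(records: UsageRecord[]): number {
  const totals = records.reduce(
    (acc, r) => ({
      completion: acc.completion + r.completion_tokens,
      reasoning: acc.reasoning + r.reasoning_tokens,
    }),
    { completion: 0, reasoning: 0 }
  );
  return totals.completion === 0 ? 0 : totals.reasoning / totals.completion;
}
```

Alerting when this ratio drifts upward catches runaway reasoning budgets before they dominate your bill.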