Cut AI inference costs by up to 85%
TheRouter automatically routes AI requests to the most cost-effective provider. SiliconFlow serves DeepSeek R1 at $0.55/$2.19 per MTok (input/output) versus $3.50/$17.50 on Bedrock, which is 84% cheaper. DeepSeek V3.2 costs $0.14/$0.28 versus $0.90/$2.70 on Bedrock, which is 85% cheaper. No code changes are required, and you keep using the same standard model name.
TheRouter picks the lowest-cost provider for each model. When SiliconFlow is available for DeepSeek and Qwen models, your costs drop 40–85% versus US-region providers, with automatic failover and zero code changes.
Real cost savings — live data
SiliconFlow prices vs AWS Bedrock for the same models. Prices in $/MTok (input / output).
| Model | Bedrock (primary) | SiliconFlow (optimized) | Savings |
|---|---|---|---|
| deepseek/deepseek-r1 | $3.50 / $17.50 | $0.55 / $2.19 | 84% |
| deepseek/deepseek-v3.2 | $0.90 / $2.70 | $0.14 / $0.28 | 85% |
| qwen/qwen3-235b | $0.47 / $1.39 | $0.14 / $0.55 | 60% |
| qwen/qwen3-32b | $0.20 / $0.80 | $0.03 / $0.07 | 85% |
Prices as of March 2026. Bedrock on-demand pricing.
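As a worked check of the table, the sketch below computes the cost of a workload on each provider from the listed prices. The workload size (10 MTok input, 2 MTok output) and the helper names `workload_cost` and `savings_pct` are assumptions made for illustration. Only the prices come from the table.

```python
# Prices in $/MTok as (input, output), copied from the table above.
PRICES = {
    "deepseek/deepseek-r1": {"bedrock": (3.50, 17.50), "siliconflow": (0.55, 2.19)},
    "deepseek/deepseek-v3.2": {"bedrock": (0.90, 2.70), "siliconflow": (0.14, 0.28)},
    "qwen/qwen3-235b": {"bedrock": (0.47, 1.39), "siliconflow": (0.14, 0.55)},
    "qwen/qwen3-32b": {"bedrock": (0.20, 0.80), "siliconflow": (0.03, 0.07)},
}


def workload_cost(price, input_mtok, output_mtok):
    """Cost in dollars for a workload measured in millions of tokens."""
    input_price, output_price = price
    return input_price * input_mtok + output_price * output_mtok


def savings_pct(baseline, optimized):
    """Percentage saved when moving from a baseline cost to an optimized cost."""
    return 100.0 * (1.0 - optimized / baseline)


if __name__ == "__main__":
    # Hypothetical workload: 10 MTok of input and 2 MTok of output.
    for model, price in PRICES.items():
        before = workload_cost(price["bedrock"], 10, 2)
        after = workload_cost(price["siliconflow"], 10, 2)
        saved = savings_pct(before, after)
        print(f"{model}: ${before:.2f} -> ${after:.2f} ({saved:.0f}% saved)")
```

The input and output discounts differ for each model, so the savings you actually see depend on the token mix of your workload.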
How cost routing works
Your request arrives
You call api.therouter.ai/v1/chat/completions with a standard model ID like deepseek/deepseek-v3.2.
Route selection
TheRouter checks provider health and cost. If SiliconFlow is healthy, the request goes there (the priority 1, lower-cost route). Otherwise it falls back to Bedrock, the priority 0 primary route.
Normalized response
The response arrives with the standard model name in the model field — never the provider-specific name. reasoning_content is preserved for DeepSeek R1.
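The route-selection step can be sketched roughly as follows. This is a minimal illustration, not TheRouter's implementation: the `Route` type, the `select_route` function, and the ranking by combined list price are all assumptions, and the priority numbers mentioned above are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Route:
    provider: str
    input_price: float   # $/MTok
    output_price: float  # $/MTok
    healthy: bool


def select_route(routes):
    """Return the cheapest healthy route, or raise if none is healthy."""
    candidates = [r for r in routes if r.healthy]
    if not candidates:
        raise RuntimeError("no healthy provider for this model")
    # Assumption: rank by combined list price. A real router would likely
    # weight input and output prices by the expected token mix.
    return min(candidates, key=lambda r: r.input_price + r.output_price)


# deepseek/deepseek-v3.2 prices from the table above.
routes = [
    Route("bedrock", 0.90, 2.70, healthy=True),
    Route("siliconflow", 0.14, 0.28, healthy=True),
]
print(select_route(routes).provider)  # siliconflow

routes[1].healthy = False  # simulate a SiliconFlow outage
print(select_route(routes).provider)  # bedrock
```

Filtering on health before comparing prices is what gives the failover behavior: an unhealthy provider is never chosen, however cheap it is.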
Zero code changes
Switch from OpenAI to cost-optimized routing by changing only the client's API key and base URL:

```python
from openai import OpenAI

# Before: OpenAI directly
# client = OpenAI(api_key="sk-...", base_url="https://api.openai.com/v1")

# After: TheRouter with cost optimization
client = OpenAI(
    api_key="YOUR_THEROUTER_API_KEY",        # changed: TheRouter key
    base_url="https://api.therouter.ai/v1",  # changed: TheRouter endpoint
)

# Same calling code, up to 85% lower cost for DeepSeek models
response = client.chat.completions.create(
    model="deepseek/deepseek-v3.2",
    messages=[{"role": "user", "content": "Summarize this document."}],
)
```

Common questions
How does TheRouter reduce AI inference costs?
TheRouter maintains multiple provider routes for each model and automatically selects the most cost-effective one. For models served by more than one provider, such as DeepSeek and Qwen, SiliconFlow's China-region infrastructure offers 40–85% lower pricing than US providers like AWS Bedrock.
Will my application break when the provider switches?
No. TheRouter normalizes responses across all providers to a consistent OpenAI-compatible format. The model name in the response always reflects the standard model ID you requested — never the internal provider or upstream model name.
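To illustrate what a normalized response means for application code, here is a small sketch that reads a chat completion payload. The sample payload is hypothetical, and placing `reasoning_content` inside the message object is an assumption based on the OpenAI-compatible shape. The helper `summarize_response` is not part of any SDK.

```python
def summarize_response(payload, requested_model):
    """Extract the fields an application typically reads from a chat completion."""
    message = payload["choices"][0]["message"]
    return {
        # The text above says the model field echoes the requested standard ID.
        "model_matches": payload["model"] == requested_model,
        "content": message.get("content"),
        # Assumed location; expected only for reasoning models such as DeepSeek R1.
        "reasoning": message.get("reasoning_content"),
    }


# Hypothetical payload, shaped like an OpenAI-compatible chat completion.
sample = {
    "model": "deepseek/deepseek-r1",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "The document argues for X.",
                "reasoning_content": "First, identify the main claim...",
            },
            "finish_reason": "stop",
        }
    ],
}

result = summarize_response(sample, "deepseek/deepseek-r1")
print(result["model_matches"])  # True
```

Because the helper uses `.get` for `reasoning_content`, the same code also works for models that return no reasoning trace.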
Can I control which provider is used?
Yes. You can set a provider preference per API key from the dashboard — Auto (cost-optimized routing), US-optimized (Bedrock primary), or China-optimized (SiliconFlow primary). The default auto mode picks the lowest-cost available route.
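One way to picture the three preference modes is as different orderings of the same provider list. The sketch below is purely illustrative: the mode keys (`"auto"`, `"us"`, `"china"`), the `ordered_providers` function, and the assumption that each fixed mode keeps the other provider as a fallback are hypothetical. The preference itself is set in the dashboard, not in code.

```python
# Hypothetical mode keys standing in for the dashboard options described above.
PREFERENCE_ORDER = {
    "us": ["bedrock", "siliconflow"],     # US-optimized: Bedrock primary
    "china": ["siliconflow", "bedrock"],  # China-optimized: SiliconFlow primary
}


def ordered_providers(mode, prices):
    """Return provider names in the order they would be tried for a mode.

    `prices` maps provider name to an (input, output) price pair in $/MTok.
    """
    if mode == "auto":
        # Auto: cheapest first, using combined list price as a simple proxy.
        return sorted(prices, key=lambda name: sum(prices[name]))
    return list(PREFERENCE_ORDER[mode])


# deepseek/deepseek-v3.2 prices from the table above.
v32 = {"bedrock": (0.90, 2.70), "siliconflow": (0.14, 0.28)}
print(ordered_providers("auto", v32))  # ['siliconflow', 'bedrock']
print(ordered_providers("us", v32))    # ['bedrock', 'siliconflow']
```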
Does cost optimization affect response quality?
No. TheRouter routes to the same model via a different inference provider. The model weights and capabilities are identical — only the infrastructure differs. You get the same DeepSeek R1 output whether it runs on Bedrock or SiliconFlow.