Provider profile: SiliconFlow
SiliconFlow
CN (Aliyun) | 8 models | DeepSeek & Qwen: 40-80% lower cost, China-region optimized
SiliconFlow provides optimized inference for leading Chinese open-source models (DeepSeek R1, DeepSeek V3.2, and the Qwen3 series), deployed on Aliyun infrastructure for 5-30ms latency from China versus 200-400ms from AWS data centers in the US. It is positioned as the lowest-cost way to run DeepSeek and Qwen at scale.
- ✓ 40-80% lower cost vs AWS Bedrock for DeepSeek and Qwen models
- ✓ 5-30ms latency from China vs 200-400ms from US data centers
- ✓ OpenAI-compatible API: drop in your existing code
- ✓ Deployed on Aliyun for China regulatory compliance
Quickstart
from openai import OpenAI

# Point the OpenAI SDK at TheRouter's OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://api.therouter.ai/v1",
    api_key="YOUR_THEROUTER_KEY",
)

# Send a single chat request to DeepSeek R1.
response = client.chat.completions.create(
    model="deepseek/deepseek-r1",
    messages=[{"role": "user", "content": "Explain quantum entanglement"}],
    max_tokens=512,
)
print(response.choices[0].message.content)
Frequently Asked Questions
Why use TheRouter for SiliconFlow instead of calling SiliconFlow directly?
TheRouter adds automatic failover (if SiliconFlow fails, Bedrock takes over), a single API key for all providers, usage analytics, spend controls, and team governance, all without changing your code.
How much cheaper is SiliconFlow compared to AWS Bedrock for DeepSeek models?
SiliconFlow serves DeepSeek R1 and V3.2 at 40-80% lower cost than the equivalent models on AWS Bedrock. Exact savings depend on your mix of input and output tokens; check the pricing page for current rates.
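To make the dependence on token mix concrete, here is a minimal sketch of the arithmetic. The per-million-token prices are invented placeholders, not actual SiliconFlow or Bedrock rates, and the names Pricing, request_cost, and savings_pct are illustrative only. Substitute the figures from the pricing page before drawing conclusions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per one million tokens. Values used below are hypothetical."""
    input_per_m: float
    output_per_m: float


def request_cost(p: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Total cost in USD for a workload with the given token counts."""
    total = input_tokens * p.input_per_m + output_tokens * p.output_per_m
    return total / 1_000_000


def savings_pct(cheaper: Pricing, baseline: Pricing,
                input_tokens: int, output_tokens: int) -> float:
    """Percentage saved by using `cheaper` instead of `baseline`."""
    base = request_cost(baseline, input_tokens, output_tokens)
    cheap = request_cost(cheaper, input_tokens, output_tokens)
    return 100.0 * (1.0 - cheap / base)


# Placeholder prices, invented purely for illustration.
cheaper = Pricing(input_per_m=0.50, output_per_m=2.00)
baseline = Pricing(input_per_m=1.00, output_per_m=5.00)

# Same total volume (1M tokens), different input/output split.
prompt_heavy = savings_pct(cheaper, baseline, 900_000, 100_000)
output_heavy = savings_pct(cheaper, baseline, 100_000, 900_000)

print(f"prompt-heavy workload: {prompt_heavy:.1f}% saved")
print(f"output-heavy workload: {output_heavy:.1f}% saved")
```

With these placeholder prices, the prompt-heavy workload saves about 54% and the output-heavy one about 60%, which shows how the same price list can yield different savings depending on the workload.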
Does SiliconFlow support streaming and function calling?
Yes. SiliconFlow's API is OpenAI-compatible and supports streaming, function calling (tools), and reasoning content passthrough. The reasoning_content field from DeepSeek R1 is preserved end-to-end.
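As a rough sketch of how a client might consume a streamed response with reasoning passthrough, the helper below separates reasoning text from answer text. It assumes each streamed chunk's delta carries an optional reasoning_content attribute alongside content, as described above. The OpenAI SDK's types do not declare that field, so it is read defensively with getattr. The names split_stream, stream_answer, and fake_chunk are illustrative, and the offline demo uses stand-in objects rather than real API output.

```python
from types import SimpleNamespace
from typing import Iterable, Tuple


def split_stream(chunks: Iterable) -> Tuple[str, str]:
    """Collect (reasoning, answer) text from chat-completion stream chunks."""
    reasoning, answer = [], []
    for chunk in chunks:
        if not chunk.choices:  # some chunks may carry no choices
            continue
        delta = chunk.choices[0].delta
        # Assumption: reasoning arrives as an extra field on the delta.
        r = getattr(delta, "reasoning_content", None)
        if r:
            reasoning.append(r)
        c = getattr(delta, "content", None)
        if c:
            answer.append(c)
    return "".join(reasoning), "".join(answer)


def stream_answer(prompt: str, api_key: str) -> Tuple[str, str]:
    """Issue a streaming request through TheRouter and split the result."""
    from openai import OpenAI  # imported lazily so the demo runs offline

    client = OpenAI(base_url="https://api.therouter.ai/v1", api_key=api_key)
    stream = client.chat.completions.create(
        model="deepseek/deepseek-r1",
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    return split_stream(stream)


def fake_chunk(content=None, reasoning=None):
    """Stand-in for a stream chunk, used only for the offline demo."""
    delta = SimpleNamespace(content=content, reasoning_content=reasoning)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [
    fake_chunk(reasoning="Think. "),
    fake_chunk(reasoning="Done."),
    fake_chunk(content="Hello"),
    fake_chunk(content=" world"),
]
demo_reasoning, demo_answer = split_stream(demo)
print(demo_reasoning)
print(demo_answer)
```

Keeping the chunk handling in split_stream, separate from the network call, makes the parsing logic easy to check without an API key.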
What region is SiliconFlow deployed in?
SiliconFlow's inference servers are in mainland China. TheRouter's SiliconFlow provider service runs on Aliyun (Alibaba Cloud), giving 5-30ms latency for users in China, compared to 200-400ms from AWS us-east-2.