Provider profile: SiliconFlow


SiliconFlow

CN (Aliyun) · 8 models

DeepSeek & Qwen — 40-80% lower cost, China-region optimized

SiliconFlow provides optimized inference for leading Chinese open-source models — DeepSeek R1, V3.2, and the Qwen3 series — deployed on Aliyun infrastructure for 5-30ms latency from China versus 200-400ms from AWS. The cheapest way to run DeepSeek and Qwen at scale.

  • 40-80% lower cost vs AWS Bedrock for DeepSeek and Qwen models
  • 5-30ms latency from China vs 200-400ms from US data centers
  • OpenAI-compatible API — drop in your existing code
  • Deployed on Aliyun for China regulatory compliance
Tags: Reasoning · Coding · Chinese language · Low cost · Streaming

Quickstart

from openai import OpenAI

# TheRouter exposes an OpenAI-compatible endpoint, so the standard client works unchanged.
client = OpenAI(
    base_url="https://api.therouter.ai/v1",
    api_key="YOUR_THEROUTER_KEY",
)

response = client.chat.completions.create(
    model="deepseek/deepseek-r1",
    messages=[{"role": "user", "content": "Explain quantum entanglement"}],
    max_tokens=512,
)
print(response.choices[0].message.content)

Models

  • DeepSeek R1: DeepSeek's flagship reasoning model with extended chain-of-thought
  • DeepSeek V3.2: DeepSeek's latest chat model balancing speed and intelligence
  • DeepSeek V3.1: Cost-effective chat model for high-throughput workloads
  • Qwen3 235B: Alibaba's flagship MoE model
  • Qwen3 32B: Efficient mid-size Qwen3 model
  • Qwen Coder 480B: Massive coding-focused model
  • Qwen Coder 30B: Lightweight code generation specialist
  • Qwen3 8B: Fast and efficient small model

Frequently Asked Questions

Why use TheRouter for SiliconFlow instead of calling SiliconFlow directly?

TheRouter adds automatic failover (SiliconFlow fails → Bedrock takes over), a single API key for all providers, usage analytics, spend controls, and team governance — all without changing your code.

How much cheaper is SiliconFlow compared to AWS Bedrock for DeepSeek models?

SiliconFlow provides DeepSeek R1 and V3.2 at 40-80% lower cost than running the equivalent model through AWS Bedrock. Exact savings depend on your input/output token mix — check the pricing page for current rates.

Does SiliconFlow support streaming and function calling?

Yes. SiliconFlow's API is OpenAI-compatible and supports streaming, function calling (tools), and reasoning content passthrough. The reasoning_content field from DeepSeek R1 is preserved end-to-end.

What region is SiliconFlow deployed in?

SiliconFlow's inference servers are in mainland China. TheRouter's SiliconFlow provider service runs on Aliyun (Alibaba Cloud) for 5-30ms latency from Chinese users, compared to 200-400ms from AWS us-east-2.