DeepSeek V4 Now Available on TheRouter — Direct API Integration
DeepSeek released V4 Flash and V4 Pro today — their most powerful open-source models to date. Both are already live on TheRouter with day-one support.
Both arrive via direct DeepSeek API integration. V4 Flash: 284B MoE with 13B active parameters, 1M context, 384K max output, $0.14/$0.28 per MTok. V4 Pro: 1.6T MoE with 49B active parameters, 1M context, 384K max output, $1.74/$3.48 per MTok. Both models feature Hybrid Attention Architecture and Engram conditional memory, and are Apache 2.0 licensed with weights on Hugging Face. Model IDs: deepseek/deepseek-v4-flash, deepseek/deepseek-v4-pro.
V4 Flash — Best Value for Everyday Tasks
- 284B MoE, 13B active — Mixture of Experts architecture with only 13B parameters active per forward pass, keeping inference fast and cost low.
- 1M context, 384K max output — process entire codebases or long documents in a single request with massive output capacity.
- Default thinking mode — built-in chain-of-thought reasoning enabled by default for better accuracy.
- $0.14 / $0.28 per MTok (input/output) — among the most affordable reasoning models available.
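The context and output limits above imply a simple token budget per request. A rough sketch (assuming the 1M window covers input plus output combined; exact accounting can vary by provider):

```python
# Rough token budgeting for V4 Flash's 1M-token context window.
# Assumption: the context window covers input + output tokens combined;
# provider-side accounting may differ.
CONTEXT_WINDOW = 1_000_000
MAX_OUTPUT = 384_000

def max_input_tokens(reserved_output: int = MAX_OUTPUT) -> int:
    """Tokens left for the prompt after reserving room for the reply."""
    if reserved_output > MAX_OUTPUT:
        raise ValueError(f"reserved_output cannot exceed {MAX_OUTPUT}")
    return CONTEXT_WINDOW - reserved_output

print(max_input_tokens())      # 616000 with the full 384K output reserved
print(max_input_tokens(4096))  # 995904 when a short reply suffices
```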
V4 Pro — Complex Reasoning Powerhouse
- 1.6T MoE, 49B active — the largest open-source MoE model released to date, approaching the performance of Claude Opus 4.6 in non-thinking mode.
- 1M context, 384K max output — same generous context and output limits as V4 Flash.
- $1.74 / $3.48 per MTok (input/output) — competitive pricing for a model at this capability level.
Benchmarks
| Benchmark | V4 Pro | V4 Flash | Claude Opus 4.6 |
|---|---|---|---|
| SWE-bench Verified | 80.6% | 79.0% | 80.8% |
| LiveCodeBench | 93.5 | — | — |
| Codeforces Rating | 3206 | — | — |
V4 Pro scores 93.5 on LiveCodeBench and posts a Codeforces rating of 3206. On SWE-bench Verified, it trails Claude Opus 4.6 by just 0.2 points (80.6% vs 80.8%).
Architecture
- Hybrid Attention Architecture — combines efficient attention mechanisms for handling both short and ultra-long sequences.
- Engram conditional memory — enables efficient processing of 1M context windows without proportional compute scaling.
- MoE with low active params — keeps inference costs dramatically lower than dense models of equivalent total parameter count.
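The "low active params" point can be illustrated with a toy top-k router: every token sees only a few experts, so compute scales with the active parameters (13B for V4 Flash), not the total (284B). This is a simplified sketch, not DeepSeek's actual routing code:

```python
import math

# Toy MoE routing: a gate scores all experts, but only the top-k run
# per token. Simplified illustration; not DeepSeek's actual router.

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route(gate_logits, k=2):
    """Return (expert_index, normalized_weight) pairs for the top-k experts."""
    probs = softmax(gate_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in top)
    return [(i, probs[i] / total) for i in top]

# 8 experts, but only 2 activate for this token:
print(route([0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3], k=2))
```

Only the selected experts' weights are touched for that token, which is why a 284B-total model can bill and run like a 13B one.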
Pricing
| Model | Input | Output | Context |
|---|---|---|---|
| V4 Flash | $0.14/MTok | $0.28/MTok | 1M |
| V4 Pro | $1.74/MTok | $3.48/MTok | 1M |
V4 Flash is one of the most cost-effective reasoning models available. V4 Pro offers frontier-level coding at a fraction of closed-source pricing.
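To make the table concrete, here is a small per-request cost estimate from the listed per-MTok prices (prices from the table above; actual billing may include other factors such as cached input):

```python
# Estimate per-request cost from the listed per-MTok prices.
# Assumption: cost = tokens * price / 1M, with no discounts or extras.
PRICES = {
    "deepseek/deepseek-v4-flash": (0.14, 0.28),  # ($/MTok input, $/MTok output)
    "deepseek/deepseek-v4-pro":   (1.74, 3.48),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    inp, out = PRICES[model]
    return (input_tokens * inp + output_tokens * out) / 1_000_000

# A large request: 100K tokens in, 8K tokens out.
print(f"{request_cost('deepseek/deepseek-v4-flash', 100_000, 8_000):.4f}")  # 0.0162
print(f"{request_cost('deepseek/deepseek-v4-pro', 100_000, 8_000):.4f}")    # 0.2018
```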
How to Use It
Use the standard model names — TheRouter handles routing automatically:
```shell
curl https://api.therouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $THE_ROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-v4-flash",
    "messages": [{"role": "user", "content": "Explain the MoE architecture"}],
    "max_tokens": 4096
  }'
```
For V4 Pro, use deepseek/deepseek-v4-pro. Both models are available on the Global endpoint (api.therouter.ai) and the China endpoint (airouter-api.mizone.me).
Open Source
Both V4 Flash and V4 Pro are released under the Apache 2.0 license with full model weights available on Hugging Face. You can self-host, fine-tune, or use them commercially without restrictions.
Getting Started
Already on TheRouter? Just set the model to deepseek/deepseek-v4-flash or deepseek/deepseek-v4-pro — no other changes needed.
Questions? Reach out on GitHub.