Announcement · 2026-05-05
# May 2026 Model Wave
Six new models live on TheRouter. Kimi K2.6 ties GPT-5.5 on coding benchmarks. Qwen 3.6 closes a two-generation gap. Gemma 4 ships Apache 2.0. Mistral lands a unified Small 4 and its first TTS.
## What shipped
Between late March and late April 2026, four model families dropped meaningful new releases. We verified each one end-to-end against its primary upstream provider and added the following six aliases to the gateway. Existing aliases are unchanged — the rollout is strictly additive (we ran a byte-level snapshot regression gate to prove every pre-existing model behaves identically).
### Moonshot — Kimi K2.6
Released 2026-04-20. Kimi K2.6 is Moonshot AI's 1T-parameter Mixture-of-Experts flagship. It ties GPT-5.5 on coding benchmarks at open-weight pricing, and the agent-swarm subsystem now scales to 300 sub-agents and 4,000 coordinated steps (up from 100 / 1,500 in K2.5). Alias moonshot/kimi-k2.6.
### Alibaba — Qwen 3.6 35B-A3B
Released 2026-04-16. The 3.6 generation is hybrid multimodal with a 262K context window and significantly stronger repo-level coding compared to the 3.x line. We surface the 35B-active-3B MoE variant, which is the version available on SiliconFlow today. Alias qwen/qwen3.6-35b-a3b. (The 27B dense variant depends on DashScope, which is a future-change item.)
### Google — Gemma 4 (Apache 2.0)
Released 2026-04-02. Two server-class sizes are live: google/gemma-4-31b (dense, Arena #3 open-weight) and google/gemma-4-26b-moe (Mixture-of-Experts with 4B active per token, Arena #6). Both are Apache 2.0 — the most permissive license Google has shipped a Gemma series under. Both support text + image input.
### Mistral — Small 4
Released 2026-03. A single 119B MoE model (6B active) that unifies the capabilities of three earlier specialists: Magistral (reasoning), Pixtral (multimodal), and Devstral (agentic coding). Tools, vision, and JSON mode are all on by default. Alias mistral/mistral-small-4.
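Since TheRouter exposes an OpenAI-compatible API, a JSON-mode request against Small 4 can be sketched as below. The response_format field follows the OpenAI-style convention; treat it as an assumption rather than a documented TheRouter parameter.

```python
import json

# Hypothetical request body for mistral/mistral-small-4 with JSON mode on.
# "response_format" follows the OpenAI-compatible convention; it is an
# assumption here, not a confirmed TheRouter parameter.
payload = {
    "model": "mistral/mistral-small-4",
    "messages": [
        {"role": "user", "content": "List three EU capitals as JSON."}
    ],
    "response_format": {"type": "json_object"},
    "max_tokens": 256,
}

# Serialize exactly as curl's -d flag would send it.
body = json.dumps(payload)
print(body)
```

The same body works with any OpenAI-compatible client pointed at https://api.therouter.ai/v1.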
### Mistral — Voxtral TTS
Released 2026-03-26. Mistral's first multilingual text-to-speech model: nine languages, 30+ preset voices, low-latency streaming, and custom voice profiles via reference audio. Alias mistral/voxtral-tts. Set the voice field to a slug from GET /v1/audio/voices (e.g. en_paul_neutral).
## How to use
The new aliases work via TheRouter's OpenAI-compatible API:
```shell
curl https://api.therouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $THEROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "moonshot/kimi-k2.6",
    "messages": [{"role": "user", "content": "Write a Rust quicksort"}],
    "max_tokens": 1024
  }'
```

For Voxtral TTS, use the audio endpoint:
```shell
curl https://api.therouter.ai/v1/audio/speech \
  -H "Authorization: Bearer $THEROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral/voxtral-tts",
    "input": "Hello from TheRouter",
    "voice": "en_paul_neutral"
  }'
```

## Pricing (USD per million tokens)
| Model | Input | Output |
|---|---|---|
| moonshot/kimi-k2.6 | $0.95 | $4.50 |
| qwen/qwen3.6-35b-a3b | $0.20 | $0.80 |
| google/gemma-4-31b | $0.30 | $0.50 |
| google/gemma-4-26b-moe | $0.20 | $0.40 |
| mistral/mistral-small-4 | $0.20 | $0.60 |
| mistral/voxtral-tts | $12 per 1M characters | — |
## Existing models unchanged
Every model in our catalog from before this rollout behaves identically. We ship a snapshot regression test that asserts byte-level equivalence on the 181 pre-existing aliases (every observable field — pricing, modality, capabilities, routes — must match). The test passes.
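As a rough illustration of what a byte-level snapshot gate can look like (the catalog shape and field names below are invented for this sketch, not TheRouter's actual harness): serialize each alias's observable fields deterministically and compare the resulting bytes against the pre-rollout snapshot.

```python
import hashlib
import json

# Illustrative snapshot gate: serialize the catalog deterministically
# (sorted keys, fixed separators) and hash the bytes. Catalog shape and
# field names are invented for this sketch.
def snapshot_digest(catalog: dict) -> str:
    blob = json.dumps(catalog, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(blob).hexdigest()

before = {
    "moonshot/kimi-k2.5": {"input": 0.90, "output": 4.00, "modality": "text"},
}
# An additive rollout must leave every pre-existing entry byte-identical.
after = dict(before)
after["moonshot/kimi-k2.6"] = {"input": 0.95, "output": 4.50, "modality": "text"}

unchanged = {k: after[k] for k in before}
assert snapshot_digest(unchanged) == snapshot_digest(before)
print("snapshot gate: pass")
```

The same idea scales to the full catalog: restrict the post-rollout state to the pre-existing aliases and require digest equality.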
## Coming next
Two follow-ups are tracked but not bundled here: qwen/qwen3.6-27b (the dense 27B variant, awaiting DashScope provider integration) and google/gemma-4-e4b (effective-4B, on-device target — provider hosting TBD). We'll surface these in a future model wave once a viable upstream lands.
Try the new aliases on dashboard.therouter.ai or via API. Feedback welcome at hello@therouter.ai.