Announcement · 2026-05-05
# May 2026 Model Wave
Six new models live on TheRouter. Kimi K2.6 ties GPT-5.5 on coding benchmarks. Qwen 3.6 closes a two-generation gap. Gemma 4 ships Apache 2.0. Mistral lands a unified Small 4 and its first TTS.
## What shipped
Between late March and late April 2026, four model families dropped meaningful new releases. We verified each one end-to-end against its primary upstream provider and added the following six aliases to the gateway. Existing aliases are unchanged — the rollout is strictly additive (we ran a byte-level snapshot regression gate to prove every pre-existing model behaves identically).
### Moonshot — Kimi K2.6
Released 2026-04-20. Kimi K2.6 is Moonshot AI's 1T-parameter Mixture-of-Experts flagship. It ties GPT-5.5 on coding benchmarks at open-weight pricing, and the agent-swarm subsystem now scales to 300 sub-agents and 4,000 coordinated steps (up from 100 / 1,500 in K2.5). Alias moonshot/kimi-k2.6.
### Alibaba — Qwen 3.6 35B-A3B
Released 2026-04-16. The 3.6 generation is hybrid multimodal with a 262K context window and significantly stronger repo-level coding compared to the 3.x line. We surface the 35B-active-3B MoE variant, which is the version available on SiliconFlow today. Alias qwen/qwen3.6-35b-a3b. (The 27B dense variant depends on DashScope, which is a future-change item.)
### Google — Gemma 4 (Apache 2.0)
Released 2026-04-02. Two server-class sizes are live: google/gemma-4-31b (dense, Arena #3 open-weight) and google/gemma-4-26b-moe (Mixture-of-Experts with 4B active per token, Arena #6). Both are Apache 2.0 — the most permissive license Google has shipped a Gemma series under. Both support text + image input.
### Mistral — Small 4
Released 2026-03. A single 119B MoE model (6B active) that unifies the capabilities of three earlier specialists: Magistral (reasoning), Pixtral (multimodal), and Devstral (agentic coding). Tools, vision, and JSON mode are all on by default. Alias mistral/mistral-small-4.
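Since TheRouter exposes an OpenAI-compatible API, a JSON-mode request against Small 4 can be sketched as below. The response_format field follows the OpenAI-style convention; treat it as an assumption rather than a documented TheRouter parameter.

```python
import json

# Hypothetical request body for mistral/mistral-small-4 with JSON mode on.
# "response_format" follows the OpenAI-compatible convention; it is an
# assumption here, not a confirmed TheRouter parameter.
payload = {
    "model": "mistral/mistral-small-4",
    "messages": [
        {"role": "user", "content": "List three EU capitals as JSON."}
    ],
    "response_format": {"type": "json_object"},
    "max_tokens": 256,
}

# Serialize exactly as curl's -d flag would send it.
body = json.dumps(payload)
print(body)
```

The same body works with any OpenAI-compatible client pointed at https://api.therouter.ai/v1.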
### Mistral — Voxtral TTS
Released 2026-03-26. Mistral's first multilingual text-to-speech model: nine languages, 30+ preset voices, low-latency streaming, and custom voice profiles via reference audio. Alias mistral/voxtral-tts. Set the voice field to a slug from GET /v1/audio/voices (e.g. en_paul_neutral).
## How to use
The new aliases work via TheRouter's OpenAI-compatible API:
```shell
curl https://api.therouter.ai/v1/chat/completions \
  -H "Authorization: Bearer $THEROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "moonshot/kimi-k2.6",
    "messages": [{"role": "user", "content": "Write a Rust quicksort"}],
    "max_tokens": 1024
  }'
```

For Voxtral TTS, use the audio endpoint:
```shell
curl https://api.therouter.ai/v1/audio/speech \
  -H "Authorization: Bearer $THEROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral/voxtral-tts",
    "input": "Hello from TheRouter",
    "voice": "en_paul_neutral"
  }'
```

## Pricing (USD per million tokens)
| Model | Input | Output |
|---|---|---|
| moonshot/kimi-k2.6 | $0.95 | $4.50 |
| qwen/qwen3.6-35b-a3b | $0.20 | $0.80 |
| google/gemma-4-31b | $0.30 | $0.50 |
| google/gemma-4-26b-moe | $0.20 | $0.40 |
| mistral/mistral-small-4 | $0.20 | $0.60 |
| mistral/voxtral-tts | $12 per 1M characters | — |
## Existing models unchanged
Every model in our catalog from before this rollout behaves identically. We ship a snapshot regression test that asserts byte-level equivalence on the 181 pre-existing aliases (every observable field — pricing, modality, capabilities, routes — must match). The test passes.
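As a rough illustration of what a byte-level snapshot gate can look like (the catalog shape and field names below are invented for this sketch, not TheRouter's actual harness): serialize each alias's observable fields deterministically and compare the resulting bytes against the pre-rollout snapshot.

```python
import hashlib
import json

# Illustrative snapshot gate: serialize the catalog deterministically
# (sorted keys, fixed separators) and hash the bytes. Catalog shape and
# field names are invented for this sketch.
def snapshot_digest(catalog: dict) -> str:
    blob = json.dumps(catalog, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(blob).hexdigest()

before = {
    "moonshot/kimi-k2.5": {"input": 0.90, "output": 4.00, "modality": "text"},
}
# An additive rollout must leave every pre-existing entry byte-identical.
after = dict(before)
after["moonshot/kimi-k2.6"] = {"input": 0.95, "output": 4.50, "modality": "text"}

unchanged = {k: after[k] for k in before}
assert snapshot_digest(unchanged) == snapshot_digest(before)
print("snapshot gate: pass")
```

The same idea scales to the full catalog: restrict the post-rollout state to the pre-existing aliases and require digest equality.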
## Coming next
Two follow-ups are tracked but not bundled here: qwen/qwen3.6-27b (the dense 27B variant, awaiting DashScope provider integration) and google/gemma-4-e4b (effective-4B, on-device target — provider hosting TBD). We'll surface these in a future model wave once a viable upstream lands.
Try the new aliases on dashboard.therouter.ai or via API. Feedback welcome at hello@therouter.ai.