Provider profile: MiniMax


MiniMax

Global · 7 models

Agent-native frontier models with 204K context and interleaved reasoning

MiniMax builds frontier AI models optimized for agentic workflows. M2.7 delivers strong reasoning, coding, and multi-tool orchestration with 204K context. Highspeed variants offer ~100 tokens/sec for latency-sensitive applications.

  • M2.7 flagship model with 204K context window and interleaved thinking chains for deep reasoning
  • Agent-native architecture — built for multi-tool orchestration, task decomposition, and long-horizon planning
  • Highspeed variants (~100 tokens/sec) for latency-sensitive coding and real-time agent applications
  • Full-stack multimodal platform — text, speech, video, music, and image generation from a single provider
  • Founded 2022, serving 236M+ individual users and 214K+ enterprise clients across 200+ countries
Reasoning · Coding · Tool use · Agents

Quickstart

from openai import OpenAI

# Point the standard OpenAI client at TheRouter's OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://api.therouter.ai/v1",
    api_key="YOUR_THEROUTER_KEY",
)

response = client.chat.completions.create(
    model="minimax/m2.7",
    messages=[{"role": "user", "content": "Build a React component for a data table with sorting and filtering"}],
    max_tokens=2048,
)
print(response.choices[0].message.content)

China users: replace api.therouter.ai with api.therouter.com.cn for lower latency.
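One way to handle the two endpoints is to select the base URL at startup. This is a minimal sketch; the `THEROUTER_REGION` variable name is illustrative, not an official setting:

```python
import os

# Map a region flag to the documented endpoints.
# "THEROUTER_REGION" is an assumed, illustrative variable name.
BASE_URLS = {
    "global": "https://api.therouter.ai/v1",
    "cn": "https://api.therouter.com.cn/v1",
}

# Default to the global endpoint when no region is configured.
base_url = BASE_URLS.get(os.environ.get("THEROUTER_REGION", "global"), BASE_URLS["global"])
print(base_url)
```

The same `base_url` value can then be passed to the `OpenAI(...)` constructor shown in the quickstart.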

Models

MiniMax M2
MiniMax M2 is a MoE model blending frontier-level intelligence with efficient active parameters. Engineered for AI agents with strong reasoning, coding, and multilingual performance. Ideal for general-purpose chat/coding, tool use, and high-throughput inference.

MiniMax M2.1
MiniMax M2.1 is an open-weight model focused on coding, tool use, and long-horizon task planning. Trained with emphasis on practical benchmarks covering front-end, backend, and workflow automation. General-purpose backbone for agent-based applications with reliable instruction following.

MiniMax M2.1 Highspeed
MiniMax M2.1 Highspeed variant with ~100 tokens/sec output speed. Same capabilities as M2.1 at 2x cost for latency-sensitive applications.

MiniMax M2.5
MiniMax M2.5 is an agent-native frontier model trained to reason efficiently, decompose tasks optimally, and complete complex workflows under real-world constraints. Combines high inference throughput with RL-focused token-efficient reasoning. Suited for full-stack software projects, research workflows, long-horizon planning, and multi-tool orchestration.

MiniMax M2.5 Highspeed
MiniMax M2.5 Highspeed variant with ~100 tokens/sec output speed. Same capabilities as M2.5 at 2x cost for latency-sensitive applications.

MiniMax M2.7
MiniMax M2.7 is a frontier reasoning model with interleaved thinking chains and multi-tool orchestration. 204K context with strong performance on agentic workflows, coding, and complex multi-step reasoning tasks.

MiniMax M2.7 Highspeed
MiniMax M2.7 Highspeed variant with ~100 tokens/sec output speed. Same capabilities as M2.7 at 2x cost for latency-sensitive applications.

Frequently Asked Questions

What is MiniMax and what makes it different?

MiniMax is a global AI company founded in 2022, serving 236 million users across 200+ countries. Unlike most LLM providers that focus solely on text, MiniMax offers a full-stack multimodal platform spanning text (M2 series), speech (Speech 2.8), video (Hailuo 2.3), music (Music 2.6), and image generation — all from a single provider. Their M2.7 model features interleaved thinking chains for agent-native reasoning and complex workflow automation.

Which MiniMax models are available on TheRouter?

TheRouter provides 7 MiniMax models: M2.7 (flagship), M2.5, M2.1, and M2, plus Highspeed variants of M2.7, M2.5, and M2.1. All models have 204K context windows and support function calling (tools). Highspeed variants deliver ~100 tokens/sec output speed at 2x cost, ideal for latency-sensitive applications.

What is the difference between standard and Highspeed MiniMax models?

Standard variants (e.g., minimax/m2.7) output at ~60 tokens/sec and are cost-optimized for batch and background tasks. Highspeed variants (e.g., minimax/m2.7-highspeed) deliver ~100 tokens/sec at 2x the price, designed for real-time coding assistants, interactive agents, and latency-sensitive UIs where speed matters more than cost.
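The trade-off can be sanity-checked with back-of-envelope arithmetic. The helper below is a sketch (the function name is my own) that estimates wall-clock generation time from the throughput figures quoted above, ignoring network latency and time-to-first-token:

```python
def generation_seconds(output_tokens: int, tokens_per_sec: float) -> float:
    """Rough wall-clock time to generate a completion, ignoring
    network latency and time-to-first-token."""
    return output_tokens / tokens_per_sec

# A 2048-token completion at each tier's quoted throughput:
standard = generation_seconds(2048, 60)    # standard, ~60 tokens/sec
highspeed = generation_seconds(2048, 100)  # Highspeed, ~100 tokens/sec
print(f"standard: {standard:.1f}s, highspeed: {highspeed:.1f}s")
```

At these rates a long completion finishes roughly 40% sooner on a Highspeed variant, which is what you are paying the 2x premium for.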

Why route MiniMax through TheRouter instead of calling directly?

TheRouter adds automatic failover (MiniMax direct API → AWS Bedrock → SiliconFlow), unified billing across all providers, usage analytics, spend controls, and team governance — all through the same OpenAI-compatible API. If MiniMax experiences downtime, your traffic seamlessly routes to backup providers.

Does MiniMax support function calling and streaming?

Yes. All MiniMax models on TheRouter support streaming (SSE), function calling via the standard tools parameter, response_format for JSON output, and stop sequences. The API is fully OpenAI-compatible — no code changes needed.
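As a sketch of combining these features, the snippet below builds a streaming, tool-enabled request using the standard OpenAI tools schema. The `get_weather` tool and the `THEROUTER_KEY` environment variable are illustrative assumptions, not part of the MiniMax or TheRouter API:

```python
import os

# A tool definition in the standard OpenAI "tools" schema.
# "get_weather" and its parameters are illustrative, not a real API.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def build_request(prompt: str) -> dict:
    """Kwargs for a streaming, tool-enabled chat completion."""
    return {
        "model": "minimax/m2.7",
        "messages": [{"role": "user", "content": prompt}],
        "tools": tools,
        "stream": True,
    }

# Only make a live call when a key is configured (assumed env var name).
if os.environ.get("THEROUTER_KEY"):
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.therouter.ai/v1",
        api_key=os.environ["THEROUTER_KEY"],
    )
    stream = client.chat.completions.create(**build_request("Weather in Oslo?"))
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)
```

With `stream=True` the client yields SSE chunks as they arrive; if the model decides to call the tool, the chunks carry `tool_calls` deltas instead of text content.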