

MiniMax M2.7 isn't just a smarter model; it's one that participated in its own creation.
MiniMax M2.7 is the latest flagship text model, purpose-built for real-world software engineering and complex production workloads. It stands out through its core architecture focused on recursive self-improvement and multi-agent collaboration, delivering exceptional performance in software engineering, debugging, log analysis, code generation, and long-form document creation.
Unlike previous models that excelled mainly at polyglot coding and multi-step reasoning in controlled benchmarks, M2.7 was specifically engineered for live production environments. It brings strong causal reasoning capabilities, the kind needed to understand, diagnose, and fix issues inside actual running systems, not just sandbox tests.
minimax/minimax-m2.7

Most benchmark comparisons tell you how a model performs on carefully curated academic tests. The interesting thing about M2.7's numbers is where they come from: production-grade scaffolds, terminal-based engineering challenges, and real document-editing workflows.
Understanding where M2.7 excels, and where it trades off, makes a real difference in whether it's the right model for a given workflow. Its builders made deliberate design choices, optimizing agentic performance even at a small cost to precision in narrow domains like specialized medicine and finance.
Live debugging, root cause analysis, log reading, code security review, and multi-file refactors. SRE teams have documented reductions in production incident recovery time to under three minutes.
Plans, executes, and refines tasks across dynamic environments through multi-agent collaboration. Can orchestrate sub-agents with distinct roles and communication protocols within a single harness.
End-to-end creation and editing of Word, Excel, and PowerPoint files. Achieves 97% skill adherence on complex multi-round office tasks — the highest GDPval-AA ELO score among open-source-accessible models.
Handles structured financial workflows including multi-step spreadsheet logic, data aggregation pipelines, and report generation across financial datasets in production environments.
204,800-token context window with full automatic cache support, no manual configuration needed. Prompt caching is built-in, which has meaningful cost implications for repeated or system-prompt-heavy workflows.
The M2.7-highspeed variant delivers identical output quality at approximately 100 TPS, roughly 3x faster than the base variant, for latency-sensitive applications and high-throughput inference pipelines.
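Because caching is automatic, the main lever left on the client side is request shape: keep the long, stable content identical and first in every call so repeated requests share a cacheable prefix. A minimal sketch, assuming an OpenAI-compatible chat payload; the system prompt and helper function here are illustrative, not from the M2.7 documentation:

```python
import json

# Illustrative stable prefix: in practice this would be a multi-thousand-token
# runbook or policy document that never changes between requests.
STABLE_SYSTEM_PROMPT = (
    "You are an SRE assistant. Follow the incident runbook when diagnosing "
    "production issues."
)

def build_request(user_message: str) -> dict:
    """Build a chat payload with the cache-friendly prefix placed first."""
    return {
        "model": "minimax/minimax-m2.7",
        "messages": [
            # Identical on every call, so an automatic cache can reuse it.
            {"role": "system", "content": STABLE_SYSTEM_PROMPT},
            # Only this part varies between requests.
            {"role": "user", "content": user_message},
        ],
    }

payload = build_request("p99 latency doubled after the 14:02 deploy. Where do I start?")
print(json.dumps(payload, indent=2))
```

The design point is simply that anything variable (user question, retrieved context) goes after the stable block, never interleaved with it.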
M2.7 is not a drop-in replacement for every use case. Where it competes on coding and agent tasks, it's genuinely at the frontier tier. Where it falls short is general knowledge depth and some specialized vertical domains where Claude Opus 4.6 and GPT-5 still have an edge.
The model's design choices (heavy agentic tuning, long context, tool-calling precision, and low per-token cost) point toward a specific kind of user.
// 01 DevOps & SRE Teams
If you're building incident response agents that correlate monitoring metrics with code repositories, M2.7's sub-three-minute production recovery documentation makes it worth evaluating against heavier, pricier options.

// 02 ML Research Infrastructure
The self-evolution loop was designed for RL research workflows. Teams running experiment pipelines who want an AI that can monitor, debug, and optimize its own scaffolds will find M2.7 purpose-built for this.

// 03 Document Automation Pipelines
Organizations generating large volumes of Word, Excel, and PowerPoint output (financial reports, legal documents, data summaries) benefit from M2.7's top-ranked office task ELO without the overhead of closed-source pricing.

// 04 Startups Replacing Frontier API Costs
If your product runs coding, document processing, or agentic tasks on Claude Opus 4.6 or GPT-5, M2.7 is the first realistic alternative where the cost-to-performance ratio justifies a migration evaluation.

// 05 High-Throughput Research Systems
With 100 TPS on the highspeed variant, workloads that need fast parallel inference — large-scale data processing, evaluation pipelines, multi-agent simulations — run materially faster and cheaper than most alternatives.

// 06 Agent Framework Developers
M2.7 was designed as a drop-in backend for harnesses like Claude Code, Kilo Code, and OpenClaw. Its 75.8% tool-calling accuracy means fewer brittle tool invocations and more reliable multi-step chains in production.
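For harness builders, tool-calling reliability shows up first in how tools are declared. A minimal sketch of a tool definition in the JSON-schema shape common to OpenAI-compatible harnesses; the `grep_logs` tool and its parameters are invented for illustration, not taken from any M2.7 documentation:

```python
def make_tool(name: str, description: str, params: dict, required: list) -> dict:
    """Wrap a parameter schema in the common function-calling envelope."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": params,
                "required": required,
            },
        },
    }

# Hypothetical tool an incident-response agent might expose.
grep_logs = make_tool(
    name="grep_logs",
    description="Search service logs for a pattern within a time window.",
    params={
        "pattern": {"type": "string", "description": "Regex to search for."},
        "since_minutes": {"type": "integer", "description": "Lookback window."},
    },
    required=["pattern"],
)
```

Precise descriptions and a tight `required` list are what let a model's tool-calling accuracy translate into fewer malformed invocations.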
Benchmarks give you numbers. These documented examples give you a better sense of what M2.7 actually does when given a production problem with no hand-holding:
M2.7 was given a brief to build a six-player "Who Am I?" party game: a lead agent and five players, each with unique roles and behavioral constraints. Without any human intervention, the model wrote the server-side game logic and the client-facing web page, configured inter-agent communication, and successfully ran the game from start to finish. The entire codebase was produced in a single agentic session.
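The lead-agent/player structure described above can be sketched in a few lines. Everything here is a scripted stand-in (in the real run each player would be a role-constrained model call), but it shows the routing pattern a single harness has to support:

```python
from dataclasses import dataclass, field

@dataclass
class Player:
    name: str
    role: str                       # behavioral constraint, e.g. "yes/no answers only"
    inbox: list = field(default_factory=list)

    def answer(self, question: str) -> str:
        # Stand-in for a role-constrained model call.
        self.inbox.append(question)
        return f"{self.name} ({self.role}): yes"

class LeadAgent:
    """Routes each question to every player and collects the replies."""
    def __init__(self, players):
        self.players = players
        self.transcript = []

    def run_round(self, question: str) -> list:
        for p in self.players:
            self.transcript.append(p.answer(question))
        return self.transcript

players = [Player(f"player{i}", role="yes/no only") for i in range(5)]
lead = LeadAgent(players)
replies = lead.run_round("Am I a historical figure?")
print(len(replies))  # one reply per player
```

The communication protocol in the real game is richer (turn order, elimination, win conditions), but the core loop is the same fan-out/collect cycle.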
Given logs and a database configuration from a degraded production system, M2.7 correctly identified the root cause of a performance drop and proposed a fix using PostgreSQL's CONCURRENTLY syntax, a detail that matters specifically because a standard index build blocks writes to the table for its entire duration. The model understood the non-blocking requirement without being explicitly told, which is the kind of contextual judgment that separates adequate from production-ready reasoning.
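The locking distinction is concrete: a plain CREATE INDEX holds a lock that blocks writes for the whole build, while CREATE INDEX CONCURRENTLY avoids that at the cost of a slower, multi-phase build. A sketch with made-up table and index names:

```python
# Illustrative DDL only; table and index names are invented.

# Plain CREATE INDEX takes a SHARE lock: reads proceed, but INSERT/UPDATE/
# DELETE on the table block until the build finishes.
blocking = "CREATE INDEX idx_orders_created_at ON orders (created_at);"

# CONCURRENTLY builds without blocking writes. Caveats: it cannot run inside
# a transaction block, takes longer (two table scans plus a wait for open
# transactions), and leaves an INVALID index behind if it fails, which must
# be dropped and retried.
non_blocking = "CREATE INDEX CONCURRENTLY idx_orders_created_at ON orders (created_at);"

print(non_blocking)
```

On a live system serving writes, the second form is the only safe default, which is exactly the judgment call the model made.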
Across three 24-hour autonomous evolution trials, M2.7 participated in a Kaggle-style ML competition without human guidance. It built training pipelines, monitored results, and iterated on modeling decisions independently. The best single run produced 9 gold medals, 5 silver, and 1 bronze, placing M2.7 at a 66.6% average medal rate, narrowly behind Opus 4.6 (75.7%) and GPT-5.4 (71.2%), with no human researcher in the loop.
M2.7 is one of the most compelling API models released in early 2026, but it's not perfect for every team. Here's what the data and documentation are honest about.