A Conversation with MiniMax's Junjie Yan: M3, Project 10X, the 10T Model, and the Endgame of Intelligence

June 21, 2026

🚥 This week's Crossing was recorded live at the MiniMax Dev Meetup, where Koji sat down with MiniMax CEO Junjie Yan (IO), Jiayuan Zhang of Multica, He Tao of DeerFlow, and Yu Yang, AI lead at a listed financial company.

We explored the next phase of AI coding, agents, and production engineering. The highlight is IO's talk, one of his rare recent public appearances, which covers:

Key breakthroughs and assessment of M3

Conviction and roadmap for training a 10T-scale model

The real magnitude of the China-US model gap

A shifted philosophy on data, and why they launched the 10X expert collaboration program

AI coding at an inflection point: engineering systems versus throwaway code

The actual relationship between "base models vs. agents"

How MiniMax is placing its next bets and making trade-offs

The other three guests brought compelling perspectives from their own angles: Jiayuan Zhang on combining multiple models/agents to balance cost and quality; He Tao on how engineering's core is long-term maintenance and delivery systems, not one-shot task completion; and Yu Yang on how AI's value in verticals like finance lies more in "turning information into executable decision paths," as well as lowering barriers, providing assistance, and offering companionship amid rapid change.

This episode feels more like a time capsule from mid-2026: representatives from a major model company, developers, open-source projects, and vertical applications all on one stage, sharing real-time observations, stories, and insights — making it worth revisiting again and again.

🎬 Our video podcast is now live on Koji Yang Yuancheng's Weixin Channels, Douyin, Xiaohongshu, Bilibili, YouTube, and other platforms.

📒 The transcript has been published on the Crossing WeChat official account.

🟢 01:41 Key breakthroughs and assessment of M3

When measuring whether a generation of models actually delivers, IO watches for "a relatively objective metric" — not benchmark scores. What is it?

Over last year's May Day holiday, the team first got something running on M1: a moment somewhat like "slumping back in the chair, chills down the spine."

M2.5 was originally expected to burn through 1 trillion tokens per day; it ended up hitting 10 trillion, 10x the target

🟢 12:09 Base models vs. agents: what's the actual relationship?

"At this time last year, I couldn't have imagined what today's models would look like"

Without Claude Code, a certain model might never have taken off; without GPT-5.5, Codex wouldn't have either.

Models keep getting stronger, yet IO holds firm to one "premise."

🟢 14:12 Conviction to train a 10T model

A 10T model must be trained successfully. What's the biggest bottleneck?

"AI is already a massive industry, just like semiconductors."

Why can this only be done "generation by generation," with no rushing?

Once extrapolation fails, the model becomes a blind box.

🟢 15:54 China-US gap: 10x means exactly two generations

US models are "basically 10x larger," and 10x means exactly two generations.

Every domestic player needs to nail 3T first, then 10T — but a 10T model requires 200T of data, and "there isn't that much data in the entire world."

On one hand, "we're improving the fastest"; on the other, "fairly anxious" — why?

🟢 17:53 AI coding inflection point: engineering, or "disposable code"

"Nobody ever said vibe engineering, but writing code has always been engineering."

Once everyone can vibe code, everyone becomes a "product manager."

He Tao's hot take: the most annoying thing is hearing "the agent did it, don't blame me." You submitted it under your account, so whose responsibility is it really?

One person modifies a dozen repos and submits a massive PR that "looks correct, but nobody dares to deploy it." Where's the problem?

🟢 27:12 Data philosophy shift: they're now hiring nuclear physicists

A year ago, data meant labeling; now MiniMax is recruiting economists, philosophers, even nuclear physicists.

While working on coding, they discovered that development engineers understood "what makes good code" better than the algorithm team. What conclusion did this lead to?

Why does Anthropic employ nuclear physicists?

What gap is MiniMax trying to close by launching its 10X expert collaboration program?

🟢 30:30 MiniMax's next phase: what to bet on?

AI is a black box; even the people building models don't fully understand it

What IO cares about most is when we can "use AI to help humans understand AI."

The hippocampus in the brain turns out to bear a striking resemblance to a certain mechanism in model training.

A year ago we didn't grasp why "alignment" mattered; now we're increasingly certain — why?

Subscribe to Crossing: 🚦 We track the industry shifts and entrepreneurial opportunities brought by the new wave of AI technology.

🚦 Crossing is Steve Jobs's metaphor for Apple: standing at the intersection of technology and liberal arts, where great products are born. AI is transforming every industry. We seek out, interview, and bring together a new generation of AI founders and doers of the AI era. Together with them, we explore and embrace the changes and possibilities ahead.

👦🏻 Host Koji: I founded Crossing, launched AI Hacker House — a community space for the new generation of AI entrepreneurs — and serve as Venture Partner at ZhenFund. I believe technology, especially AI, represents the greatest value-creation opportunity of our generation. Koji on Jike, Koji's website