Yunqi Capital Doers | MiniMax's Junjie Yan and Yonghao Luo's Four-Hour Deep Dive on "Hard but Right"

Yunqi Capital · December 15, 2025

Before consensus emerges, choose the hardest path.

In Yunqi Capital's investment history, there are projects that began with a connection formed before there was a product, before there was consensus, before there was even a real company. MiniMax was one of them.

In the spring of 2021, Yunqi's investment team met Junjie Yan, who had not yet officially founded MiniMax. There was no "foundation model startup wave" at the time; multimodality and AGI were far from industry buzzwords. But Yan was already working through a question in his mind: if foundation models proved viable, could AI become a truly general-purpose, scalable product?

A few months later, Yunqi participated in MiniMax's angel round, becoming one of the company's earliest institutional investors.

More than four years have passed. The industry environment, technical trajectory, and competitive landscape have all changed dramatically. MiniMax has evolved from a fledgling team into an AI unicorn firmly committed to multimodality, making steady progress and achieving standout results across language, audio, and video. Looking back, this path was never the "easy" one — at many stages, it wasn't even understood — but it always pointed toward the same conviction: do the hard, right thing.

Recently, Junjie Yan appeared on Yonghao Luo's Crossroads, a four-hour deep conversation in which he systematically reflected on the pivotal choices that took him from research to industry to entrepreneurship, and on his evolving thinking about technical direction, talent, and organizational form.

This conversation reminds us once again: what truly matters is never chasing trends, but the ability to keep validating your direction over time. Below are excerpts from the interview.

Originally published by Tencent Technology

Original title: "MiniMax's Junjie Yan: China Has No Shortage of AI Talent, But We Need to Do the Hard, Right Thing"

Recently, Junjie Yan, founder and CEO of MiniMax (Xiyu Technology), appeared on Yonghao Luo's Crossroads for a four-hour interview.

During the program, Yan and Luo covered considerable ground — from why MiniMax bet on multimodality, to the company's positioning and value, to core competitiveness in the AI era and talent differentiation. Through the dual lenses of observer and practitioner, they dissected current trends in AI and where the industry is headed.

Born in 1989, Yan holds a PhD from the Chinese Academy of Sciences. He interned at Baidu's AI Research Institute, where he caught an early glimpse of AI's future form in the lab. He later joined SenseTime, leading a team of more than 700 to crack facial recognition algorithms and reaching the top of the industry. When AIGC technology began to explode, he resigned as SenseTime's VP to start MiniMax. Public data now puts the company's valuation above $4 billion.

According to global benchmark Artificial Analysis, MiniMax-M2 ranks fifth worldwide among large language models with a total score of 61, and first among open-source models — placing it in the top tier globally.

MiniMax-M2 is a lightweight model with 10 billion active parameters (230 billion total), built on an efficient MoE (Mixture of Experts) architecture that achieves low computational cost — just 8% of Claude 4.5's — with significant price advantages. In just over a month since launch, token usage has surpassed 1 trillion.
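As a rough illustration of what those two parameter counts imply, the sketch below computes the share of the model's parameters that is active for a single token. It uses only the 10 billion and 230 billion figures quoted above; the function and variable names are hypothetical, and the result says nothing about the 8% cost comparison, which depends on pricing and hardware details not given here.

```python
# Figures quoted in the article for MiniMax-M2 (not independently verified).
TOTAL_PARAMS = 230e9   # total parameters across all experts
ACTIVE_PARAMS = 10e9   # parameters active for any single token


def active_fraction(active: float, total: float) -> float:
    """Return the share of a sparse MoE model's parameters used per token."""
    if total <= 0 or not 0 <= active <= total:
        raise ValueError("expected 0 <= active <= total and total > 0")
    return active / total


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"Active share per token: {fraction:.1%}")  # prints "Active share per token: 4.3%"
```

In other words, under these figures each token touches a little over 4% of the model's weights, which is the basic reason a Mixture of Experts design can be cheaper to run than a dense model of the same total size.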

Notably, MiniMax has been a multimodal AI unicorn from day one. It now leads globally in language models, video, and audio (ranking first in audio, second in video), while also incubating AI-native applications like Talkie and Hailuo that have gained traction at home and abroad — making it a distinctive presence in the foundation model space.

As competition has intensified, AI foundation models have moved beyond pure compute races. Dimensionality, scenario versatility, hardware synergy, and deployment costs have all become critical factors. Faced with this shift, many foundation model startups have pivoted or shut down. MiniMax, however, has stayed the course on multimodality — a direction Yan and his team set from the company's founding.

Yan believes true AGI must involve multimodal input and output. The challenge is simply that it's extremely difficult to achieve, so the approach is to master each modality first, then integrate at the right moment.

"We've basically worked through each modality now. In the coming months we'll explore multimodal fusion. I think we're a relatively rare company globally, and one that has a real shot at making AGI happen." Yan sounded confident. Doing the "hard, right" thing is MiniMax's philosophy for survival.

Key takeaways from this interview:

Multimodal development isn't about "lack of focus" — it's a necessary path to AGI. MiniMax now has the foundational conditions in place and will explore full multimodal fusion.
In the foundation model era, the real product is the model itself, transforming creativity and productivity. As models evolve, future application forms will serve everyone, realizing MiniMax's vision of "Intelligence with Everyone."
Technology-driven development is one of MiniMax's core competitive strengths. Building on this, the company focuses on internationalization and direct customer service, achieving commercialization early — which has made fundraising smooth.
For Chinese AI companies, copy-and-paste won't work; homegrown innovation is essential.
In the AI era, the core competitive advantage is imagination. At future inflection points in AI competition, the challenge for Chinese companies is how to leverage their environment and talent advantages to achieve technical and narrative leadership.

Foreseeing Scaling Law:

The pivotal moment that drove Yan to start a company

Yonghao Luo: You said you were going to be a teacher. How did you end up starting a company?

Junjie Yan: I majored in math at Southeast University for undergrad, then did a combined master's-PhD at the Chinese Academy of Sciences' Institute of Automation. Back then it was mostly writing papers and doing research, and I felt the field didn't have great job prospects — probably end up as a teacher or something.

The turning point was 2014, when deep learning emerged. I was interning at Baidu Research, which had the largest GPU cluster in China, an excellent environment. I probably used about a third of their GPUs for experiments, and that made me realize AI could actually create real value.

Yonghao Luo: So you actually saw things ahead of time — the public only saw the breakthrough with GPT-3.5.

Junjie Yan: Honestly I think it's a bit regrettable for China. The CEO of Anthropic was actually at Baidu working on speech at that time, and had already discovered Scaling Law — the core principle underlying large models. That happened at a Chinese company. [That China didn't produce something like GPT-3.5] is unfortunate.

Yonghao Luo: Later during your postdoc you interned at SenseTime, and rose to VP and CTO pretty quickly. How did you pull that off?

Junjie Yan: I think mainly by taking on important responsibilities. In about a year and a half, I took facial recognition technology to number one in the industry.

Through that process, I realized two things. First, you have to make trade-offs — focus on longer-term, transformative technologies rather than patching things up. Second, I was reminded again of Scaling Law's importance. Previously each model could only do one thing, fit one scenario. But constrained by resources and manpower, we chose to build a relatively all-purpose model to handle everything. The results were excellent, and this became one of the drivers for my later entrepreneurship.

Multimodal development is the only path to AGI;

international leadership potential stands out even more

Yonghao Luo: You released a new round of models in October. "Brag" a bit — I mean, the real results.

Junjie Yan: (Laughs) We reached international leadership in audio roughly two years ago, and in video about a year ago, but language models hadn't yet reached international leadership. This time the language model has gained considerable international recognition, with many developers using it for agents and coding — I think that's very significant progress.

Objectively speaking, we're almost the only Chinese company that has achieved international leadership in language models, video, and audio.

Yonghao Luo: Including music, right? Also in the top tier now.

Junjie Yan: Yes, one of the top three. But actually this is very difficult [doing full multimodality], and outsiders find it strange — a startup doing everything, isn't that "unfocused"?

Yonghao Luo: Do a lot of people ask that, about being unfocused?

Junjie Yan: Yes. Actually we thought it through clearly from the start: true AGI must involve multimodal input and output; it's just extremely difficult to implement. Our thinking was to master each modality first, then integrate at the right time. OpenAI's Sora 2, for example, is typical multimodal fusion and has been enormously successful.

I think we now basically have the foundation — each modality already has data and usage scenarios. I hope to integrate them into the next-generation model in the coming months. Globally, I think we're among the few companies with a real shot at making this happen.

Yonghao Luo: But compared to foreign competitors, you're spending who knows how many times less money.

Junjie Yan: Yes, and this relates to inherent Chinese advantages. I think the core point is that in the US, there are really only four companies that can build large models — Google, OpenAI, Anthropic, and xAI. These companies may be valued at 100 times Chinese companies, but they're only 5% ahead technically, while spending maybe 50 to 100 times more.

That said, why can Chinese companies spend one-fiftieth of their money and produce results with maybe only a 5% gap — and I think that gap has actually been narrowing over the past year? The core reason, I believe, is that Chinese talent is excellent.

Chinese soil will cultivate

top-tier homegrown AI-native talent

Junjie Yan: China has many excellent young people right now. Companies here, ours included, still have mechanisms for bringing excellent young people together, and that is critical.

Of course, I've been thinking that if a few technical prodigies emerge from China's young people, it could be a breakthrough point for China's AI sector. I think that might happen within two or three years.

Yonghao Luo: Yes, people like that will definitely appear.

Junjie Yan: And such talent will definitely be people who grew up in China, not returnees from the US.

Yonghao Luo: Most of your talent is homegrown? No overseas returnees?

Junjie Yan: We have returnees, but the people who truly play key roles in the company are basically homegrown; this is their first job.

Yonghao Luo: That's quite remarkable. By my understanding, in AI we're learners and followers, yet we can manage without relying on overseas Chinese — Wenfeng Liang's DeepSeek doesn't have a single returnee. What's changed in recent years?

Junjie Yan: I think AI isn't mysticism; it can be deconstructed with first principles. So if you do each technical part well and keep accumulating — algorithm design, data pipeline construction, training efficiency optimization, etc. — you'll see good results. That's the first point.

Second, objectively speaking, China's compute gap with the US is relatively large. This forces Chinese companies to innovate to achieve what American companies achieve. On broad principles everyone may agree, but Chinese technology has innovations in different modules — it's not quite the same [as the American approach].

Poaching at astronomical salaries vs. homegrown innovation:

China and America's AI paths diverge

Yonghao Luo: A while back Meta's Mark Zuckerberg was poaching people with salaries of hundreds of millions of dollars, and everyone thought that was high. But if you factor in these people's experience at OpenAI, the value for money is quite good — though the precondition is Meta still has to provide sufficient resources and funding for their research. So this "burn money" approach isn't very practical for us?

Junjie Yan: Yes, I think that's a good perspective. Burning money definitely helps, but burning money alone isn't enough. For Chinese AI companies, it definitely can't be copy-and-paste; homegrown innovation is needed.

Yonghao Luo: Another trend: in the past Silicon Valley companies might have been 30% Chinese, 30% Indian; but now Chinese representation in Silicon Valley AI companies is far higher than Indian. Why is that?

Junjie Yan: Actually in AI it's been this way for a while. About a decade or so ago, AI's predecessor was computer vision, and Chinese representation was already high there. In the foundation model era, it's even higher. One reason is that Chinese people are quite smart, with strong math and programming abilities. Plus it requires sustained effort. People with all three characteristics are indeed often Chinese.

Yonghao Luo: When you were just starting out, what attracted the first wave of people willing to follow you — was it papers you'd published and academic achievements, or high salaries?

Junjie Yan: I think for them, the first factor was definitely belief — not belief in me, but belief in AGI itself. For excellent talent, money certainly has to be right, but more important than money is passion; money can only rank second.

Objectively speaking, China doesn't yet have a top-tier genius on the level of Ilya [Sutskever, OpenAI's former chief scientist and co-founder]. China has people with that potential, but they haven't become that yet. Whether they can become that in the right environment — I think that's what matters more to these talents.

In the foundation model era, the real product is the model itself

Yonghao Luo: By my understanding, companies that haven't bet on multimodality like you have — is it because they don't have enough resources?

Junjie Yan: I don't think it's entirely that. Probably more that everyone's underlying starting point is different.

Yonghao Luo: Does it relate to your monetization ability? Pure research companies like yours that can also build good consumer products are rare.

Junjie Yan: It's really about doing the technology well, making the industry need you, and commercial capability follows naturally. The essential logic here is "the AI foundation model itself is the product, and what gets built from the model — the applications — are actually channels."

Application forms derived from the model, whether for enterprise or consumer markets, can realize their value and gain market recognition. Take video models: globally, tens of millions of two-level videos are produced daily now, with high value in creative fields, something unachievable just six months ago.

Yonghao Luo: This raises a question: if the model itself is the product, then doesn't the strongest model win everything? If so, isn't the whole industry disrupted — like certain departments internally just get eliminated?

Junjie Yan: Actually I've been thinking about this for years. First, I think every function has value.

Yonghao Luo: Sounds like consolation...

Junjie Yan: It's not. A core reason right now is that no single model is best at everything; they each have different characteristics. Some language models have very smooth conversation, some are better at coding, and so on.

Yonghao Luo: Didn't you say they'll merge later, and after merging the strongest one wins everything?

Junjie Yan: Even before merging, why are so many companies' models still being used, with everyone's user base growing? The core reason is economic. It's like cars: many brands split the market. Different companies have different commercial attributes, so there are differences.

So actually everyone has their own space to survive, and the stronger the model capability, the greater the commercial value — the whole market keeps expanding.

AI is not an extension of the internet

but a new form of productivity

Yonghao Luo: With foundation models getting stronger, everyone says many positions will be eliminated. I think the future for product managers might be heaven for some and hell for others. Give our product managers a diagnosis: where's the way out?

Junjie Yan: I suspect a new form will emerge. From my observation, one obvious change is that product managers can now use AI to write code and directly produce demos rather than PRDs [product requirement documents]. Engineers with good ideas can also create very creative products.

AI's emergence has largely lowered the barriers to both creativity and productivity simultaneously. Essentially, core competitiveness shifts from skills to "imagination" — no longer constrained by who can only write code, do product, or do algorithms.

Yonghao Luo: So you mean in the future even a very small company can accomplish big things, and the founder might be from a product manager background or an engineer background?

Junjie Yan: Yes, AI as a tool dramatically lowers the barrier to productivity, so essentially core competitiveness becomes "imagination."

Theoretically, whoever has the best imagination, whoever persists, and whose product delivers greater societal value — that's who succeeds.

My current understanding is that the AI industry shouldn't be a continuation of the internet. From this perspective, there's no need to fixate on the division of labor from the mobile internet era. Actually we ourselves feel this too — when a successful model turns out smarter than me, it feels a bit scary (laughs).

Talent philosophy and organizational innovation

at AI-native companies

Yonghao Luo: It's been almost a year since DeepSeek blew up around New Year. Any major adjustments in your work during this period?

Junjie Yan: Yes, actually a very big adjustment, because companies like ours must improve every year — if you don't progress, you might die. DeepSeek made us realize that algorithms and infrastructure must be integrated, meaning everyone has to optimize for one unified goal rather than working separately. This actually caused very big organizational changes for us.

Yonghao Luo: Such a big adjustment? Do you use OKRs [Objectives and Key Results] internally?

Junjie Yan: Yes, that's a typical example — we tried OKRs and found they simply didn't work. That's also why I say the AI industry isn't a natural extension of the internet industry; it's a new industry.

Yonghao Luo: KPIs would work even less well, right?

Junjie Yan: They don't work, so we have no organizational tools.

Yonghao Luo: Then how do you manage when the company reaches thousands of people?

Junjie Yan: I think AI-native companies' organizations resemble large models in some ways — different profiles of talent correspond to "modalities," capable of permuting and combining based on underlying principles to produce richer, more diverse outputs.

Of course there are underlying principles for talent selection: first, smart enough to have insight, self-learn, and proactively discover; second, passion, which can't be faked; third, this field doesn't allow individual heroism — it requires collaboration, requires good teamwork ability.

On this foundation, I think a person's responsibilities at different stages shouldn't follow internet-era definitions. Instead, through organizational innovation, we should deploy people more flexibly and form different combinations at different stages. If we hadn't done some organizational innovation, many models simply couldn't have been built.

Yonghao Luo: Has the company encountered particularly difficult times? How do you boost morale?

Junjie Yan: When morale is low, you usually need to use first principles to break things down and get everyone to accept that something is feasible. Second, pay people more, so that effort is rewarded. I think the underlying point is that everyone shares hope: even after setbacks, we can still make better things.

Three principles

driving MiniMax's continuous progress

Yonghao Luo: I think one thing outsiders don't understand about your company is that you're clearly a technology-driven company, so how did you make several successful consumer products?

Junjie Yan: First, we have excellent product managers who understand technology deeply and have strong product capabilities. Also, from the company's founding we established three principles: technology-driven, direct customer service, and internationalization — and we still adhere to them today.

Technology-driven is actually the hardest for us. There are two routes: empowering the business or products through model capability, or directly copying proven mobile internet success stories. Both may be correct, but they can't coexist. We later realized that the technology-driven route is the one that suits us.

Yonghao Luo: So your fundraising has been very smooth — last round valued above $4 billion, right?

Junjie Yan: Two logics here: first, technical leadership; second, the revenue inflection that technical leadership enables. From a company operations perspective, we've been optimizing for these two goals, and each time we've achieved them, scale and magnitude have grown substantially. So our serious commitments to shareholders and employees have largely been fulfilled, making subsequent fundraising naturally smoother.

Yonghao Luo: Last question. At this stage the company has reached, what's your ultimate vision?

Junjie Yan: I think more about two questions. First, when will the AGI industry truly bring changes to GDP? Second, the industry is still led by American companies; at critical junctures, can Chinese companies surpass them?

I think China already has the environment for it, but China's AI talent needs to grow more confidently. I hope to see changes within three years.