Yunqi Capital in Conversation with MiniMax's Junjie Yan: Looking Back at 2024, What Changed and What Stayed the Same in AI Entrepreneurship | Yunqi Doers

Yunqi Capital · January 6, 2025

On Original Innovation Intent, Technical Roadmap, and Product Portfolio

AI, embodied intelligence, going global... in 2024's tech and venture capital scene, few could avoid crossing paths with these terms. As the year drew to a close, the "Yunqi Doers" column launched a year-end special series, where we sat down with Yunqi Capital's portfolio founders working at the forefront of these sectors. We discussed the shifts and opportunities in these fields over the past year, hoping to bring you firsthand perspectives and insights from entrepreneurs on the ground.

Becoming a unicorn in just two years, surpassing 3 billion daily interactions on its large model products, and ranking in the global top 10 for monthly active users of AI applications — from the outside, MiniMax's trajectory since late 2021 has seemed like an unexpected "turbocharge," much like the AI 2.0 wave unleashed by GPT. But for MiniMax itself, achieving AGI was a north star set long before ChatGPT captured public imagination; everything since has simply followed from that.

In March 2021, Yunqi Capital's investment team met Junjie Yan, who had not yet formally founded MiniMax. Several months later, they invested in MiniMax's angel round, becoming the company's earliest angel investor. Over three years have passed since, and generative AI technology and applications have evolved at breakneck speed, with the AI startup landscape entering a new round of reshuffling. At this juncture, how does Yan reflect on MiniMax's original entrepreneurial vision, and its technology and product choices? From his vantage point as a founder on the front lines, how does he view the breakthroughs of 2024 and the competitive advantages that come with them?

In the winter of 2024, Yunqi Capital founding partner Chengyu Mao and Junjie Yan had an in-depth conversation. The first installment of "Yunqi Doers · Year-End Series" shares highlights from their exchange.

*AI continues to evolve rapidly; given the time gap between this conversation and the present, the views expressed are for reference only.

01 On AGI and Entrepreneurial Vision

Chengyu Mao: I remember we first met on September 30, 2021. Chen Yu met you first and was incredibly excited, urging our other two partners to meet with you as soon as possible. We arranged to meet near Lujiazui along the Huangpu River — and now three years have flown by.

Junjie Yan: Chen Yu was the first investor I ever met; I connected with him back in March that year. At the time, I hadn't yet decided for certain to start a company. I just sensed that artificial general intelligence was the right technological trend. The term "large model" didn't even exist yet; I had simply noticed that in the preceding months, some promising signs of foundation models had emerged in the US. My thinking was that if we could build foundation models, they would be general-purpose, and with generality there would be no need for customization: they could serve many people and become highly standardized products. That would make a great deal of sense in China.

Chengyu Mao: By 2024, the industry had developed many different views on AGI and large models. In October, there was a flurry of statements from industry "heavyweights." Sam Altman was fairly optimistic, saying AGI would arrive in a few more years. The CEO of DeepMind also believed AI could do anything within the next decade. But I also saw Yann LeCun's view, which essentially told people not to research large language models. What's your brief take on these perspectives?

Junjie Yan: First, I still believe that whether a product is consumer-facing (to C) or business-facing (to B), it fundamentally depends on the model's underlying capabilities. On model capabilities, I think several metrics matter:

First is error rate. I consider this the most basic metric affecting model experience. For instance, why can large models handle Q&A reasonably well most of the time, yet often fail in professional scenarios or when acting as agents? Fundamentally, it's because the error rate is too high. Solving complex problems requires multi-step reasoning, and across multiple steps, errors compound multiplicatively. So a core challenge is how to drive error rates down by orders of magnitude.

Second is unlimited-length input and output. Whether a person lives to 100 or 60, life is a continuous journey. The total information one processes in a lifetime is probably on the order of a billion tokens. What will I do tomorrow? It's largely determined by my past and present self, which is essentially a billion-token-scale context.

Third is multimodality. Consider what we do on our phones. Most interactions are multimodal — images, video, and yes, text. But text actually doesn't dominate for most people. If we believe AI will become increasingly mass-market, it should take a multimodal interaction form.
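The point about error rates compounding over multi-step reasoning can be illustrated with a little arithmetic. The sketch below is only an illustration: it assumes each step fails independently with the same probability, which real models do not strictly satisfy, and the function name `chain_success_rate` is hypothetical. Under that assumption, a 5% per-step error rate leaves a 10-step chain fully correct only about 60% of the time, while cutting the per-step error rate tenfold to 0.5% raises that to about 95%.

```python
def chain_success_rate(step_error_rate: float, num_steps: int) -> float:
    """Probability that every step of a chain is correct.

    Assumes (for illustration only) that steps fail independently
    and share the same per-step error rate.
    """
    if not 0.0 <= step_error_rate <= 1.0:
        raise ValueError("step_error_rate must be between 0 and 1")
    if num_steps < 0:
        raise ValueError("num_steps must be non-negative")
    # Each step succeeds with probability (1 - error); the chain succeeds
    # only if all steps succeed, so the per-step success rates multiply.
    return (1.0 - step_error_rate) ** num_steps


if __name__ == "__main__":
    for error in (0.05, 0.005):
        for steps in (1, 10, 50):
            rate = chain_success_rate(error, steps)
            print(f"per-step error {error:.3f}, {steps:2d} steps -> "
                  f"chain success {rate:.3f}")
```

This is why an order-of-magnitude reduction in per-step error matters far more for agents and long reasoning chains than it does for single-turn Q&A.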

Chengyu Mao: MiniMax was probably among the earliest in this AI wave to both build large models and develop products, while simultaneously pursuing both overseas and domestic markets, and both productivity tools and AI content platforms. In my 25 years of investment experience, I've rarely seen a startup with limited short-term resources and energy spread across so many directions. What underlying framework drove this choice?

Junjie Yan: Two companies stood out for their approach during the mobile internet era. One was Meituan, with its super-app strategy — starting with group buying, then food delivery. Once food delivery was established, more lifestyle services were folded into the app. The premise was that food delivery was a high-frequency entry point, into which lower-frequency services could be embedded. That was Meituan's playbook.

The other was ByteDance's approach — everything was recommendation-driven, but different content categories mapped to different products, one product per app, with each product an order of magnitude larger than the last. I believe we're following ByteDance's playbook.

Chengyu Mao: You had this in mind from the start?

Junjie Yan: Yes. We made this choice because we believe technology is changing rapidly, making it obviously difficult to build something for five years from now. What we should do instead is build what we consider the right product for today's technology. The users, traffic, talent, and technology we accumulate will then support building larger products.

02 On China's AI Breakthroughs and Competitive Advantages

Chengyu Mao: In your view, what was the single biggest breakthrough in global AI in 2024? What surprised you?

Junjie Yan: The first advance, I believe, was multimodality. Around this time in 2023, cinematic-quality AI video generation was still hard to imagine; at best, a model could generate a single image.

Second, I think it was the emergence of a pipeline for solving vertical-domain problems. Around this time in 2023, people knew large language models were usable but not quite reliable, so the approach was simply to fine-tune with vertical-domain data. That was last year's thinking. This year, with something like o1, the answer to "how do I solve a vertical-domain problem?" changed: as long as you define the reward function within that domain, a pipeline of synthetic data plus reinforcement learning can bring the model to a certain level in that vertical. This pipeline won't stay limited to math and code; it can extend to many more domains. It provides a path to professional-level performance within professional fields.

Chengyu Mao: Bringing it back to China, what important breakthroughs do you see in China's AI industry over the past year? How large is the gap between Chinese AI and the global leading edge?

Junjie Yan: First, in multimodality, China is actually in a leading position, at least in video and audio. I believe images will catch up quickly. This leadership has been proven through international competition, whether in product user numbers or rankings. The scale of China's multimodal models is indeed substantial.

But what Chinese companies haven't yet achieved is a model truly at the GPT-4o level in practical use. There's an objective gap, whether for major tech companies or startups. While rankings and academic benchmarks can be pushed high, real-world usage still shows that gap. MiniMax's goal is to truly achieve this with our next-generation model. This is probably the only remaining shortcoming among models currently available on the market.

So from the perspective of models that can already be deployed, I think China has basically caught up across the board except for text. Text still needs some time — perhaps half a generation behind. And if we account for models still in development in the US, there may be a gap of half to one order of magnitude.

Chengyu Mao: "Going global" was a buzzword in 2024. In other industries, we've seen that companies or products that survive China's intense competition tend to remain competitive when they reach overseas markets. MiniMax was among the earlier and more advanced players in taking AI large language models overseas, so I'd like to hear your understanding of Chinese AI products' advantages in going global.

Junjie Yan: This industry is fundamentally technology-driven. The core of technology-driven competition is how differentiated your technology is in the market. China currently does have genuine strengths in multimodality-related technologies.

Second, China doesn't just have an engineer dividend — there's also a product operations dividend. Some American companies essentially have no operations roles at all. If we combine our technology dividend with product-level innovation, including operations, it's essentially moving from pure technology tools to content-oriented products — that's a dividend too.

Chengyu Mao: So it's about not lagging on technology, or maintaining comparable standards; and on operations, like TikTok or Temu, leveraging China's operational strengths to remain internationally competitive.

03 On Market Competition and Shifting Demand

Chengyu Mao: From the outside, a major source of pressure hanging over domestic AI startups in 2024 was competition from tech giants. How do you view the relative strengths and weaknesses?

Junjie Yan: First, I think most of China's major tech companies are genuinely very strong. They became giants because their talent, organizational efficiency, and commercialization support are all at high levels. And the more established companies have only evolved these capabilities further. Many maintain very strong innovative vitality, with genuinely more resources.

When you first start out, the giants aren't doing what you're doing, so you can develop independently. When they later entered our space, there was significant pressure at first. But two realizations removed that pressure.

The first is some basic common sense: how well a product performs, in certain scenarios, fundamentally depends on metrics like user retention; further down the line, it depends on commercialization efficiency. There's no shortcut for this — it requires spending sufficient time to refine, to be with users, to achieve stronger technological innovation.

So while money and resources may differ across players, time is the same for everyone — and startups may actually have more time, because they're more focused. So I feel that at least from a temporal perspective, it's fair for startups.

Chengyu Mao: I suppose it's precisely this intensely pressure-filled competition, or "involution" if you will, that has driven AI large model costs down so rapidly. I recall you mentioning that as these costs fell, you saw increased usage and demand from traditional enterprises. As an investor with substantial exposure to B2B companies, Yunqi is quite interested — what changes have you observed in how traditional enterprises use AI?

Junjie Yan: The changes are quite interesting. When large model prices were still high in 2023, API users were primarily consumer-facing (to C) companies, essentially internet companies.

After costs dropped, I found that a major use case was labeling: all kinds of internal enterprise data labeling. The volume here was much larger than we expected. Even before large models emerged, many firms working on security, public sentiment analysis, and data structuring relied on various models for labeling.

Now, large models are, first, cheaper than traditional models rather than more expensive; second, their generality is better. Once customers adopt such scenarios and get them running, demand tends to persist and expand to more scenarios.

Data processing-type demand actually represents a very large proportion and is growing very rapidly. This makes me feel that AI applications have become embedded within many enterprises, becoming something genuinely useful to businesses.

04 On MiniMax's Technology Evolution Path

Chengyu Mao: One impression MiniMax gives is that it was both founded early and among the earliest to use large language models for consumer-facing (to C) products, even earlier than overseas peers. Beyond that, you may also have been the earliest to commit substantial resources to the MOE (mixture-of-experts) architecture when it was still non-consensus. In 2024, you were also among the domestic leaders in introducing an architecture based on Linear Attention. What considerations drove these technology choices?

Junjie Yan: These choices actually brought us many challenges. But the specific technologies matter less than the fact that they progressively built our R&D capabilities.

In 2022, when we first started, our first-generation model was essentially reproducing someone else's paper. Every detail was there; we just replicated it.

The reason we chose the MOE direction was that we found we couldn't scale parameters further with dense architectures — at least we didn't have enough compute to scale that way. So if we wanted to scale, MOE was the only path. But we only knew the general direction; the details were unknown. So 2023 became a transition from reproducing based on others' papers to figuring out the details ourselves.

2024 was different again. We came to realize that to advance further, say, to solve the unlimited-length input and output problem, we absolutely needed Linear Attention. This meant that we not only had no details to reference, but there wasn't even a clear direction, because American companies hadn't done this at such scale either. The direction itself had to be determined by us.

But I think we were relatively fortunate in pursuing this — it took about half a year to achieve. Not because we're exceptional, but because our work on the previous two generations had accumulated sufficient experience. So I see this as fundamentally a three-year journey for a technology startup, from completely reproducing papers to beginning to possess core innovative capabilities.

Looking two, three, or five years into the future, if China is to have leading companies emerge in artificial intelligence, I believe they must satisfy two conditions: first, possess genuinely distinctive technological innovation; second, achieve sufficient scale in commercialization and products. I feel that in 2024, at least on the point of distinctive technological innovation, we succeeded once.
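For readers unfamiliar with why Linear Attention is tied to very long inputs and outputs in the discussion above: in the generic kernelized formulation, the model carries a fixed-size running state instead of comparing each new token against every earlier one, so per-token cost does not grow with context length. The plain-Python sketch below illustrates that general idea only. It assumes a common textbook variant with an `elu(x) + 1` feature map; it is not MiniMax's architecture, whose details the interview does not give, and all function names are hypothetical.

```python
import math


def feature_map(x):
    """elu(x) + 1: a positive feature map often used in linear attention."""
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]


def linear_attention_causal(queries, keys, values):
    """Causal linear attention using a fixed-size running state.

    The state is a d_k x d_v matrix plus a d_k vector, regardless of how
    many tokens have been processed.
    """
    d_k, d_v = len(keys[0]), len(values[0])
    state = [[0.0] * d_v for _ in range(d_k)]  # running sum of phi(k) outer v
    norm = [0.0] * d_k                         # running sum of phi(k)
    outputs = []
    for q, k, v in zip(queries, keys, values):
        phi_k = feature_map(k)
        for i in range(d_k):
            norm[i] += phi_k[i]
            for j in range(d_v):
                state[i][j] += phi_k[i] * v[j]
        phi_q = feature_map(q)
        denom = sum(phi_q[i] * norm[i] for i in range(d_k))
        outputs.append([
            sum(phi_q[i] * state[i][j] for i in range(d_k)) / denom
            for j in range(d_v)
        ])
    return outputs


def quadratic_reference(queries, keys, values):
    """Same attention computed the quadratic way, for comparison only."""
    d_v = len(values[0])
    outputs = []
    for t, q in enumerate(queries):
        phi_q = feature_map(q)
        # Token t attends to every token s <= t explicitly.
        weights = [
            sum(a * b for a, b in zip(phi_q, feature_map(keys[s])))
            for s in range(t + 1)
        ]
        total = sum(weights)
        outputs.append([
            sum(w * values[s][j] for s, w in enumerate(weights)) / total
            for j in range(d_v)
        ])
    return outputs
```

Both functions return the same outputs, but the first does a constant amount of work per token while the second does work proportional to the number of tokens seen so far.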