200 Days of Frenzy: China's LLM Wars Stuck in a Stalemate

暗涌Waves·July 7, 2023

Consensus can form quickly — and dissolve just as fast.

By Lixin He

Edited by Lili Yu, Jing Liu

200 Days at Breakneck Speed

Two months ago, Waves posed a question to a fund partner who had invested in Wang Huiwen's Lightyear AI: In the history of Chinese venture capital, which company was universally embraced from day one and actually delivered a happy ending? "Honestly, I'm struggling to think of one," the investor replied.

As an "outsider" to the AI industry, Wang Huiwen's dramatic entry carried particular theatrical weight — what were his real odds of success? After a moment's pause, the partner quipped: "At least, Lao Wang is a controversial figure." The subtext was clear: for most top-tier dollar funds, there was little reason not to back someone who could "concentrate resources, capital, and influence in one person."

The rest is history. Following Wang's diagnosis with depression and Lightyear AI's acquisition by Meituan, the most closely watched story of China's large model startup wave came to an abrupt halt.

When we asked that question, China's large model entrepreneurship was still in overdrive. People fervently believed in the platform-level opportunity "ten times bigger than mobile internet": within just 20 days of GPT-4's release, over a dozen startups had already taken seats at the large model table, raising billions of dollars in total. Subsequently, more than 20 companies — including major tech firms — announced their self-developed AI large models. "We were running out of names from ancient mythology."

In an instant, Lightyear AI's dramatic arc seemed to confirm the prophecy that general-purpose large models would rarely belong to startups.

More and more companies began fleeing the large model mythology.

Rewind one year. At the intersection of 18th Street and Folsom in San Francisco stood an unassuming three-story gray building, soon to become world-famous — OpenAI's headquarters. A Silicon Valley source told us that after Neil Shen, founding and managing partner of Hongshan, met with OpenAI, he was "profoundly shaken." This voracious investor immediately told his team: "Get moving!"

Hongshan once again demonstrated its characteristic combat readiness. "Among domestic investors, Neil's understanding is probably the furthest ahead," the source said. Shen began pushing AI investments more aggressively, "meeting everyone he could," including Harry Shum, Yang Zhilin, and Alibaba CTO Jingren Zhou. The outside world saw the result in September 2022, when Hongshan's website published "Generative AI: A Creative New World," an essay that brought the term "generative AI" to wide attention.

"It takes about three months for venture trends to travel from Silicon Valley to China," a dollar fund investor told Waves. So when ChatGPT launched in late November, domestic discussions about large models didn't explode until after Chinese New Year.

After the holiday, Yungang Huang, partner at Source Code Capital, boarded a flight to Silicon Valley intending to also evaluate SaaS and biotech. But nearly every meeting ended up being AI-related. By then, booking time with OpenAI people was no longer easy: employees had largely closed or hidden their personal contact information, including LinkedIn.

This was probably the fastest consensus convergence from Silicon Valley to China in the past decade.

Next to generative AI, the already exhausted mobile internet suddenly seemed like a relic. A handover to a new world order appeared to be under way.

Everyone could feel the market's restlessness. In mid-February, at an AI sharing session organized by a Microsoft strategic incubator, the room was packed, the tea break area overflowing, even the barista filming presentation slides from behind the counter. The surrounding NFT prints on the walls seemed to remind people that this space had belonged to Web3 just months before.

Like a stress response, domestic investors plunged into AI only to discover how much homework remained. Multiple investors interviewed described themselves as "still learning" and volleyed back: Who else have you been talking to?

In March, the day before our interview, Alpha Startups founding partner & CEO Siqing Xu was "reading papers" past midnight when he received a WeChat from a successful entrepreneur friend asking to discuss AI. The latter drove 20 kilometers to Xu's place, and the two talked until 3 a.m.

Venture capital's big and small names came in droves — Wang Huiwen among the fervent. Someone close to him revealed to Waves that Wang's attitude toward large model entrepreneurship shifted extraordinarily fast: originally planning to simply invest in a company as a shareholder, he decided within three days to dive in himself.

But windfalls never last long, even though, strictly speaking, large models and AGI are unquestionably real propositions compared with many of the themes of the late mobile era.

Within just 20 days of GPT-4's release, the market already sensed that in this game destined for a select few, China's general-purpose large model startup wave had effectively concluded its first battle.

In late June, the social media spat between Cheetah Mobile CEO Sheng Fu and Allen Zhu, managing partner at GSR Ventures, highlighted both the divergent perspectives of investors and entrepreneurs and an unusually sober consensus: there's opportunity, but not a BAT-level one.

In just 200 days, more investors and startups began shifting focus toward vertical large models, middleware, and application layers. That feverish platform-level or disruptive opportunity was gradually replaced by more pragmatic "scattered small opportunities."

Yusen Dai, managing partner at ZhenFund, once told us that as waves of new technologies rose and fell, AI cycled through winters and springs in the venture world. Over every seemingly lofty technical ideal hung the sword of Damocles: the difficulty of commercialization.

This time was no exception.

Towers Built on Shifting Sand

Consensus can form quickly.

ChatGPT's release pushed domestic FOMO to its peak early this year, with teams and capital rapidly assembling around large models. The players at the table fell into three camps: internet entrepreneurs, big tech companies, and academics from universities and research institutions.

Consensus can also collapse quickly.

The inherently cash-burning nature of the game, barriers around compute, data, and talent, and shifts in today's capital market left the question "Are large models an opportunity for startups?" perpetually hanging overhead.

The reality was that by April, Waves had already noted that the first battle of China's general-purpose large model startup wave had essentially concluded. Several investors later confirmed this assessment — "That's pretty much it" — and startups claiming to enter large models virtually disappeared thereafter.

In one primary market observer's view, few startup teams were willing to play the large model game, and few major funds were either; institutions were mainly betting on people, with a long period of proof still ahead.

The day after his debate with Fu, Allen Zhu posted on social media that his core view was: don't be superstitious about general-purpose large models, because GPT-3.5 would become a commodity next year, and GPT-4 would follow in three years.

This revealed another hidden concern about large models: the underlying models themselves were changing, and in the future would likely see mass open-sourcing, or one to two dominant players winning it all.

Thus, the value and investment of China's large model entrepreneurship were fundamentally mismatched.

As for that repeatedly cited "opportunity bigger than the internet," Yusen Dai believed the premise was "building AGI that can use tools, solve tasks, and decompose tasks" — and teams capable of this were scarce even globally.

At a recent roundtable at the Waves Conference, Professor Zhiwu Lu of Renmin University's Gaoling School of Artificial Intelligence questioned the so-called "spring of domestic large models." In his view, it was largely an illusion of many companies "fine-tuning foreign base models."

Reality bore this out. One AI entrepreneur told us that many startups claiming to build large models were actually using shortcut techniques like Supervised Fine Tuning to produce a "decent enough" language model from the start. Teams and projects with genuine capital and technical strength to challenge GPT-4 were few and far between.

More startups began migrating toward vertical large models in healthcare, law, and other domains, as well as middleware and application layers. Wang Huiwen's Lightyear AI and Wang Xiaochuan's new company later both chose to pursue large models alongside model-based applications.

Kai Qu, founder of "42Chapters," estimated that among recently funded AI projects, roughly 10-20% were building base models, 20-30% were doing infrastructure/middleware, and 60-70% were building applications. If unfunded projects were included, applications likely accounted for at least 95%.

But the vertical large model and application path was no easy road either. For startups, vertical domain scenarios and data were hard to acquire, and whatever capabilities they built had to be ones that general-purpose large models could not easily cover.

For infrastructure/middleware, one investor noted on social media that startups addressing MLOps needs like data collection, annotation, and model scheduling faced an awkward "middleman can't capture the spread" squeeze — free open-source tools on one side, cloud vendors bundling tools and services on the other. Meanwhile, "domestic customers' willingness to pay still hasn't been well cultivated, especially during this economic recovery period of tightened corporate spending."

At the Waves Conference, Yusen Dai of ZhenFund noted that doing B2B services in China faced constraints from market payment willingness and customer procurement patterns. "A major characteristic of China's internet was that directly charging users was hard — often the wool came from the pig." While OpenAI and Claude could sell API services directly through public cloud in the US, in China simply providing APIs wasn't enough. "Many large model companies targeting enterprise customers now sell servers bundled with models, plus provide training and fine-tuning services."

The same investor also noted that application-layer projects fell into two categories. Established projects deeply rooted in vertical scenarios were actively integrating large models and fine-tuning them with their own data. For new projects, talk of vision was premature: until large models' iterative capabilities were fully unleashed, they risked "quick birth and quick death."

This was already manifesting overseas. Last year's high-flying American unicorns Grammarly and Jasper saw their existing functions replaced and value rapidly diluted after GPT-4's release. Allen Zhu publicly stated that "these two companies may soon go to zero, they simply can't defend."

In March, OpenAI published a paper open-sourcing new model code: one-step image generation, 18 images per second. Some declared "the era of diffusion models is over." Yet it had been less than a year since diffusion models became a foundational technology pillar of 2022's "AIGC Year One," spawning numerous models.

Thus AI entrepreneurs in this wave, especially application-layer companies, existed in a perpetual dilemma of fighting themselves: don't do it, miss out and lose; do it, risk quick replacement and still lose.

At the Waves Conference, Fang Han, CEO of Kunlun Tech, mentioned that after speaking with China's top product managers, he found them still in a dazed state: "This wave of large models far outpaces product progress." In his subsequent solo talk, Cheetah CEO Sheng Fu quickly rebutted this: "Product managers aren't dazed, many are already acting." But clearly, no satisfactory killer product had yet emerged. On the days Alibaba and Baidu launched their large models, their stocks dropped to varying degrees.

These characteristics created a peculiar spectacle in primary markets: beyond Hongshan, ZhenFund, Source Code Capital, 5Y Capital, IDG Capital and others, "institutions were cautious about new deals, more actively pushing existing portfolio companies toward AI to raise more funding."

In some AI investors' eyes, beyond the problem of model upgrades, a further escalation of conflicts over AI safety, the concern most relevant to the general public, could send the AI wave into another trough. How long would this round of AI faith last?

A Decade-Long AI Dream

For over ten years, AI hype has cyclically recurred in the venture world. The constant evolution of technical approaches has filled the industry with stories of "researching how to sharpen a faster knife before guns exist" and "discovering unrecognized guns."

Just as the deep learning approach was overlooked before 2012, at the height of AlphaGo fever in 2016, artificial general intelligence was widely considered impossible by the industry. "When GPT-1 launched in 2018, it seemed like a heretical path," Zhang JinJian, founding partner of Oasis Capital, once described to us. The mainstream industry approach then was vertical models with manual annotation — "like intricate carving" — while GPT pursued generality, "brute-forcing with massive data, which seemed crude to academia."

Zhifei Li, founder & CEO of Mobvoi, recalled starting large model development two years ago: the team faced enormous pressure, with the technical director threatening to resign multiple times. Yet this was already three years after Google's 2017 Transformer paper opened the first door to general artificial intelligence — though few had recognized its significance.

Tracing history, since the AI concept emerged in academia in the 1950s, this century alone has seen two AI waves.

In 2012, at the world's largest visual recognition competition, 65-year-old Professor Geoffrey Hinton led two students to victory. The breakthrough came from a new AI research paradigm: the neural network school represented by deep learning, rising from over 20 years of academic obscurity to mainstream orthodoxy.

Over the following decade, deep learning became the underlying technical foundation for most AI enterprises, moving from academia to industry with early applications in vision, speech, and semantic technologies.

In China, speech recognition gave rise to Mobvoi, iFlytek, Unisound and others; image recognition produced the "AI Four Dragons" — Megvii, Yitu, SenseTime, CloudWalk — and 4Paradigm.

In 2016, Google AlphaGo's decisive victory over world Go champion Lee Sedol in a human-machine match made "machine intelligence defeating humanity" widely recognized at the popular level for the first time. This rapidly triggered a global AI arms race and soon garnered national policy support.

In this AI fever, big tech announced All In strategies. Qi Lu parachuted into Baidu; Tencent, ByteDance and others established AI Labs; Alibaba DAMO Academy was founded; Jack Ma proclaimed "100 billion yuan investment over three years."

At major tech forums, people endlessly discussed "the singularity has arrived" and the Three Laws of Robotics. Investors firmly believed AI would be the fourth productivity revolution after the steam engine, internal combustion engine, and internet.

The venture industry then faced a scarcity of investment themes (not dissimilar to today). Major internet mergers had concluded, platform opportunities had faded, and the giants' tentacles reached everywhere. AI, alongside live streaming/short video and bike-sharing, became a hot sector.

AI investment and financing turned feverish. Reports showed 2016 global AI funding nearing $10 billion, equivalent to the total raised across the 13 years from 2000 to 2013. One telling indicator: amid a depressed 2016 global stock market, NVIDIA's stock still tripled.

But fatigue soon appeared. In 2019, China's AI investment amount and deal count dropped sharply, with 90% of AI startups losing money. The sharp cooling began.

Per IT Juzi and other data, from 2014 to 2018, average IPO exit returns in China's AI sector were merely 1.83x. In 2018, nearly 90% of AI companies were in the red. In 2019, "investors fleeing AI" went viral. After several years of desolation, beyond Legend Star, Sinovation Ventures and a few others, few domestic investors consistently followed AI, and major funds barely had dedicated coverage.

One could say that beyond a handful of early investors who cashed out, AI remains a sector that hasn't made investors real money.

One repeatedly cited investor anecdote captures the carnage. DeepGlint was founded in 2013. Reportedly, after its angel round, Bob Xu declared at a dinner that the company was worth at least $500 billion, while Neil Shen thought $100 billion more realistic; the two ultimately compromised at $300 billion. Reality surprised everyone: nine years later, after a bleeding IPO, DeepGlint finally listed on the STAR Market last year with a market cap of 6.5 billion RMB (as of July 6 closing).

AI's decade-long journey remains arduous.

In the view of Mingyao Wang, Megvii's first investor and president/managing partner at Legend Star, AI entrepreneurs a decade ago were still in an exploratory stage. Most came from academia and had no clear thinking about monetization, and the supporting industrial ecosystem was immature; together these factors made AI's road to commercialization a long one.

In 2011, Legend Star decided to support three young entrepreneurs — Megvii was valued at just 14 million RMB. After the angel round, to avoid being stuck with no RMB funding, the company switched to a USD structure. The capital market's low expectations then meant "early AI entrepreneurs had a very hard start." Megvii's pivots from CV gaming to dating social to product recommendation all struggled, until its 2015 facial payment collaboration with Alipay. Wang recalled the company "didn't get its first government security order until five years after founding." Today's market is incomparable.

This makes the venture world's renewed AI fervor appear especially headstrong. But the biggest difference in this technological advance is — for the first time, AI has the possibility of being general-purpose.

If the past decade's two rounds of deep learning-driven AI innovation were still point-distributed, task-specific intelligence for vertical industries, this round's large models represent what Kai-Fu Lee called progress "from islands to continents": no manual annotation needed, large model scale, cross-domain capabilities.

Technological breakthroughs violently reshape the old world. One internet investor told us that under this new wave, companies like SenseTime and Megvii at least retained substantial compute and experience reserves. For more AI enterprises, as technology evolves, they may "collapse halfway through their journey."

A World That Can't Be Returned To

"Holy shit!" Hurst Lin blurted out.

This was his instinctive reaction in early 2022 upon first seeing text-to-image results, when DCM reached out to Tiamat founder Qinggan. This successful entrepreneur who lived through the internet wave, still active as a classical internet investor, didn't hide his astonishment when describing the moment to Waves. He firmly believed "a new generation AI wave has truly arrived," "no longer following TMT-era recommendation logic, but directly completing everything for people — they don't even need to move the mouse."

Though AI investment still walks through fog, likely to remain subdued in the near term, this doesn't stop its sustained bombardment of the old world: SaaS, going-global, and numerous existing business models face imminent AI rewriting.

One long-time enterprise services investor believes that future Chinese SaaS companies, and indeed all B2B enterprises, should be AI companies, with software replaced by intelligence-as-a-service. AI reduces service costs and improves efficiency on one hand, while connecting the links in service workflows on the other. Remaining a traditional software company means "basically no chance."

The storm similarly sweeps existing AI entrepreneurs; the crisis from failed technical approaches is even more severe: last wave's AI companies based on deep learning with vertical small models must either revolutionize or die.

Zhifei Li gave an example: many NLP practitioners previously thought these changes wouldn't affect them. "There used to be PhDs or professors dedicated to syntactic parsing, part-of-speech tagging — these intermediate steps will all disappear in the future." Many practitioners have finally realized there shouldn't exist specialized jobs for machine translation, question answering, or speech recognition. Without transformation, they face unemployment or irrelevance.

In some investors' view, as productivity supply structures are reshaped, countless junior engineers will be replaced by AI. The "engineer dividend" long celebrated as part of China's commercial confidence narrative may cease to exist.

Fangbo Tao, founder of Mindverse, believes that in the face of AI there will soon be only two kinds of people, "drowners or gold miners," and asks in return: "When steam trains arrived, did they only affect carriage drivers?"

Yuan Liu, partner at ZhenFund, stated that for investors and entrepreneurs, this means "the opportunity for three to five people to take down a big company" exists again. He even felt that "all previous accumulation was precisely preparation for this moment."

Liu entered the industry in 2014. As an early-stage investor, he "occasionally felt born at the wrong time": missing mobile internet's best years of 2011-2012, while later waves like dual carbon, new energy, and automotive were extremely capital-intensive. Suddenly, the data flywheel, disruptive innovation, product thinking and empathy that TMT investors knew well — "all seem useful again."

In interviews, Liu repeatedly cited Wittgenstein's famous line to us: "The limits of my language mean the limits of my world." He said this was what excited him more about AGI: if human thinking is a language process, then the world humanities students imagined might actually be realized on large language models.

Currently, after the first phase of competition, many investors seem more optimistic about big tech's large model experiments.

However, as Liang Wenfeng, founder of High-Flyer who recently entered large models, said: "Markets change. The true decisive force is often not existing rules and conditions, but the ability to adapt and adjust to change." And this may be startups' opening.

While cheering the new era's arrival, Hurst Lin also believes many current large models risk being "blocked by lack of applications." He drew an analogy to the AR/VR fever in the US five or six years ago: Google Glass burned countless dollars, yet its promise remains unrealized. Moreover, in this AI revolution, software hasn't yet fully combined with hardware. "There are many things we still have to wait and see."

But regardless, in Lin's view, ChatGPT is like a glass door — once crossed, there's no going back: "AI is the new internet."

This startup wave, barely 200 days old, may be emblematic of future venture stories: the direction is undoubtedly correct, but the road is destined to be long.

Image source: Christ in the Storm on the Sea of Galilee (Rembrandt van Rijn, 1633), Isabella Stewart Gardner Museum, Boston

Layout by Yunxiao Guo