WAIC 2024 Observations: AI Comes Down to Earth | Yunqi Capital Attent!on Tech Notebook

Yunqi Capital·July 10, 2024

Robots, foundation models, AI applications — what new heights are they reaching in China's race to the top?

500+ exhibitors, 1,500+ products on display, and 300,000+ offline visitors — the just-concluded 2024 World Artificial Intelligence Conference (WAIC 2024) set multiple new records, its energy matching Shanghai's sweltering near-40°C heat. From scalpers hawking tickets outside the gates to restaurants and cafes with no seats to spare, the side stories all pointed to the same thing: AI, once seen as esoteric "black tech," is now going mainstream — the first glimmers of becoming a technology for the masses.

Yunqi Capital and several portfolio companies also returned to WAIC with new insights and exhibits. Meanwhile, we picked up on emerging trends and new directions from the exhibition floor and speaker sessions.

While WAIC's buzz still lingers, we bring you this edition of Yunqi Attent!on Tech Notebook, reviewing what we brought to WAIC and what WAIC brought to us — through three keywords.

Center Stage

Robots

"The body is moving; the DNA is still catching up"

From the "Eighteen Arhats" array of humanoid robots displayed in the exhibition hall to the first-ever dedicated robotics zone, the suffocating crowds made one thing clear: robots were the undisputed center of attention at this year's conference. According to official figures, 56 embodied intelligence products made their debut at WAIC 2024, and aside from Tesla's Optimus 2, all were domestically made. Two of Yunqi Capital's early-stage investments, RealMan Intelligent and Keenon Robotics, were among them.

RealMan showcased innovative applications of its lightweight humanoid robotic arms and composite robots. Exhibits included a robotic arm that could easily lift a 5 kg dumbbell and a "massage robot" that could tailor acupressure to different points on the body, products that bring robotics closer to everyday life and production. As embodied intelligence gains momentum, RealMan's robotic arm technology will continue to drive advances in robotic manipulation. The company has already built a rich partner ecosystem, as its joint appearance at WAIC with partners Unitree and OYMotion shows.

Keenon Robotics brought its full lineup of new products covering dining, hospitality, healthcare, and education scenarios. The W3 hotel robot, which can carry up to 40 jin (about 20 kg) and deliver four orders per trip, and the Gen2 medical delivery robot with palm-vein recognition represented upgrades across multiple categories. Keenon also plans to integrate its foundational large model capabilities into its flagship T10 food delivery robot, boosting intelligence in task comprehension, perceptual decision-making, and analytical reasoning.

Among the varied robot forms, humanoid robots drew the most attention. Of the 42 intelligent robots at the conference, 22 were humanoid. Aside from Tesla's Optimus, which remained behind glass in solitary display, all other humanoid robot demos performed actions like gripping, pinching, holding, twisting, and walking — a vivid demonstration of how far robotic limb technology has evolved.

The Orca physics-accurate simulator from Yunqi Capital angel-round portfolio company Songying Technology was also applied to the humanoid robot "Qinglong" on display, empowering robotic movement coordination and autonomous decision-making at the software level.

The "Eighteen Arhats" array, RealMan Intelligent booth, Keenon Robotics booth

Yunqi Quick Take

The robotics frenzy at WAIC 2024 reflects the industry's — and the public's — expectations for embodied intelligence. Generality and generalization are its core technical and commercial advantages. The hope for robots in the embodied intelligence era is that they will autonomously learn from physical-world experience, expand their capabilities, and evolve from "one machine, one task" to "one machine, many tasks."

Large AI models will decide whether embodied intelligence's generalization advantages can truly be unleashed, and whether robots can achieve step-change breakthroughs in perception, decision-making, and action. The foundational model approach for embodied intelligence has yet to converge, which makes targeted breakthroughs at the model level particularly worth watching.

No Longer Competing on Parameters

Large Models

"Now it's about multimodality and verticalization"

Large models remained a major protagonist at WAIC. Over a hundred large model products were on display, and models accounted for three of the eight "Treasures of the Exhibition" selected by the conference. But unlike the parameter arms race a year ago, "multimodality" and "verticalization" became the watchwords.

On the multimodality front, SenseTime's "SenseNova V5.5," positioned as a rival to GPT-4o, and Vimi, the first controllable human video generation model for consumer (C-end) users, drew considerable attention. The latter can generate single-shot human videos over one minute long and was named one of this year's "Treasures of the Exhibition."

Junjie Yan, founder and CEO of Yunqi Capital angel-round portfolio company MiniMax, also revealed at the conference that MiniMax's video generation model is slated for release in August, with corresponding features coming to STARFIELD. At the MiniMax booth, we also heard a song with lyrics and composition generated by the MiniMax abab-music large model. Staff on site told us the composition feature will also launch on STARFIELD.

Verticalization was particularly pronounced in large model products from internet companies and state-owned enterprises, with dedicated models for finance, healthcare, education, tourism, and other sectors. Ant Group's Bailing large model, for instance, powers the financial AI assistant "Zhixiaobao," while NetEase Youdao and Yuanfudao launched their "Ziyue" and "Kanyun" education models respectively.

Whether multimodal or vertical, the common thread is practical deployment. And to truly meet the needs of specific scenarios, continuous model capability iteration remains essential.

In a conference forum, Junjie Yan noted that the most critical problem for large models right now is still their relatively high error rate. GPT-4, for example, may only achieve 60-70% accuracy on many benchmarks — meaning a 30-40% error rate. This is why most large model products adopt a conversational format, as dialogue has higher fault tolerance. Reducing the error rate from 30-40% to 3-4%, or even 2% — dropping it by an order of magnitude — is the most crucial marker of AI evolving from a human-assistance tool to something capable of completing tasks independently.
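The arithmetic behind this point can be sketched in a few lines. As an illustrative assumption that was not part of the talk, the sketch treats a task as a chain of independent steps that all share the same per-step error rate; the function name `task_success_rate` and the ten-step task length are hypothetical.

```python
def task_success_rate(step_error_rate: float, steps: int) -> float:
    """Probability that every step succeeds.

    Assumes (for illustration only) that steps fail independently
    and share the same per-step error rate.
    """
    if not 0.0 <= step_error_rate <= 1.0:
        raise ValueError("step_error_rate must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return (1.0 - step_error_rate) ** steps


# Error rates quoted in the talk: 30-40% today, 3-4% (or even 2%) as the target.
for error_rate in (0.40, 0.30, 0.04, 0.03, 0.02):
    single = task_success_rate(error_rate, steps=1)
    # Ten steps is a hypothetical task length chosen for illustration.
    chained = task_success_rate(error_rate, steps=10)
    print(f"error {error_rate:.0%}: one step {single:.0%}, ten steps {chained:.1%}")
```

Under these assumptions, a 30% per-step error rate leaves fewer than 3% of ten-step tasks fully correct, while a 3% per-step error rate leaves roughly 74%. That gap is one way to read why an order-of-magnitude drop matters for independent task completion, while a single conversational turn tolerates the higher rate.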

The booth of MiniMax, a large model unicorn and Yunqi Capital angel-round portfolio company

Yunqi Quick Take

More than a year after GPT-4's release, the pace of model capability advancement has slowed markedly, shifting from rapid leaps to incremental iteration. But even subtle differences between models can produce vastly different user experiences, making the refinement of foundational capabilities, and breakthroughs in them, the current priority at the model layer.

Meanwhile, advances in multimodal and vertical large models will open greater space for application-layer innovation, pushing AI deeper into more scenarios.

On the Rise

AI Applications

"The new protagonist after large models"

A careful walk through the exhibition revealed that the exhibitors with the largest booths at this year's WAIC were largely also the leaders in AI applications. Alibaba, iFlytek, and WPS pulled out all the stops in their booth designs to show visitors how AI could weave into every corner of work and life.

Software applications and features on display spanned the consumer (C-end), business (B-end), and government (G-end) markets, but most scenarios and functions revolved around productivity enhancement. Agent capabilities, hotly debated for over a year, also made early appearances in some tools. One example is the AI intelligent assistant launched by Alipay, which lets users issue commands like ordering takeout or topping up phone credit, with the AI identifying the instruction and directly connecting to the corresponding mini-program within Alipay's ecosystem.

A number of hardware applications with AI features also drew crowds. iFlytek's smart blackboard, which can digitize chalk handwriting in real time, was among the conference's popular photo spots.

For autonomous driving, a major AI application domain, WAIC 2024 also dedicated a special zone. Yunqi Capital early-stage portfolio company JueFX Technology showcased its data closed-loop intelligent driving large model, demonstrating solutions deployed in urban and highway NOA scenarios.

At conference forums, the prospects for AI application deployment also became a major topic of discussion. Yunqi Capital partner Chen Yu, speaking at the "AI Innovative Applications and Investment Trends" forum, noted that investment in AI over the past year had concentrated on foundational models, but the next two to three years would see the focus shift to AI applications. Productivity enhancement, AI for Science, embodied intelligence, and entertainment are several key directions for large model applications.

On the prospects for AI in the B2B domain, Yunqi Capital executive director Han Yi, speaking at the "Yangtze River Delta Collaborative Innovation and AI New Quality Productive Forces Development Forum," argued that the two core problems AI can solve for enterprises are marketing and customer acquisition on the one hand and internal efficiency gains on the other. Around these two needs, AI can take many application forms in the B2B space.

On the frequently discussed topic of killer apps, MiniMax founder and CEO Junjie Yan believes it may take about three years before something truly mass-market emerges. "But that's okay — when you can be the first, and then your capabilities grow, your resources increase, your technology improves, you can probably get there."

Yunqi Quick Take

The rise of open-source models and the rapid decline in large model pricing have lowered the barrier and cost for AI application developers, creating favorable conditions for AI application innovation from both technology and cost perspectives. With the AI toolkit at their disposal, people or teams with strong product capabilities will see their advantages amplified.

But constrained by the inherent capability limitations of large models and other factors, a "ChatGPT moment" for the application layer may not have a clear timeline. Advances at the model layer remain the key to an application-layer explosion.

WAIC 2024 has concluded, but innovation continues. Yunqi Capital and our portfolio companies will keep deepening our work in AI and sharing our observations and thinking. We look forward to witnessing more innovation together.