AheadForm (Shouxing Technology) Raises Hundreds of Millions of Yuan in Series A1 Round; Its Humanoid Robot Appears on the Cover of Science Robotics

5Y Capital · April 7, 2026

Humanoid emotional-interaction robotics company AheadForm has announced the completion of a Series A1 round totaling several hundred million RMB.

AheadForm builds ultra-high-fidelity emotionally interactive robots. The round was co-led by Huakong Fund and a major internet company, with participation from new investors Jiayu Capital, Pengrui Fund, Beijing E-Town International Investment & Development, Shanghai Semiconductor Industry Investment, and Shenzhen Nanshan Strategic Emerging Industry Investment, and from existing investors China Merchants Venture Capital, Shunwei Capital, HLC, Houxue Capital, and Oriental Fortune Capital.

The proceeds will primarily fund continued iteration of its multimodal embodied interaction system and emotional foundation models, scaled optimization of bionic facial core components and materials systems, and standardized delivery and global market expansion.

The Human Face: Our Native Interface for Emotional Connection

The significance of embodied intelligence lies in a comprehensive leap: AI becoming physical. Beyond the locomotion and manipulation capabilities that most people intuitively focus on, it means projecting "intelligence" into embodied interaction and expression, moving from text and voice to three-dimensional, face-to-face engagement and resonance. Once large language models solved the problem of "understanding and generation," the real challenge shifted: how can a physical entity comprehend behavioral logic, express emotional intent, and stay in step with people in both mind and form?

In multimodal interaction, the human face is among the highest-density channels for expressing emotion and intent. Humans are highly sensitive to facial information and rely on facial expressions, gaze, and micro-movements to recognize emotions. By contrast, systems lacking humanoid features struggle to elicit users' emotional projection or to build sustained trust. Embodied systems with a human facial form and dynamic expression capabilities are therefore more likely to form stable emotional connections and sustained user engagement, enabling deeper human-machine bonds.

Pioneering the Era of Bionic Emotional Robots

AheadForm founder Yuhang Hu has long been dedicated to endowing robots with "self-model" capabilities — constructing internal representations of their own physical structure and movement so that robots can better understand themselves and adapt to varying forms, environments, and tasks. In the direction of bionic human-machine interaction, he proposed an integrated system for emotional understanding and expression that fuses voice, vision, and motion, providing robots with more natural interactive capabilities. Through self-supervised learning mechanisms, his approach enables robots to continuously improve human-machine interaction quality without human intervention, advancing toward intelligent agents with lifelong learning capabilities.

Dr. Hu was among the earliest academics to lead research on bionic emotional robots after embodied intelligence technology emerged.

In 2023, his research on facial robots made the front page of The New York Times, bringing this field's technical breakthroughs and potential into broader public view.
A 2024 Science Robotics paper (Human-Robot Facial Coexpression) achieved sub-second emotional resonance for the first time. The robot no longer merely mimicked expressions but predicted the onset of human smiles through a learned inverse kinematics model — a major breakthrough in human-robot facial interaction.
In 2025, he published papers in Nature Machine Intelligence and npj Robotics on visual self-supervised learning frameworks and robot autonomous learning models, enabling adaptive control even when robots encounter anomalous states, freeing them from preset equations and human supervision.
At NeurIPS 2025 (Creative AI track), he presented work on visual self-supervised learning that taught robots human-like skills.
In January 2026, his paper "Learning realistic lip motions for humanoid face robots" was the cover article of Science Robotics. It realized autonomous, AI-driven learning for facial robots and generalized to multiple languages not seen in training. From this point, facial robots left behind "rule-driven," puppet-like motion and entered the stage of "AI data-driven" continuous motion generation.

Prior to this, he had also conducted frontier exploration on various hardware platforms including legged robots and robotic arms, covering reconfigurable robot identification based on motion data, Meta Self-Modeling, and Egocentric Visual Self-Modeling. All of these employed autonomous learning paradigms to strengthen robot motion generation and control — foundational to producing autonomous, adaptive, and task-generalizable robots.

Hu's research trajectory — from self-modeling, to facial robots, to aesthetic perception and subjective intelligence — forms a complete technical system: an underlying model framework that enables robots to understand themselves, understand others, and possess self-learning capabilities. This is the solid foundation for the delicate, vivid expressions of AheadForm's bionic emotional robots today.

When Robots Bid Farewell to "Rule-Based Puppets" and Advance Toward Generalizable Natural Expression

Following the Science Robotics cover article, MIT Technology Review conducted an exclusive interview with Hu, noting: "The fundamental theoretical limitation of existing rule-based robot lip-syncing methods is their assumption of a stable, one-to-one mapping between phonemes, words, and mouth shapes — an assumption that does not align with actual human speech mechanisms. In reality, the lip movements of the same phoneme vary significantly across different speakers, speech rates, emotional states, and contexts, with their timing, amplitude, and form being highly continuous and context-dependent. For example, when a person says 'hao' in excited versus calm states, despite the identical phoneme, the amplitude, speed, and form of lip movement may differ dramatically. Discretizing this into fixed phoneme-to-viseme rules inevitably loses this continuity and co-variation. Moreover, rule-based methods struggle to extend to multilingual, dialect, or singing scenarios, often requiring massive manual rule redefinition. Once robot hardware structure or materials change, original rules become nearly impossible to reuse and require complete reconstruction. More critically, rule-based systems cannot model the nonlinear, parallel-driven, and cross-temporal dependency relationships that lips as soft organs exhibit during speech. They can therefore only generate 'correct but rigid' mouth shape sequences, failing to produce lip movements that are naturally coherent in both time and form. By contrast, data-driven methods can learn these complex statistical patterns and implicit constraints from real human and robot speech data, fundamentally breaking through the bottlenecks of rule-based methods in generalization, scalability, and naturalness."
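The contrast drawn in the interview can be illustrated with a small sketch. It is not the method from the paper: the viseme table, the `Context` fields, and the scaling formula are all hypothetical, and the hand-written `context_dependent` function only stands in for a model that a data-driven system would learn from recordings. It shows why a fixed phoneme-to-viseme lookup gives the same lip opening for "hao" whether the speaker is calm or excited, while a context-aware mapping does not.

```python
from dataclasses import dataclass

# Hypothetical viseme table: each phoneme maps to one fixed lip opening (0..1).
# The values are illustrative only, not taken from any real system.
VISEME_TABLE = {"h": 0.2, "a": 0.9, "o": 0.6}


@dataclass
class Context:
    arousal: float      # 0.0 = calm, 1.0 = excited (assumed scale)
    speech_rate: float  # 1.0 = normal speed (assumed scale)


def rule_based(phonemes):
    """Fixed lookup: the same phoneme always gives the same lip opening."""
    return [VISEME_TABLE[p] for p in phonemes]


def context_dependent(phonemes, ctx):
    """Toy stand-in for a learned model: lip opening varies with context.

    A real data-driven system would learn this mapping from data; the
    hand-written scaling here only illustrates the dependence.
    """
    gain = (0.7 + 0.3 * ctx.arousal) / ctx.speech_rate ** 0.5
    raw = [VISEME_TABLE[p] * gain for p in phonemes]
    # Blend each frame with its predecessor to mimic continuous motion.
    smoothed = []
    prev = raw[0] if raw else 0.0
    for value in raw:
        prev = 0.5 * prev + 0.5 * value
        smoothed.append(min(1.0, prev))
    return smoothed


hao = ["h", "a", "o"]
fixed = rule_based(hao)  # identical regardless of emotional state
calm = context_dependent(hao, Context(arousal=0.0, speech_rate=1.0))
excited = context_dependent(hao, Context(arousal=1.0, speech_rate=1.0))
```

In this toy setup the excited trajectory opens wider than the calm one at every frame, while the rule-based output is the same list in both cases. That lost variation is what the quoted passage attributes to discretizing speech into fixed rules.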

This paradigm difference extends beyond the lips; the robot's overall facial expression system faces the same challenges. Each step, from "being able to move" to "moving reasonably and naturally" to "moving with aesthetic richness," brings a sharp increase in complexity.

Digital Intelligence Moving Toward "Presence and Companionship"

Since the Spring Festival Gala, "humanoid robots" have become a high-frequency keyword in national discourse. For the first time on this mass stage, people collectively witnessed various robots demonstrating locomotion and manipulation capabilities, and embodied intelligence — previously confined to capital and tech circles as a frontier topic — gradually entered everyday public imagination and discussion.

In 2026, embodied intelligence startups as a whole have entered deep waters, with capital and talent increasingly concentrating at the top. AheadForm has established a distinctive technical path and industrial positioning in the bionic emotional robot domain, and has accelerated on both capital and market fronts. Since the second half of 2025, the company has completed five consecutive funding rounds, with strong support from leading internet companies, industrial capital, and top-tier financial institutions. Its products have drawn viral attention online, and users in China and abroad are eagerly awaiting its mass-production timeline and commercialization path; discussions about its "stock ticker symbol" come up frequently. Founder Yuhang Hu (online handle: U-Hang) has cultivated a distinctive "electronic shareholder" community of followers. Their steady stream of user-generated content amplifies brand influence and market expectations in tandem, building an important base of public opinion and users as digital life becomes physical.

In real-world scenarios, an offline collaboration with NetEase's Justice mobile game let the robot deliver a striking performance in a high-density interactive setting, quickly generating online buzz and widespread sharing. The Zhiyuan Spring Festival Gala singing performance further showcased refined, natural, and aesthetically compelling interaction in a real-world deployment. The Origin F1 robot has accumulated over 300 million plays, and "the entire internet writing scripts for U-Hang's robot" reached No. 1 on the trending list, once again drawing nationwide attention and validating its strong appeal in mass communication and user co-creation. More application cases to be revealed in 2026 will bring further "dreams" and "scenes" that once belonged to science fiction into reality. The physicalization of artificial intelligence will be the next transformative leap for the entire AI industry, converting intangible software systems into presences with emotional expression and genuine interactive capability.

AheadForm's mission is to push AI from "language and interaction" toward "authentic presence and long-term companionship" — to evolve cold digital code into the first form of emotional intelligence.

5Y Capital seeks out, backs, and inspires lonely entrepreneurs, supporting them in everything from morale to day-to-day operations.

We believe that when someone the world calls crazy starts to be believed in, the world becomes a different place.
