"X Variable Robotics" Raises 1 Billion Yuan in A++ Round | Yunqi Capital

Yunqi Capital · January 12, 2026

Building Foundation Models for the Physical World

Today (January 12), Yunqi Capital portfolio company X Variable Robotics announced the completion of an RMB 1 billion A++ round. The round was jointly led by ByteDance, HSG, Beijing Information Industry Development Fund, Shenzhen Capital Group, Nanshan Strategic Emerging Industry Investment, Xi Chuang Tou, and other institutional and regional investors. Yunqi Capital has backed X Variable since its Pre-A round.

This financing round once again demonstrates the industry's broad recognition of X Variable. At the same time, the company is making tangible progress in both embodied intelligence foundation model R&D and commercial deployment. In this edition of Yunqi Capital, we bring you the details.

The following is adapted from "X Variable Robotics"

The round is also reportedly the first investment made by Shenzhen Capital Group's AI fund since its establishment. Notably, in addition to ByteDance, X Variable has previously received investments from Meituan and Alibaba, making it the only embodied intelligence company in China backed by all three of these internet giants.

This coordinated bet by cross-domain investors underscores both a market consensus on the importance of embodied foundation models and recognition of X Variable's technical leadership and growth potential.

Building a Foundation Model for the Physical World

Making Robots That Can Actually Work

Over the past two years, embodied intelligence has continued to capture market attention. The "body" — robot movement and control capabilities — has advanced significantly. The competitive focus has shifted from "limbs" to "brain." The key breakthrough lies in building an intelligent "brain" that can understand the physical world, manipulate objects, and flexibly adapt to complex, dynamic scenarios — enabling robots to truly perform diverse real-world tasks.

An embodied intelligence foundation model is a physical-world foundation model that operates independently of and in parallel to virtual-world foundation models such as large language models and multimodal models. The core of a foundation model lies in breaking through bottlenecks in generalization and versatility. The complexity of physical reality demands that robots process unstructured, dynamic, and stochastic tasks in real time. X Variable's embodied foundation model takes all robot sensory information — such as vision, touch, and voice — as input and directly outputs robot actions, vision, and language.

Qian Wang, founder and CEO of X Variable Robotics, stated: "The next stage of competition in embodied intelligence is fundamentally about foundation models built on data closed loops and the models' capacity for self-improvement." In line with this thesis, companies worldwide are accelerating investment across data, models, and compute to advance embodied intelligence.

Deep Fusion of VLA and World Models

Real-Robot Reinforcement Learning for Autonomous Evolution

X Variable's self-developed WALL-A model features a pioneering architecture that deeply integrates VLA (Vision-Language-Action) with world models. As a native multimodal input-output architecture, WALL-A is the first to achieve embodied multimodal chain-of-thought reasoning. It uses world model mechanisms to predict spatiotemporal states, combines them with visual causal reasoning to understand environmental feedback, and internalizes physical commonsense from data through a learnable memory mechanism.

This fusion mechanism significantly enhances robots' zero-shot generalization capabilities when performing mobile manipulation tasks in unstructured environments.

Meanwhile, through large-scale real-robot reinforcement learning, the foundation model further acquires high-quality learning experience through interaction with the real physical world, autonomously solving long-tail problems and enabling continuous capability evolution.

Through a fully end-to-end technical approach, X Variable has built a closed technical loop that runs from the physical-world foundation model to autonomous evolution on real robots.

High-Quality Real-Robot Data

Building the Model Evolution Engine

Data is the core fuel for foundation model evolution. Since its founding, X Variable has invested heavily in data, adhering to a closed-loop iteration of hardware, data, and models.

As one of the earliest companies in China to scale real-robot data collection, X Variable has developed multiple data collection devices in-house, including master-slave teleoperation rigs, exoskeletons, and body-free systems, and has achieved data validation and model breakthroughs across all device types.

The company has also built a model-driven data pipeline that continuously produces scaled, high-quality data through data generation, filtering, augmentation, and annotation.

X Variable insists on using its foundation model to provide feedback to every stage of data processing and hardware design, iterating toward higher-quality data and more efficient collection devices, thereby further improving foundation model performance.

Model Iteration Driving Capability Leaps

Autonomously Completing Tasks in the Real World

The continuous evolution of models has given X Variable's robots remarkable adaptability in real-world scenarios.

X Variable's robot is the first example globally of a physical-world foundation model-based robot successfully operating across both outdoor and indoor environments. In food delivery and cardboard recycling tasks, when facing strong wind interference or visual occlusion, it relies on the foundation model's generalization capabilities and the world model's causal reasoning. It can not only mentally reconstruct the full shape of occluded objects, as a human would, but also autonomously correct errors through reinforcement learning strategies when stuck, closing the task loop without human intervention.

This adaptability also manifests in complex, challenging real-world logistics scenarios. Facing chaotically stacked packages, the robot identifies irregularly shaped items through the foundation model's zero-shot generalization capabilities and quickly adapts to working rhythms using reinforcement learning.

Notably, the evolution of X Variable's foundation model has also unlocked the potential of high-DOF dexterous hands: the robot has autonomously mastered human-like skills such as in-hand reorientation, ranging from using tools to dealing cards, an extremely fine motor task that demands precise fingertip force control. In doing so, it has conquered the "final centimeter" of fine manipulation in embodied intelligence.

While continuing to push technical boundaries and explore the frontier, X Variable also open-sourced its self-developed end-to-end embodied foundation model, WALL-OSS, in September 2025, promoting open access to embodied intelligence technology.

From Full-Stack Self-Development to Multi-Industry Deployment

Unlocking the Key Path to Model-Driven Commercialization

X Variable adheres to full-stack self-development of both software and hardware. Starting from the requirements of its model algorithms and data, it has defined the robot hardware architecture in depth and designed and released two high-performance robot bodies, "Quantum One" and "Quantum Two." It has also developed its core components in-house, including robotic arms, joint modules, power drivers, and main controllers, with deep algorithmic adaptation. This has driven substantial reductions in overall system cost, laying a solid foundation for scaled production and commercial deployment of embodied intelligence robots.

Currently, X Variable has gradually entered multiple high-value sectors including industrial manufacturing, logistics, and elder care. Cross-industry applications demonstrate that X Variable's robots are precisely meeting real market demands with high generalization and low-cost deployment capabilities.

Qian Wang has stated on multiple occasions that in the frontier track of embodied intelligence, one should strive to be a leader, not a follower. X Variable continues to deepen its work across three core domains — model iteration, data pipelines, and robot hardware — and through solid technical accumulation and full-stack self-development capabilities, constantly pushes beyond existing capability boundaries.

Going forward, X Variable will continue to use leading model capabilities as its lever to unlock the deeper forces of embodied transformation, fully releasing the technical value of embodied intelligence in industrial applications, driving scaled application of model-driven embodied intelligence across thousands of industries, and injecting new momentum into industrial upgrading and productivity leaps.