GTC Recap: When "Everything Can Be Tokenized," Who's Defining AI's Body and Memory | Yunqi Capital

Yunqi Capital · March 20, 2026

Let's talk about something beyond tokens.

At this year's GTC, one buzzword that broke through was "token."

Unlike past discussions that mostly stayed in the technical weeds, this time the token was placed in a much broader context: it is becoming the fundamental unit of measurement in the AI industrial system. In a sense, this also reflects how AI competition is shifting from "model capability" to "system capability."

And once capabilities can be quantified and scaled, the questions themselves begin to shift: AI is no longer just about "how good is the content it generates" — what matters more is "can it run stably in the real world."

So the threshold at this stage actually comes back to a few simple but critical dimensions: executability, reliability, and contextual capability.

At this year's GTC, Yunqi portfolio companies X Variable Robot, DeepRoute.ai, and Zilliz appeared in various forms, each landing on one of these three dimensions and providing more concrete footnotes to this shift. In this "Yunqi Capital" feature, we share their stories.


Executability

AI Starts Actually "Doing Things"

In his GTC opening keynote, Jensen Huang unveiled NVIDIA's AI ecosystem partner program. X Variable Robot, as the only company from China, appeared alongside Physical Intelligence, Figure, and Skild AI in the "AI for Robotics" segment.

This placement sits closer to the "model and capability definition" layer in embodied intelligence, rather than specific applications or system integration.

Being included in this ecosystem segment signals that X Variable's technical path and capabilities in embodied intelligence models have come into view of the global technology mainstream.

From a technical path perspective, X Variable represents a class of model frameworks with VLA (Vision-Language-Action) at their core, attempting to unify perception, understanding, decision-making, and execution within a single system.

**The goal of these models isn't complicated: to transform AI's output from "information" into "actions that can be executed."**

**In this sense, the importance of embodied intelligence lies not just in the robotic form itself but in the new capability boundary it opens: AI is beginning to gain executability in the physical world.**


Reliability

From "It Runs" to "You Can Trust It"

**If executability answers "can AI do things," then a harder question in the real world is: can these capabilities be trusted over the long term?**

As urban NOA (Navigate on Autopilot) penetration continues to rise, assisted driving is moving from "available" to "widespread," but users' actual willingness to use it hasn't kept pace.

This means the industry has moved past the "does the technology work" phase and entered one where the system must be stable enough to be relied upon.

Against this backdrop, at this year's GTC keynote, DeepRoute.ai CTO Tongyi Cao described how the company rebuilt its assisted driving system around a foundation model.

The key isn't just model scale (40B parameters), but also unifying multiple capabilities — understanding traffic scenarios, executing driving decisions, evaluating driving behavior — into the same model. In other words, the system is no longer just "making a move," but beginning to possess understanding and verification of its own decisions.

Another critical change happened in system iteration: the data closed-loop cycle shortened from roughly 5 days (about 120 hours) to 12 hours, a roughly tenfold reduction. This significantly boosted R&D efficiency.

Meanwhile, this system is already operating at scale. To date, DeepRoute.ai has mass-produced and delivered over 250,000 units, and is pushing toward the million-unit scale. These numbers themselves are proof of system reliability.


Contextual Capability

Where Is the Ceiling of AI Systems

**At this year's GTC, when showing the data infrastructure landscape, Jensen Huang said: "Unstructured Data is the Context of AI."**

Unstructured data includes text, images, video, logs, and sensor signals. It is no longer just a stored historical asset; it is becoming the core context in which AI systems operate.

But here's the problem: this data has long existed, yet has been difficult to use effectively. It can't be indexed, is hard to retrieve, and struggles to participate in real-time decision-making.

This is also why Milvus (Zilliz), representing vector databases, holds a more critical position on this landscape.

**At its core, it solves one thing: making unstructured data "usable."**

From this perspective, the importance of vector databases isn't just "faster search." They change how data participates in AI, turning it from static storage into context that can be called in real time.
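To make the idea concrete, here is a minimal sketch of what "retrievable" means for unstructured data: each item is represented by an embedding vector, and a query is answered by ranking items by similarity. This is an illustration only, not Milvus's API or implementation. The item names and the 3-dimensional vectors are hypothetical, and a production vector database would use learned embeddings with hundreds of dimensions and approximate indexes rather than a brute-force scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Return the ids of the k items whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query, vec), item_id)
              for item_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [item_id for _, item_id in scored[:k]]


# Hypothetical embeddings for four pieces of unstructured data.
INDEX = {
    "dashcam_clip_17": [0.9, 0.1, 0.0],
    "maintenance_log_04": [0.1, 0.9, 0.1],
    "sensor_trace_88": [0.8, 0.2, 0.1],
    "support_ticket_52": [0.0, 0.2, 0.9],
}

# A query embedding that points in roughly the same direction as the
# dashcam clip and the sensor trace.
print(top_k([1.0, 0.0, 0.0], INDEX, k=2))
```

The brute-force scan compares the query against every stored vector, which is fine for four items but not for billions. That gap is the scale, cost, and latency pressure that dedicated vector infrastructure is meant to absorb.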

But as AI applications shift from single calls to continuous operation, these problems are also escalating: data scale is rapidly expanding, cost and latency are becoming bottlenecks, and online services and offline optimization are increasingly fragmented.

**In this context, Zilliz proposes AI Lakebase as an infrastructure upgrade direction for unstructured data workloads.** It attempts to solve a more fundamental problem: integrating storage, retrieval, and computation into a unified foundation, without moving the data.

From embodied models, to intelligent driving systems, to data infrastructure, AI is moving toward a more complete system form. More innovation is happening.