Understanding the Metaverse Through the Lens of AGI: A Spiritual Home for Human-Machine Symbiosis and Co-Creation | Gaorong "Future"

Gaorong Ventures · February 23, 2022

AI agents may well be among the first native inhabitants of the metaverse.

Companies across every industry are racing to stake their claim in the metaverse, hoping to build the lifestyles of tomorrow's digital world.

In early February, Gartner released a report predicting that by 2026, 25% of users worldwide will spend at least one hour a day in the metaverse for work, shopping, learning, socializing, or entertainment. It also forecast that 30% of enterprises worldwide will have launched products and services there by then. Gartner cautioned, however, that metaverse technologies remain nascent and highly fragmented. In the fifth installment of Gaorong Ventures' "Future" series, Yuan Quan, CEO of Qiyuan World, offers his perspective on understanding the metaverse through the lens of artificial general intelligence.

Powered by AGI technology, Qiyuan World has been steadily developing intelligent agent products: neural network-based entities capable of self-perception, autonomous learning, and decision-making, designed to inspire and accompany humans. In June 2020, Qiyuan World's StarCraft II AI agent defeated a champion-level Chinese professional player 2:0 in a human-versus-machine match. Yuan believes intelligent agents may become some of the first native inhabitants of the metaverse, coexisting and co-evolving with humans in the digital realm, with each sparking creativity in the other. The following is Yuan Quan's account:

Despite all the debate surrounding the metaverse, the concept itself is genuinely valuable: it is a crucial, forward-looking technology direction where the internet, AI, and immersive XR experiences converge. Of course, whether measured by technological maturity or the pace of industrial adoption, the metaverse remains relatively early. The internet, after all, took decades to go from emergence to mass adoption. But we believe that with the exponential progress of related technologies, the metaverse will deliver tangible experiences within the next 5-10 years.

Why do we say this? Look back at the timeline. In 2012, while at Alibaba, I worked on recommendation algorithms for mobile Taobao and mobile Tmall and witnessed the algorithmic upgrades that accompanied the transition from PC internet to mobile internet. The PC internet era arrived around 2000; the mobile era began in 2012. If twelve years marks one upgrade cycle for the internet, the next transition would fall around 2024, so today we have indeed reached the inflection point for evolving to the next generation of digital worlds.

In this wave, people around the world who are committed to exploring digital worlds are playing to their own strengths, approaching from different angles to build products, businesses, and ecosystems, like a hundred flowers blooming. Meta entered from social, Microsoft from enterprise, Roblox from gaming. We also need a deeper understanding of what "exponential technological progress" means. In recent years, we've already witnessed exponential advances in AGI. In cognitive decision-making, AlphaGo defeated Lee Sedol in 2016 at Go, a game of perfect information and a problem humanity once couldn't solve. Yet just five years later, looking back, this has become the simplest of decision problems.

In 2019, DeepMind was the first to defeat a professional champion in StarCraft II; half a year later, Qiyuan World's StarCraft AI also beat a Chinese champion. I still remember 2017, when we first began training AI in StarCraft II, starting from a single marine and a single zergling. Limited by the algorithms and compute of the time, it sometimes took an entire night just to learn one skill. Today, it takes merely hours or even minutes to see clear AI growth curves.

Metaverse-related technologies, hardware, and ecosystem building are also advancing exponentially. Global VR headset shipments have reached tens of millions. I believe in another 3-5 years, the VR experience will leap forward noticeably. We also look forward to exploring the metaverse — the most exhilarating endeavor of the next 5-10 years — through our research on AGI and intelligent agents.

Over the past few years, Qiyuan World has been elevating three capabilities of intelligent agents. In our first three years, our core technical focus was on maximizing IQ capability, which is why we chose StarCraft II — a competitive game — to train our AI, validating in the most complex strategic decision-making scenarios that intelligent agents could dramatically surpass human intelligence.

Beyond that, we want intelligent agents to convey emotion, interaction, and social connection to people effectively, which requires EQ capability. This is the problem our EQ engine is actively tackling. Currently, Qiyuan's intelligent agents can already engage in preliminary language interactions with humans, including dialogue and content generation. Third, we're enhancing the interpretability of human-agent interaction. Although deep learning remains relatively black-box, a growing number of methods and techniques can externalize and display an agent's learning processes and capabilities to some degree, which helps build trust in AI among the industries adopting intelligent agents.

As intelligent agents' IQ, EQ, and interpretability continue improving, they will become crucial components of the metaverse — even new species and native inhabitants of virtual worlds. When IQ surpasses human capability, they can assist with learning and training. On the EQ dimension, through computable means, they may care for people better than humans themselves, providing companionship and warmth — a core experience in the metaverse.

Beyond building beautiful digital experiences, the greater value lies in empowering the physical world through the metaverse, digital twins, and AI's ability to transfer between virtual and real. For example, migrating neural networks to home service robots, industrial robotic arms, or quadruped robots — whether accompanying elderly family members and children, improving manufacturing efficiency in industrial settings, or performing dangerous work humans cannot do — would dramatically enhance real-world efficiency and experience.

Over the past two years, Qiyuan World has used intelligent agents to control indoor autonomous robots in CVPR competitions, winning the 2021 challenge on robot visual navigation without localization. We've also recently partnered with industry players to train AI for decision-making and control in robotics or autonomous driving within digital twin environments.

Thus, the metaverse extends far beyond digital entertainment. It should be a spiritual home where humans and machines coexist and co-create, sparking and challenging each other. Speaking broadly, the most exciting prospect for the future metaverse is the organic collaboration between silicon-based life powered by AI and carbon-based life that is humanity — interacting and integrating, accompanying each other's growth, and pioneering entirely new learning and living experiences.

The metaverse is a systems engineering challenge, with three foundational layers to tackle.

Layer One: Metaverse Hardware Platform

Just as smartphones became the dominant hardware platform of the mobile internet era, the metaverse era will require its own portal hardware — whether VR, AR, or XR. Whichever category becomes mainstream, an entirely new hardware platform will serve as the gateway to the metaverse. 5G also provides assurance for transmission efficiency and quality on this platform.

Layer Two: Metaverse Operating System

What will the metaverse operating system look like? This is a profoundly valuable question that demands first-principles reasoning. It may differ from PC and mobile operating systems, which primarily handle resource and compute scheduling, by possessing the ability to create upper-layer applications itself. This requires three key capabilities. First, large-scale cloud-based infrastructure. Second, real-time rendering capability that creates immersion. Third, allowing users to generate enormously rich content through UGC or AIGC — this is where the metaverse will gain its vitality.

Layer Three: Metaverse-Native Applications

We've also been contemplating what metaverse-native applications will be. Looking back, portals and QQ were native to the PC internet era; WeChat, DiDi, and Meituan were native to mobile, combining mobile attributes with user needs. So on metaverse hardware and software platforms, what will be the most valuable native applications? This represents an enormous opportunity. Additionally, what should a Chinese-style metaverse look like? What should these native applications become when combined with AI and the joy of human-machine interaction? Take the search, recommendation, and advertising we used to work on: could all of it be rebuilt from scratch for the metaverse? The questions are endlessly fascinating.

Some ask how humans will navigate between virtual and physical worlds, living more coherently across both. The metaverse may lead us toward a state of virtual-physical symbiosis. Digital twins, for instance, synchronize critical signal data from the physical world to the virtual through numerous sensors, where efficient computation and decision-making feed back into reality. Wearing XR devices in the physical world overlays the virtual upon it. In the future, it may become difficult to distinguish virtual from real. Because the experiences are all real, fixating on the virtual-real distinction matters less than what you can experience and what value you can create.
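The digital twin loop described above (sensor signals flow from the physical world into a virtual model, and decisions computed there flow back) can be sketched in a few lines of Python. This is only an illustrative toy, not Qiyuan World's system: the names `SensorReading`, `DigitalTwin`, and `control_step`, the `temperature_c` signal, and the 80-degree threshold are all hypothetical, and a simple threshold rule stands in for whatever model or agent would make the real decision.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SensorReading:
    """One signal sampled from the physical asset (hypothetical schema)."""
    name: str
    value: float


class DigitalTwin:
    """Virtual mirror of a physical device; holds the latest synced signals."""

    def __init__(self) -> None:
        self.state: Dict[str, float] = {}

    def sync(self, readings: List[SensorReading]) -> None:
        # Physical -> virtual: copy the latest sensor values into the twin.
        for reading in readings:
            self.state[reading.name] = reading.value

    def decide(self, max_temp_c: float = 80.0) -> str:
        # Virtual-side computation: a toy threshold rule (an assumption)
        # stands in for a trained agent or simulation.
        if self.state.get("temperature_c", 0.0) > max_temp_c:
            return "slow_down"
        return "keep_running"


def control_step(twin: DigitalTwin, readings: List[SensorReading]) -> str:
    """One loop iteration: sync sensors, decide in the twin, and return
    the command that would be sent back to the physical device."""
    twin.sync(readings)
    return twin.decide()


if __name__ == "__main__":
    twin = DigitalTwin()
    print(control_step(twin, [SensorReading("temperature_c", 72.5)]))
    print(control_step(twin, [SensorReading("temperature_c", 91.0)]))
```

In a real deployment the readings would arrive continuously from many sensors and the returned command would be applied to the device; the point here is only the direction of data flow, from physical to virtual and back.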

As Elon Musk has said, perhaps humanity itself lives inside a vast, sophisticated computer simulation. Technology is advancing so rapidly that the boundary between virtual and physical worlds will soon blur. Of course, a good world — whether virtual or real — must be one where people can fully enjoy experiences, generate creativity and inspiration, and be stimulated to improve and grow.

We hope the metaverse and AGI can create a more beautiful, more creative environment for everyone, and we deeply believe in this. Moreover, whether in the metaverse or the physical world, there is much that is beautiful; everyone can find their own comfortable, wonderful corner.

As intelligent agents increasingly interact and collaborate with humans, numerous legal and ethical issues will arise, requiring policy and legislative frameworks to ensure the digital world advances in ways most beneficial to human society.

To indulge in one final speculation: currently, top researchers including DeepMind scientists are already studying AI consciousness. So within our lifetimes, we should see not only people accessing the metaverse through hardware devices, but potentially human thought and consciousness achieving digital immortality within it. The ultimate proposition of the future digital world is the symbiosis of AI and humanity — a more fundamental and enduring question for the decades ahead.