From 3D in Life, to Life in 3D: Erasing the Boundary Between Reality and the Virtual | 5Y View
What makes 3D so captivating?



Yu Cheng
Partner, 5Y Capital
Today's recommended read comes from XVerse, on the trends and future of 3D content. Thanks to its immersive, interactive, efficient, and emotionally resonant qualities, 3D has already been widely explored and applied in our daily lives. Whether it's 3D in life or life in 3D — where does the appeal of 3D come from? This piece from XVerse offers a compelling perspective.
Once upon a time, 3D existed within life. From 3D design software and 3D games on computers to 3D renderings, and from Avatar and Up on screens a few years back to the recently released Free Guy — these were all early examples of 3D technology applied to everyday life.
Now, life exists within 3D. During the pandemic, top AI conferences like ICML, ICLR, and CVPR all moved online. One PhD student from Florida International University's computer science department decided to host the inaugural ACAI conference inside the game Animal Crossing. This 3D AI conference not only accepted papers across multiple topics but also gave each speaker 15 minutes for their presentation and 5 minutes for Q&A — remarkably true to form.
The only drawback was the in-game player limit: a maximum of five audience members, with everyone else watching via Zoom livestream. After their talks, speakers had to vacate their spots and retreat to other 3D islands for tea breaks.


I. 3D Everywhere
Across more and more industries, 3D is gradually becoming a foundational general-purpose technology.
In primary and secondary industries — from agriculture and manufacturing to construction — information exchange and sharing platforms built around 3D simulation models have emerged, offering advantages in visualization, parametric design, interactivity, and full lifecycle management. Everything from a plot of winter wheat to a factory floor, a skyscraper, or an entire city can be digitized and virtualized, with real-time status visualization and intelligent, collaborative management decision-making.
Agriculture: Researchers at the Chinese Academy of Agricultural Sciences developed a 3D system based on winter wheat that can precisely simulate the growth processes and morphological structures of major field crops like wheat, corn, and rice under various conditions, enabling data analysis and visualization of yield and cultivar adaptability.

Manufacturing: Gree Electric's K3106 assembly line in Zhuhai achieved full-process digitization from product design and production planning to manufacturing execution through 3D simulation.

Smart Cities: "Digital twin" technology has been written into China's 14th Five-Year Plan and 2035 long-range objectives, supporting the development of Digital China.
Construction: Building Information Modeling (BIM) enables rapid collaboration among designers, engineers, and construction managers, reducing rework and improving construction deployment.
In tertiary industries such as retail, culture, education, and entertainment, 3D's inherent strengths in lifelike virtual presentation, real-time interaction, lean efficiency, and emotional resonance let it both faithfully represent reality and transcend it, which is why it has been so widely explored and applied.

Luxury Retail: Dior used 3D technology to virtually recreate its newly opened flagship store on Paris's Champs-Élysées, allowing customers to browse 360 degrees from home and experience the seasonal Parisian aesthetic.

Balenciaga moved its Fall 2021 fashion show to a browser-based 3D environment, where photorealistic, atmospheric cities, mountains, and wastelands became an infinite runway. For its Spring/Summer 2022 show, the brand used planar tracking, rotoscoping, and 3D modeling to "clone" supermodel Eliza Douglas's face onto multiple models, giving each distinct hair and makeup look the same apocalyptic expression. Gucci has also launched virtual experience stores and virtual bags; its virtual sneakers sold for just $12.99 and could be matched with users' photos and videos via AR.
Film: Avatar, which set the benchmark for 3D visual effects in 2010, will release its sequel in 2022 using new glasses-free 3D technology to present an extraordinary underwater world — director James Cameron wants audiences to remove their 3D glasses.

Entertainment: Major artists from Travis Scott to Ariana Grande have held large-scale virtual concerts in Fortnite, allowing audiences to transcend time and space, create their own avatars, enjoy unrestricted viewing choices, "meet" their idols up close remotely, experience ultimate audiovisual effects, and enter music worlds beyond imagination.

Gaming: The demo video for Black Myth: Wukong showcased exceptional 3D graphics, including lighting effects (dappled leaves and shadows, puddle reflections and glints), fluid simulation (heavenly palace mist and ground-level water flow), detailed environments (leaf veins and moss textures, carved window frames), and character modeling (hair, wrinkles, pores) — all rendered with remarkable naturalism, raising expectations for the future of Chinese AAA titles.
II. The Unstoppable Rise of 3D
Whether it is the virtual entering reality (3D in life), reality entering the virtual (life in 3D), or the boundary between the two dissolving altogether, where does 3D's appeal come from?
Largely from 3D's inherent status as the most powerful tool for simulating worlds.
This manifests in three ways:
First, spatial simulation. Adding depth to two-dimensional planes enables three-dimensional, all-around display and interaction of objects, characters, and scenes with effects resembling the real world. Furthermore, complex motion can be layered into space to create dazzling results — like the opening car chase in Free Guy, simultaneously photorealistic and viscerally dynamic, effectively amplifying content's expressive power and impact.

Second, character simulation — the high-fidelity replication of human body movements and facial expressions — something impossible to achieve on a 2D plane.
Third, physical property simulation, capable of modeling force feedback, collisions, explosions, fabric flutter, liquid flow, and other physical characteristics to lend actions authenticity. 3D rendering can also accurately simulate lighting effects, producing natural illumination and delicate textures.
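As a rough illustration of the physical property simulation described above, the sketch below drops a ball onto a floor and lets it bounce. It is a minimal example under stated assumptions, not how any production engine works: the function name `simulate_bounce`, the restitution value of 0.8, and the fixed 1 ms time step are all hypothetical choices made for illustration.

```python
from typing import List


def simulate_bounce(
    height: float,
    restitution: float = 0.8,   # assumed fraction of speed kept after each impact
    dt: float = 0.001,          # assumed fixed time step, in seconds
    duration: float = 3.0,
    gravity: float = 9.81,
) -> List[float]:
    """Return the ball's height at every step, using semi-implicit Euler integration."""
    y, v = height, 0.0
    trajectory = [y]
    for _ in range(int(duration / dt)):
        v -= gravity * dt        # gravity changes the velocity first
        y += v * dt              # then the updated velocity moves the ball
        if y < 0.0:              # collision with the floor at y = 0
            y = 0.0
            v = -v * restitution # reverse direction and lose some speed
        trajectory.append(y)
    return trajectory


if __name__ == "__main__":
    heights = simulate_bounce(1.0)
    first_impact = heights.index(0.0)
    print("steps until first impact:", first_impact)
    print("highest point after first bounce:", max(heights[first_impact:]))
```

Real engines layer far more on top of this loop (rigid bodies, cloth, fluids, friction), but the pattern of advancing a state step by step and resolving collisions is the common core.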
These three capabilities make 3D's rise as the next upgrade in how information is carried all but inevitable.
Looking back at history, information has evolved through three stages: oral, written, and digital. In the oral era, human communication was confined to face-to-face scenarios. The written era allowed information to transcend time and space, extending human language but losing sound and vision — two biological components of communication.
In the digital era, the internet became the driver of continuous information upgrades: from one-dimensional telegrams and email, to two-dimensional images and sound, then to 2.5-dimensional video combining images, sound, and time sequences. Now, a 3D information era integrating text, vision, and sound while breaking constraints of time, space, and even physical rules is poised to emerge.

But do we have the capacity to produce massive amounts of high-quality 3D content?
Traditional 3D film production has two main methods. One is dual-camera live shooting, but its applicability is narrow and costs are high. The narrow applicability stems from the inability of dual-camera setups to handle distant views, focus pulls, and strong lighting during shooting, imposing numerous restrictions on subject matter and camera work.
The other is shooting footage first, then converting to 3D format. This technology is now quite mature, but problems remain. A two-hour film requires extensive manual labor for frame-by-frame conversion, with the "image segmentation — stereo drawing — background inpainting — compositing and rendering" pipeline taking months. Domestic theatrical conversion costs are substantial — for reference, the 2012 re-release of Titanic cost $18 million to convert.

Traditional 3D modeling has three production methods. One uses 3D editing software, where basic geometric elements are manipulated through a series of geometric operations to build complex scenes — mainly used for virtual environment construction and 3D model reprocessing.
Second is instrument-based measurement modeling, primarily using 3D scanners to capture real objects and convert their three-dimensional shape and color information into digital signals, ultimately outputting digital model files containing 3D spatial coordinates and color data for each sampled surface point, ready for secondary processing.
Third is image- or video-based modeling, which recovers 3D geometric structure from 2D inputs: multiple photos taken from different angles are automatically matched, decomposed, and stitched to reconstruct the 3D structure, including mesh and texture information.
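The geometric idea behind image-based modeling can be sketched with the simplest possible case: two photos of the same point taken by side-by-side cameras. The example below is a hedged illustration, not a description of any particular product's pipeline. It assumes two identical, already-rectified pinhole cameras with a known focal length and baseline, and it assumes the matching step has already paired the pixels. The names `project` and `triangulate` are hypothetical.

```python
from typing import Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


def project(point: Point3D, focal: float, cam_x: float) -> Point2D:
    """Pinhole projection for a camera at (cam_x, 0, 0) looking along +Z."""
    x, y, z = point
    return (focal * (x - cam_x) / z, focal * y / z)


def triangulate(left: Point2D, right: Point2D, focal: float, baseline: float) -> Point3D:
    """Recover a 3D point from its matched pixels in two rectified views.

    The left camera sits at the origin and the right camera at (baseline, 0, 0).
    """
    disparity = left[0] - right[0]          # horizontal shift between the views
    if disparity <= 0:
        raise ValueError("matched points must have positive disparity")
    z = focal * baseline / disparity        # nearer points shift more
    return (left[0] * z / focal, left[1] * z / focal, z)


if __name__ == "__main__":
    focal, baseline = 800.0, 0.1            # assumed values: pixels and meters
    true_point = (0.5, -0.2, 4.0)
    left_px = project(true_point, focal, 0.0)
    right_px = project(true_point, focal, baseline)
    print("recovered:", triangulate(left_px, right_px, focal, baseline))
```

Production reconstruction repeats this over many thousands of matched points and many camera positions, and must also estimate the camera poses themselves, which is where much of the cost and time goes.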
As we can see, 3D content has accumulated relatively mature production methods, but pain points remain: high hardware costs, long modeling cycles, and inability to efficiently achieve large-scale mass production.
III. The Future of 3D Content: Large-Scale Worlds, Immersion
Looking at history again, internet development has always been information-centric, threading through the entire process of information production, transmission, and consumption to drive efficiency gains, cost reduction, and capacity expansion. Predictably, the next-generation internet in the 3D era will continue to efficiently deliver high-quality information and push 3D toward mass adoption.
Judging by this developmental pattern, we believe future 3D content will trend toward large-scale worlds and immersion, requiring continuous evolution of multiple underlying technologies to meet future demands for intelligence and high efficiency.
(1) Large-Scale Worlds
This manifests at the level of virtual world spatial construction — simulating the vastness, complexity, and diversity of the world through expansive maps, large numbers of characters, and rich detail.
This involves multiple technical links in the pipeline, including: 1) art-side environment, object, and character presentation, from theme selection and scenario design to detail refinement; 2) using procedural generation to reduce file sizes, expand content volume, and enhance randomness, enabling large-scale, high-efficiency production of natural environments, residences, buildings, and factories conforming to appropriate era aesthetics, planning, and rules; 3) rendering and storage of massive model assets; 4) image codec technology integrated with graphics engines to improve rendering quality and fully utilize bandwidth; and 5) 5G and cloud computing to enable mass content distribution.
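To make the procedural generation point concrete, here is a minimal sketch of how a single small seed can stand in for a much larger asset. The source does not say which procedural technique is meant, so this example substitutes a simple one: seeded value noise with bilinear interpolation, used to build a terrain heightmap. The function name `generate_heightmap` and its parameters are hypothetical.

```python
import random
from typing import List


def generate_heightmap(seed: int, size: int = 64, cell: int = 8) -> List[List[float]]:
    """Build a size x size heightmap by interpolating seeded random lattice values."""
    rng = random.Random(seed)               # same seed always yields the same terrain
    n = size // cell + 2                    # +2 so the last cell has right/bottom neighbors
    lattice = [[rng.random() for _ in range(n)] for _ in range(n)]

    heights: List[List[float]] = []
    for y in range(size):
        gy, ry = divmod(y, cell)            # lattice row and offset inside the cell
        ty = ry / cell
        row: List[float] = []
        for x in range(size):
            gx, rx = divmod(x, cell)
            tx = rx / cell
            # Blend the four surrounding lattice values (bilinear interpolation).
            top = lattice[gy][gx] * (1 - tx) + lattice[gy][gx + 1] * tx
            bottom = lattice[gy + 1][gx] * (1 - tx) + lattice[gy + 1][gx + 1] * tx
            row.append(top * (1 - ty) + bottom * ty)
        heights.append(row)
    return heights


if __name__ == "__main__":
    terrain = generate_heightmap(seed=2022)
    print("cells generated:", len(terrain) * len(terrain[0]))
    print("same seed reproduces the map:", terrain == generate_heightmap(seed=2022))
```

Only the seed needs to be stored or transmitted; the 4,096 height values are regenerated on demand, and changing the seed yields a different map. That is the sense in which procedural generation shrinks files while expanding content volume and variety.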

(2) Immersion
Flow, the classic book on the psychology of experience, describes immersion as the state in which people feel pleasure and satisfaction while absorbed in a goal-directed situation (here, one created by designers) and forget the real world around them.
It encompasses both sensory and cognitive experience. An amusement park's sense of "being there," or KTV lighting that blurs day and night, is sensory stimulation. The intense concentration of a game of Go, which arises when a player's skill matches the challenge, is cognitive experience. The technical requirements include:
1. Photorealistic visuals. Traditional terminal rendering, limited by personal computer GPU capabilities, still falls far short of photorealistic pixel precision. Thus we need to focus on upgrades to rendering modes driven by engine, algorithm, and compute improvements.
2. Intelligent NPCs that can spontaneously generate ample, natural-feeling content. In traditional game worlds, NPCs (non-player characters) are generally "puppets" running predetermined programs. Their dialogue and behavior cannot escape the scripted framework, so they come across as rigid and lifeless and easily break participants' immersion. Reinforcement learning and natural language processing algorithms, however, can create intelligent digital humans or NPCs with their own personalities, backstories, and logic. These can interact spontaneously with people, environments, and objects in virtual worlds, producing enough high-quality content to draw participants deep inside.
3. Interaction methods. Previously, participants were confined to vertical screen displays and hardware control tools like keyboards, mice, and controllers. The dimensional upgrade from 2D to 3D allows participants to freely switch perspectives or experience next-generation human-computer interaction based on VR and AR, with more natural, intuitive operations and high-frequency intelligent feedback mechanisms that comprehensively enhance immersion.
4. Real-time performance. High-definition, smooth visuals and immediate, natural interaction are prerequisites for immersion, yet these scenarios involve enormous data volumes, massive computational requirements, and a need for instant feedback. Under high concurrency, traditional computing approaches strain origin servers and bandwidth and run into storage and latency problems. We must therefore keep tracking progress in communications infrastructure, cloud computing, and edge computing.
Of course, building on high-quality 3D content, two questions will remain directions for sustained industry effort: how to keep improving PGC production efficiency and UGC creative enthusiasm through intelligent, efficient methods so that a sustainable content creation ecosystem forms, and how to use the widespread adoption of underlying technologies to substantially enhance user experience and deepen applications.
With the turn of the year, all things renew. A future of infinite possibilities is arriving, and we hope everyone can freely "Redefine Your World" within it.




5Y Capital (formerly Morningside Venture Capital) currently manages approximately RMB 32 billion in dual-currency USD and RMB funds. 5Y Capital seeks out, supports, and inspires lone entrepreneurs, providing support from the spiritual to all operational aspects. We believe that if the "crazy you" in others' eyes begins to be believed in, the world will become refreshingly different.
BEIJING · SHANGHAI · SHENZHEN · HONG KONG

