
Which Path Leads to the "World Model" Endgame? | A Conversation with Biwei Huang, Founder of Aether AI
June 18, 2026
🚥 Someone is always pushing toward the stars.
World models are one of the hottest keywords in AI and embodied intelligence for 2026. But the more popular a term becomes, the more it's overused: video generation, 3D generation, JEPA, VLA, WAM — all seem to get labeled "world models." Yet when we actually talk about "world models," what exactly are we talking about?
This week on Crossing, we're joined by Professor Biwei Huang, founder and CEO of Aether AI. From the Max Planck Institute in Germany to CMU and then UCSD, she has consistently focused on causal discovery and causal AI, making her one of the field's important academic contributors.
Aether AI recently closed a $20 million angel round in short order. Huang is bringing causal AI, long considered "more principled, but harder to implement," into the problems of Physical AI and robot brains, building what she calls a "causal world model."
Biwei and I discussed the definitional boundaries of world models, why video generation doesn't equal world models, why VLA hits walls on real-world tasks, why WAM may only be an intermediate state, and the three core problems that "causal world models" aim to solve.
This is also an episode about the choice to start a company. Why would a UCSD professor decide to jump into entrepreneurship in early 2025? What signals did she see?
If you're following world models, embodied intelligence, robotics, causal AI, or thinking about where the next AI paradigm will emerge, this episode might help you recalibrate one question: which path actually leads to the endgame for "world models"?
🎬 Our video podcast is now live on @Koji Yang Yuancheng's WeChat Channels, Douyin, Xiaohongshu, Bilibili, YouTube, and other platforms.
📒 The transcript will be published on the @Crossing WeChat official account.
🟢 🔴 🟡
🟢 00:43 Lightning Round: Academic background, MBTI and zodiac sign, one-sentence intro to Aether AI and its product, funding details, pre-startup experience
🟢 02:13 Three Paths to World Models, and the Fourth That Nobody Mentions
Video generation, 3D generation, JEPA — all being called world models. But when we say "world models," what are we actually talking about?
World models — a term that sounds grand yet gets used imprecisely. Is it a serious technical object, or just a buzzword being thrown around?
"Not those three paths. The fourth one — the one we're building."
🟢 04:49 What Makes a Causal World Model Different
A true world model needs to learn three things simultaneously in latent space — which three?
It's all AI, so why can LLMs be logically rigorous without understanding causality, while world models supposedly can't do without it?
Why did LLMs succeed so spectacularly only in natural language and coding?
🟢 10:33 The First Version of a Causal AI World Model
On a 10-point scale, VLA tops out at 5 and WAM at 6.5. What score does she dare give the causal path?
To train the first version: how many hours of data, how many GPUs?
Four data types in the mix — simulation, ego-centric, video, teleoperation. Which one takes 80%, which gets only 20%?
WAM is just an "intermediate state" — better than VLA, but why is it destined to fall short of the finish line?
🟢 16:17 The "Three Kingdoms" of Causality
Three schools, three titans in their eighties, who reportedly "didn't acknowledge each other" in their younger days — what kind of intellectual world was this?
Where exactly did Turing Award winner Judea Pearl and Harvard's Donald Rubin disagree?
Having trained at CMU, where does Biwei Huang position herself among these three factions?
🟢 22:32 A Brief History of Causality
From Aristotle and the I Ching, to clinical double-blind trials, to an algorithm from three professors at CMU in the late 1980s.
Experiments are expensive, and often impossible to run. So scientists turned to "observational data only" — how did this path become viable?
Biwei Huang's central contribution over the years: doing causality in an "imperfect world" full of latent variables, bias, and missing values. Why is this actually the hardest part?
🟢 20:35 Causality and Large Models
In the past, causality helped LLMs in only two ways — internal and external. What do they look like?
Have OpenAI, Anthropic, Google actually shipped causality into their products?
Two entrepreneurial choices lay before her.
🟢 41:08 Is a PhD Still Worth It?
Join OpenAI for $30 million a year: should you still settle down and do research at that point?
What kind of person should do a PhD?
How do you tell if your desire for research is "real," or if you just want the degree?
The wall between industry and academia is getting lower — why is this good news for people who are torn?
🟢 47:17 Looking Back from Five Years in the Future: What's Wrong Today
If she could ask an omniscient God one question, it would be whether "causality actually exists."
VLA isn't the endgame, WAM isn't either — but what does each leave behind?
If causality doesn't exist, "we'll collectively plunge into an existential crisis of gigantic proportions."
Subscribe to Crossing: 🚦 We track the industry shifts and new entrepreneurial opportunities brought by the new wave of AI technology.
🚦 Crossing is Steve Jobs' metaphor for Apple: standing at the intersection of technology and liberal arts, where great products are born. AI is transforming every industry. We seek out, interview, and bring together a new generation of AI entrepreneurs and active builders in the AI era. Together with them, we explore and embrace the changes and the new possibilities.
👦🏻 Host Koji: I founded Crossing, started AI Hacker House — a community space for the new generation of AI entrepreneurs — and serve as Venture Partner at ZhenFund. I believe technology, especially AI, represents the greatest value-creation opportunity of our generation. Koji on Jike, Koji's website