After Running America's First Fully Driverless Truck Delivery, He Says He Wants to Claim the Right to Define Autonomous Driving | Linear Voice

Linear Capital · May 13, 2026

The real battleground for L4 is "no excuses."

Fully driverless Class 8 trucks — in a sense, Xiaodi Hou has pulled ahead of Elon Musk. One day before Musk announced the production version of the Semi was rolling off the line, Hou led Linear Capital angel-round portfolio company Bot Auto to complete America's first fully autonomous commercial trucking delivery — no safety driver, no remote teleoperation backup.

What we're sharing today is Hou's recent interview with Autoweek. In his view, too many companies are still building autonomous vehicles as tech products rather than industrial products.

In autonomous driving and even physical AI, what matters is whether unit economics work. Once they do, can you scale them? Only with scale can a truly great company emerge.

In today's autonomous driving narrative, every founder seems to be fighting for the right to define success.

On April 30, the day before Musk announced the production Semi, Hou, who like Musk has rooted his autonomous trucking operation in Texas, led Bot Auto to complete America's first fully driverless commercial trucking delivery. No safety driver in the cab. No remote human operator providing real-time backup. A complete, real-world freight flow from loading to transport to unloading.

But this time, Hou doesn't want to be merely the person who moved faster than Musk. He wants a chance to recalibrate the standards for what autonomous driving success actually means.

In Hou's view, the autonomous driving industry has spent the past decade operating without unified standards, with each company defining success on its own terms: measuring commercialization by OEM partnership count, equating scale with fleet size, touting technical prowess through cherry-picked metrics like MPI, and attracting capital through novel concepts.

His core thesis: Any autonomous driving company that fails to reduce per-mile operating costs is running a scam — including ourselves.

This is Hou's first interview since leading Bot Auto to its fully driverless milestone in the US. Below is the full conversation between Autoweek and Xiaodi Hou.

Q: Congratulations on achieving America's first fully driverless commercial trucking delivery in just 24 months, even beating Elon Musk, who announced the production Semi on April 30. What were you trying to prove to the industry and the market with this milestone?

Hou Xiaodi: For me, it's not really about proving anything to the world. That would be too cringeworthy — I'm in my forties. This was simply a necessary step on our path to ultimate success.

If I do have any selfish motive, it's that I want to use this opportunity to establish some definitions for the autonomous driving industry.

Right now, a lot of people's logic is strange: if you have a lot of vehicles, that's commercial operations; if you start charging fees, that's also commercial operations. Basically every company is defining commercialization by its own rules.

And precisely because of this, because we're in a stage without unified standards, the industry's marketing has become extremely chaotic. Everyone keeps throwing out sensational news. In the short term, this certainly brings PR and market advantages, but in the long run, it will backfire on them and damage the entire industry.

Q: What exactly do you mean by this chaos? Almost every L4 company is claiming they've entered the commercialization phase.

Hou Xiaodi: For example, some people say that only having lots of OEM partnerships counts as success, as commercialization. I say that's completely wrong. How many miles have you actually operated? How much money have you lost? Are you going to drag OEMs down with you?

Customer operations also vary in quality. If the customer is losing money and cancels the contract in a couple of days, that's your failure too. Conversely, if our own operations are profitable, that's success. Why must success be forcibly defined as customer operations?

There are also companies that say buying 200 vehicles and deploying a dozen routes equals scale.

So if there's one purpose to achieving this first delivery, it's to clearly define what real success actually looks like.

A reasonable definition should incorporate all information, all sub-items, leaving you no excuses. The simplest approach is to calculate all costs related to vehicle mileage accumulation, regardless of scenario or task, and divide by total vehicle miles.

This result is essentially cost per mile (CPM). I believe this is a relatively honest metric that's also harder to dress up.

Q: How is your fundraising going?

Hou Xiaodi: Startups are always fundraising, but over the years, there's been no shortage of rumors about us.

Some people said we were going to sell to Amazon. Others said we were going to acquire another company. There were even stories about me and our investors. Most of these rumors had no real source — they just mutated as they spread.

We're certainly always in funding conversations, and our capital chain has never broken. Things have kept moving forward.

Q: What's your ultimate definition of true commercialization for autonomous trucks?

Hou Xiaodi: I think autonomous driving companies — or more broadly, physical AI companies — really need to focus on just one question: Can our unit economics work? And once they do, can we scale them further?

You can't just talk about how impressive your technology is. What matters is when you can close the business loop.

So I think all the most complex and hyped topics in physical AI today boil down to something very simple in the end: can you prove you're making money, not losing it?

Q: You've publicly cited a $0.64 per mile cost figure. What's the exact calculation? What costs are included and what aren't?

Hou Xiaodi: By our internal full-scope calculation, last year we were still at over $3 per mile overall. This includes driver costs, autonomous driving kits, the vehicles themselves — all costs related to operations. Because the vast majority of miles last year were still driven with someone in the vehicle, driver costs absolutely have to be counted.

Our core logic is simple: if an expenditure scales proportionally with fleet mileage, it should be included.

For example, storage costs — higher mileage means more data, which naturally increases costs, so that's counted. But something like deep learning model training: once trained, deploying to 1 vehicle or 100 vehicles doesn't cause linear growth, so it doesn't belong in the mileage-amortized costs.

So what we end up calculating is all costs strongly correlated with fleet mileage, summed to get CPM.
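The inclusion rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `CostItem` type, the `scales_with_mileage` flag, and every dollar and mileage figure are hypothetical, chosen to show the arithmetic, and are not Bot Auto's data or internal method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostItem:
    name: str
    dollars: float
    # Hypothetical flag: True if the expense grows roughly in
    # proportion to fleet mileage, False if it does not.
    scales_with_mileage: bool


def cost_per_mile(items, total_miles):
    """Sum every mileage-scaled cost and divide by total fleet miles."""
    if total_miles <= 0:
        raise ValueError("total_miles must be positive")
    included = sum(i.dollars for i in items if i.scales_with_mileage)
    return included / total_miles


# Illustrative numbers only, not real figures.
items = [
    CostItem("driver wages", 120_000.0, True),
    CostItem("vehicle lease", 60_000.0, True),
    CostItem("autonomy kit", 30_000.0, True),
    CostItem("data storage", 10_000.0, True),
    # Training cost does not grow linearly with fleet size or miles,
    # so under this rule it stays out of the mileage-amortized total.
    CostItem("model training", 500_000.0, False),
]

cpm = cost_per_mile(items, total_miles=100_000)
print(f"CPM: ${cpm:.2f} per mile")
```

With these made-up inputs, the four mileage-scaled items total $220,000 over 100,000 miles, while the training line is excluded regardless of its size.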

Q: You've made a pretty aggressive claim: 30 vehicles might reach break-even, and 100 vehicles could potentially beat human CPM. How exactly does this underlying model work?

Hou Xiaodi: Several publicly listed autonomous trucking companies in the US have already disclosed some data. Aurora, Kodiak — they've published their cost of revenue or equivalent figures. But there's still a question: is the so-called cost they calculate actually full-scope? Some companies may not have included everything, so true costs could be even higher.

CPM can be pushed relatively low for one core reason: the vehicles are actually operating, not sitting idle for long periods.

If you look at many peers' disclosed operations, you'll find their vehicle utilization isn't actually very high. Of course, these are their own audited numbers — they have no reason to deliberately underreport. So I think the key issue isn't just how many vehicles you have, but whether those vehicles are actually running.

Q: You've been emphasizing availability rate. Why is this term so important?

Hou Xiaodi: Many companies are still approaching autonomous driving with a tech product mindset, not an industrial product mindset.

Let me give a very simple example: a Windows laptop can certainly take photos, but you won't pull it out to snap pictures like you do with your phone, because the operational complexity and convenience are completely different.

Autonomous driving works the same way. If the system isn't designed to industrial operational standards — not refined for 20 hours a day, 7 days a week — and remains stuck in the daily-debugging tech product stage, vehicle utilization will definitely be low.

Once utilization is low, total cost divided by total mileage means per-mile operating costs naturally stay high. The industry's real problem often isn't that costs are too high, but that effective mileage is too low.
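The arithmetic behind this point can be made concrete with a small sketch. It assumes a deliberately simplified model in which each truck carries a fixed daily cost whether or not it runs, plus a variable cost per mile. The function name, its parameters, and all numbers are hypothetical and are not taken from the interview.

```python
def cpm_at_utilization(fixed_cost_per_day, variable_cost_per_mile,
                       hours_per_day, avg_speed_mph):
    """Per-mile cost for one truck under a simplified two-part cost model."""
    miles = hours_per_day * avg_speed_mph
    if miles <= 0:
        raise ValueError("the truck must cover a positive distance")
    # The fixed daily cost is spread over however many miles were driven.
    return fixed_cost_per_day / miles + variable_cost_per_mile


# Illustrative inputs only: $400 fixed per day, $0.50 variable per mile,
# 50 mph average speed, at three different daily operating hours.
results = {
    hours: cpm_at_utilization(400.0, 0.50, hours, 50.0)
    for hours in (4, 10, 20)
}

for hours, cpm in results.items():
    print(f"{hours:>2} h/day -> ${cpm:.2f} per mile")
```

In this toy model, total daily cost barely changes between the cases. What changes is the number of miles the fixed cost is divided by, which is why low utilization alone keeps per-mile cost high.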

Q: Does removing the driver mean L4 commercialization is fully formed?

Hou Xiaodi: Removing the driver is a necessary condition for L4 commercial viability, but not sufficient.

Without removing the driver, you'll never make money. But removing the driver alone isn't enough. You also have to drive down unit economics to be cheaper than human drivers to even qualify for discussing scale. Only with scale can a great company possibly emerge.

So the real focus isn't just whether there's a driver in the vehicle, but whether you have the capability to run this system smoothly, stably, and at scale.

Q: Then what do you think of all the companies talking about scale while still stuck at the slogan stage?

Hou Xiaodi: The problem is, they haven't truly placed themselves in a real operational environment.

What really needs validation isn't how beautifully things run in the lab, but whether this system can actually generate productivity in the real world. Oil stains, severe weather, vibration, high temperatures, sudden temperature drops, humidity — these aren't edge conditions, they're reality itself.

You have to validate in such environments: can it actually scale, can it actually make unit economics positive per vehicle, can it actually be pushed to tens of thousands of vehicles where each one is profitable.

Many people just treat this as a slogan without really standing in that operational scenario to think through the problems.

Take F1 and rally racing as examples. The approach of many autonomous driving companies is more like F1 logic: pursuing extremes, imagining the environment as idealized. But the real world is more like rally racing — complex road conditions, harsh environments, more variables, completely different cost structures.

So I think autonomous driving today can no longer be talked about with the lofty F1 narrative. It needs to be evaluated in true rally racing conditions to see if it can actually hold up. Otherwise, what you're building may not be a product that can land, just a concept that looks pretty.

Q: Speaking of scale, what do you think of platooning? Can platooning solve the profitability problem for L4 autonomous driving to some extent?

Xiaodi Hou: I don't think there's anything inherently wrong with platooning. Honestly, if you look at it with that classic tech-elitist "ivory tower" mentality, you'll definitely think platooning isn't impressive, that the tech isn't sophisticated, and someone will always use that angle to attack it. But to me, none of that matters — if it makes money, it works.

The real question is whether platooning can actually make money. I'll admit that to some extent, platooning does solve a lot of small-scale or low-probability problems for autonomous driving.

But at least on American roads, platooning has a fatal flaw.

Over here, if multiple vehicles are running very close together and not allowing other cars to cut in — just do the thought experiment — you're already severely obstructing traffic. That's why many U.S. states explicitly prohibit this kind of tight platooning in their laws. If you insist on platooning, you can only do loose platooning, but loose platooning inevitably means other vehicles will cut in.

Q: You keep saying safety doesn't need to be talked about so much. In your view, is safety an engineering problem that's already been solved, or a scientific problem that still needs to be continuously conquered?

Xiaodi Hou: I agree with half of what you're saying: to me, safety is an engineering problem, not a scientific one.

But that doesn't mean safety has been completely solved. Not every company claiming to do L4 has actually solved safety. A company only crosses the first threshold of safety when it can remove the driver and still operate normally on public roads.

And it's not a one-and-done thing. Software versions change, hardware versions change, and we have to re-prove our safety all over again. What L4 companies actually lose sleep over every day is how to efficiently test within a continuously iterating system without sacrificing the comprehensiveness and rigor of safety. This problem is more central than outsiders imagine.

Q: You mentioned earlier that standards in the autonomous driving industry are missing. What do you think are the most misleading metrics being used right now? What private agendas are hiding behind them?

Xiaodi Hou: As I just said, the first one is this idea that "you're only successful if you partner with an OEM" — that's complete nonsense.

For closing the commercial loop in autonomous driving, OEMs only have two sources of value: first, the ability to manufacture vehicles at scale, meaning the capacity to build 2,000 vehicles a year, which is an OEM's unique capability; second, on top of providing that capacity, being able to slightly reduce the cost per vehicle.

Beyond these two advantages, OEMs have zero additional value. Everything else is downside.

The biggest downside is that OEMs iterate far slower than startups. So when you don't yet need that much production capacity, the earlier you partner with an OEM, the more money you'll lose.

Binding yourself to an OEM prematurely, before you need the extra capacity, is like putting handcuffs on yourself — you'll just keep losing more and more.

The second false metric is scale.

Talking about scale when your revenue-minus-cost is still a massive negative — the more I look at it, the more terrifying it seems. I genuinely don't understand why so many companies present it as an advantage. The more you deploy, the more you lose!

But most people have no basic financial literacy. They can't read a P&L statement. All they can understand is: this company deployed 100 vehicles, impressive; that one deployed 200, even more impressive. It drives me crazy.

The third is unconditional technological optimism.

The mindset of "I have nothing now, but technology will definitely break through in the future." That's also a terrible mentality. Deep learning exploded in late 2012. Right now, we have no idea whether we're in 2002 or 2011 of that historical arc.

Remember when ChatGPT first came out? Lots of people debated whether it had intelligence, said it was just information compression, even claimed compression is intelligence. Others said it could only process existing information on the internet and couldn't create anything new.

Look at Yann LeCun's early posts on X — he was questioning the technical level of large language models too. Even so, the technological breakthrough ChatGPT brought was much smaller than deep learning. And breakthroughs at that magnitude happen roughly once a decade — the last was 2012, this one was late 2022. I find it hard to believe that baseless technological optimism, watered by capital, can achieve anything in the next few years. I don't buy it.

Q: Beyond what you just said, what about MPI (miles per intervention), which the industry constantly talks about? What kind of metric is that?

Xiaodi Hou: MPI is problematic by definition — every number reported by each company is self-audited.

A company only reports something if they deem it unsafe and safety-related; if they decide it's not safety-related, they don't have to report it. So each company's MPI number essentially depends on that company's subjective safety threshold, and has almost nothing to do with actual technical capability.

Take California's most official MPI ranking — if you read the original definition, it explicitly preserves space for companies' subjective judgment on safety thresholds. As long as that subjective space exists, all kinds of shady maneuvers will happen.

Q: You've proposed the concepts of technical capital and consensus capital. What traits does the ultimate winner in autonomous driving need to possess? Does the player who wins in the end have to be one dominated by technical capital?

Xiaodi Hou: No, but first you need the technology. Lots of people are trying, but without the tech, you might spend 1,000x the cost and still not get there, or get something terrible. That's point one.

But there are enough smart people out there. I think between China and the U.S., the main players in passenger cars and trucks — the ones that really matter — number around ten or so. Their intellectual level, technical level, or technical judgment is sufficient to enter this contest for supremacy.

This is fundamentally first a battle among technical elites. More importantly, you don't necessarily have to have invented something — invention itself involves luck and historical context — but you have to truly grasp the essence of the technology.

Whoever can judge early that VLA doesn't work will avoid a lot of detours.

Q: What do you think of the recent industry talk that "VLA is dead"?

Xiaodi Hou: I completely disagree with the statement that VLA is dead. Because my view is that VLA was never truly alive to begin with, and there's no evidence it could ever work.

A lot of popular industry judgments are just emotion, not fact. When it comes to industry bigwigs' opinions, what's actually worth looking at is their judgment on long-term trends, not momentary concept hype.

For example, I think between China and the U.S., the companies that genuinely didn't invest time or money in VLA from day one — there are only a handful.

Q: What do you hope the industry-recognized core standard for autonomous driving will look like in ten years?

Xiaodi Hou: I think a lot of consensus is already slowly forming. To toot my own horn a bit, I believe I defined the concept of CPM.

I was the first in the industry to wave the CPM flag, criticizing myself before criticizing my peers, saying all of us need to feel shame and then rise to the occasion — any autonomous driving company that can't reduce CPM is playing games, including ourselves.

This was the point from which I genuinely started talking about the commercialization of autonomous driving. More and more people are now picking up on CPM as a standard. Unfortunately, they haven't adopted my advocated approach of calculating it on a fully loaded basis — meaning every cost gets counted, no exclusions or tricks.

But at least now, under pressure from analysts and investors, CPM has become a core pressure metric that every company has to face. MPI (miles per intervention) is a different story — it always comes with fine print: final interpretation rights belong to the company. You'll never get an honest answer from anything with that kind of clause.

Q: There's still a very mainstream view in the industry: L4 must rely on a sufficiently large fleet size to have any chance of real profitability. But for years you've never fully agreed with this logic.

Xiaodi Hou: I think there's actually a misconception here. If you only have one or two vehicles, five vehicles, of course it's hard to make money — that's not worth discussing. But many people subconsciously equate profitability directly with fleet size, and I don't think that logic fully holds.

Because you'll find that with CPM, especially in the early stages, many of the items that actually determine optimization space aren't that strongly correlated with fleet size.

There's a huge amount of work that can be done with very few vehicles, and this work can massively improve vehicle utilization and availability. This is fundamentally a technical problem first. You need to make vehicles truly stable and usable, rather than having this failure today, that problem tomorrow, until the vehicles can't run at all.

Q: Then why do you insist on self-operating fleets? What's the core difference between self-owned and self-operated?

Xiaodi Hou: I think people always expect some big news tomorrow that suddenly changes the business model completely, but our business model hasn't changed since 2023.

At least in the early stages, we'll definitely operate the fleet ourselves. There's actually an easily misunderstood point here — many people ask: do you insist on self-owning the fleet? Why must you self-own? But self-owning and self-operating are two different things.

We've never stubbornly insisted on owning all vehicles ourselves. Many of our vehicles today don't belong to us directly — we solve the asset problem through leasing, loans, and other financial arrangements. So ownership of the physical vehicle may not be in our hands.

But fleet operations must be done by us. Because what really matters isn't how many vehicles you own — it's whether you can get the entire operations system running smoothly and continuously drive CPM down.

Q: For a truly high-quality L4 company, beyond unit economics and scalability, is there a third metric?

Xiaodi Hou: Capital efficiency. How much you raised, how much you spent.

Q: How do you prove a company's scalability is real capability, not storytelling?

Xiaodi Hou: Orders. Battle reports can lie, battle lines don't.

Q: Then what counts as real orders in your view?

Xiaodi Hou: It's value genuinely created through the labor of your stated core business. For example, if you're a robotics company whose core business is moving bricks, but all you do every day is exhibitions, then those exhibitions aren't your core business.

You say your robots will create massive economic value moving bricks in the future — fine, but how much do you earn moving one brick right now? That has to be calculable. In my view, all Physical AI companies should ultimately have a baseline value creation floor, which is essentially some form of moving bricks — for our autonomous trucks, it's revenue per mile of freight transport.

You can certainly say you also do technical services, technology licensing, but those aren't your core business. And licensing revenue is extremely unstable — you might have some this month, then nothing for the entire next year.

So the real test of whether a company is telling the truth or spinning a story is simple: look at how much money it's actually making from its stated core business, and whether that revenue model is sustainable and scalable.

Q: Tesla's Robotaxi fleet officially began nighttime operations a few days ago, and FSD v15 is pushing toward a broader rollout. Looking at their progress today, has your previous assessment changed?

Hou Xiaodi: No. Because I think you have to start with the most basic question: how many cars do they actually have, and have they achieved real density?

I take Waymo in the Bay Area at night and it's fine. But during daytime rush hour, getting from A to B often means waiting ten, twenty minutes. That wait time alone kills its viability as a replacement for Uber or Lyft.

Robotaxi isn't fundamentally about how impressive the technology looks—it's about whether you have sufficient fleet density to offer a competitive service. Waymo has quite a few cars in the Bay Area now, but the density still isn't there; Tesla's current fleet scale and operational status aren't at the level we're even discussing yet. Not to mention they still often have a safety driver in the passenger seat.

So I think talking about this now—it's not the same L4 conversation we're having.

Q: If someday Robotaxi achieves large-scale profitability before Robotruck, would you admit you were wrong?

Hou Xiaodi: No, I'd just congratulate them. If Robotaxi succeeds, it indirectly helps Robotruck too. I'm not dismissing their approach; I just want to explain some of the less visible advantages in the autonomous trucking industry that outsiders don't fully appreciate.

If Robotaxi succeeds, I won't abandon our direction because of it. We'll only reassess our business or strategy when we ourselves discover we've hit a dead end.

For now, our judgment is clear: fixed point-to-point freight patterns, stable contract repeatability, and real-world positive feedback are all supporting this path forward.

Q: In the near term, at least for the foreseeable future, you won't touch Robotaxi—you'll stay focused on Robotruck and similar scenarios?

Hou Xiaodi: Right, we'll stay focused on Robotruck in the near term.

Q: For L4 startups today, what is the most scarce resource?

Hou Xiaodi: The scarcest thing isn't simply money. It's a new financing structure suited to Physical AI, or rather, a new way to deploy capital.

Large models are dramatically reshaping organizational structures, and we're going through that ourselves. As long as the vision is clear, companies will evolve into new organizational forms; in two or three years, each will develop management characteristics as distinctive as personal style. But that's not the core bottleneck.

The industry is always chasing trends: two years ago when we went to the Bay Area, everyone asked why we weren't a SaaS company—SaaS had its complete metrics and valuation logic, so everyone piled in; now the same people question whether SaaS will be replaced by AI, and pivot to hyping embodied intelligence. You can never catch every trend, because they shift too fast; but sticking to business fundamentals often means not being understood by prevailing consensus.

So there are trade-offs. The poor have their way of surviving, the rich have theirs. We may not be understood, our valuation may not match embodied intelligence companies, but staying our course at least keeps us from walking into a dead end.

Physical AI has unique pain points that I summarize with two words: Metal and Mind.

Metal is the hardware assets—trucks, sensors. Mind is autonomous operations. We're a Mind-focused company, but scaling requires massive Metal. Who pays for that hardware and enables the business to quickly form a positive loop is a distinctive problem that only Physical AI companies encounter in their early success phase.

The existing capital system can barely support us in proving unit economics, but it's completely mismatched to the large-scale, long-term asset funding needed for scaling. The essence of Physical AI is unit economics multiplied by scalability. The hard part was never the former—it's the latter.

The problem now is: on one hand, many investors can't deploy capital at this scale; on the other, the market hasn't formed consensus on how capital should help physical businesses amplify value creation.

So for us, the scarcest thing isn't trucks, isn't orders—it's a new financing structure that truly connects capital, hardware assets, and operations.

Q: Today, nearly all domestic intelligent driving companies believe they can evolve from L2+ to L4 through large models, end-to-end, and world models. Why do you firmly disagree?

Hou Xiaodi: Because 99% of the companies you're talking about are fundamentally still L2+ companies. The number of companies in China that have genuinely pursued L4 long-term is actually very small. WeRide, Pony.ai—these happen to be companies whose views align closely with mine.

So sometimes you notice an interesting phenomenon: much of what L2+ companies promote daily isn't what we L4 companies are actually working on. There are only two possibilities: either we L4 companies are neglecting our proper business; or, they actually have no idea what problems L4 companies solve every day.

Q: I noticed you recently reposted an interview with WeRide founder Han Xu.

Hou Xiaodi: I quite like Han Xu. He tells the truth. The companies genuinely doing L4 domestically are just those few; most people hyping new concepts every day, I've never even heard of. It's the ones with solid technical strength whose words actually make sense.

The problems we're desperately busy solving every day aren't things L2 vendors claim they can handle. They're promising solutions to problems that aren't ours; the core issues we focus on, they don't address at all. Put bluntly, L2+ manufacturers are completely ignorant of what L4 actually has to face.

The conflict between supermarkets and restaurants is exactly the conflict between L4 and L2+ today. Some supermarkets constantly drum up noise about restaurants using pre-made dishes, expensive and unhealthy, urging people to buy ingredients and cook at home. All supermarkets benefit from this—it's essentially collective collusion, making restaurants the enemy.

But the profit logic is fundamentally different: supermarkets make money selling raw materials; restaurants make money per finished dish served. L4 is the restaurant, selling complete end-to-end autonomous driving services; L2+ is the supermarket, selling assisted driving as a semi-finished tool.

Q: But if a model at the level of FSD v15 achieves a qualitative leap, couldn't it progressively approach edge cases and ultimately reach L4-level safety?

Hou Xiaodi: I think there's a huge misconception here: many people assume L4 is about solving ever more extreme safety problems, but it's not.

What L4 actually needs to solve is a massive number of "what ifs."

What if a camera fails? What if a sensor degrades? What if the system detects it's unhealthy? What if water accumulation or fog reduces perception? What if hardware malfunctions? What if road conditions suddenly deviate sharply from system memory—should it keep taking risks, or proactively stop?

These are the problems L4 actually deals with daily.

Q: Can I interpret this as: everyone using the logic of model capability improvement to understand L4 is itself a massive misunderstanding?

Hou Xiaodi: Yes. Many people still have an intern mentality, always thinking that if they just train the model a bit better, the problem will be solved. But the problem is, this never ends.

Even with AI as strong as it is today, can ImageNet reach 100% accuracy? No. Many labels themselves are wrong. So we're not pursuing an ever-better gradual curve, because that curve has diminishing marginal returns and no endpoint.

Much L2 progress is essentially optimization on an already-accepted benchmark, then using that performance gain to claim you're approaching L4.

But L4 doesn't compete on this dimension at all.

Q: If not on model performance, where is L4's real battlefield?

Hou Xiaodi: It's "There's no excuse"—no excuses whatsoever.

No matter what happens, we can't make excuses for ourselves. We can assume anything in the world could happen—even meteors falling from the sky. In that case, can the system always make relatively correct operations? And after making those operations, can the entire business loop still hold?

If both hold—first, we can always do the relatively correct thing; second, we can still make money on that basis—then it works.

I believe L4 will eventually be involved in accidents, but the key isn't zero accidents forever. It's whether, when accidents happen, we have already taken the most correct action within our system's capabilities and done our utmost to avoid liability.

So what we pursue isn't a slightly better model, but a system that can truly close the loop, take responsibility, and operate long-term.

Q: The industry today is full of discussion around end-to-end, world models—it's even affecting external judgments of company strategies. Do you think this is helping the industry understand technology, or prematurely manufacturing judgments?

Hou Xiaodi: The industry's biggest problem right now is too many concepts. Technical definitions are meant to aid understanding, not to judge with. But many people like to take a few keywords and render verdicts on companies and products.

End-to-end, world models—many of these terms lack clear definitions and are hard to falsify. End-to-end is so broad anyone can claim to be it or not be it; world models are even more absurd—groups discuss at length, and in the end nobody can clearly say what it actually is.

New concepts spread extremely fast, but many spreading them may not have seriously read a single relevant paper, let alone actually done the work. Then people use these keywords to label and judge each other, as if mastering a new term puts you on equal footing with people actually building things.

Before iPhone, Symbian theoretically had many smartphone technical labels too, but it just wasn't good. What ultimately matters was never what labels you wore, but whether you actually shipped the product.

The industry's definition battles are too noisy and unnecessary. What truly matters is returning to the thing itself: can this technology solve problems, can this product actually run.

Q: This seems to resonate deeply—any specific example of this concept misalignment?

Hou Xiaodi: Let me tell you a very real example. Once I was giving a presentation to investors, and halfway through, the other side suddenly asked me: is this JEPA?

I was completely stunned. I said, what's JEPA? I looked it up on the spot and found it was a concept Yann LeCun proposed, roughly meaning a model that can do both embedding and prediction.

After reading about it, I said: well, isn't that exactly what we have? Our 2023 slides already had basically this structure, and I had no idea the term even existed. So you see, a lot of the time the industry uses some newly coined keyword to retroactively define others. But that's actually a very superficial thing to do.

Q: Let's return to a more practical question. The industry's evolving understanding of end-to-end systems — the growing emphasis on safety backstops, engineering patches — is this shift also connected to Tesla's moves earlier this year?

Hou Xiaodi: To understand Tesla's moves, you first have to recognize that it's fundamentally a company that manages stock price and expectations.

From a technical perspective, there's no essential capability gap between top-tier companies. With sufficient funding, everyone can access roughly the same technology. What truly creates distance isn't whether you have some particular technology, but how you integrate technologies into a product.

Every historical transition from old to new has never been about a single technology winning out. Google defeating Yahoo, iPhone defeating Symbian — none of these relied on any single technology, but on three more fundamental things: product form factor, organizational structure, and the coordination patterns that emerge around them.

Q: Does a system that relies on remote human backstops count as true L4? What's the criterion?

Hou Xiaodi: Human backstops aren't inherently sinful, just like truck platooning isn't inherently sinful. But there's one absolute core criterion: the reaction time required for human intervention.

If a person has to intervene within five seconds, hit an emergency stop button, or else the vehicle crashes — that's completely unacceptable and utterly unscalable. Fundamentally it's still one person per vehicle, just moving the driver from the cabin to a remote control center, salary still paid, costs not reduced one bit. The business model simply doesn't work.

Only when one person can simultaneously monitor a dozen or even dozens of vehicles does true scalability become possible.
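The scalability argument in this answer comes down to simple arithmetic. As a rough sketch, assuming purely hypothetical figures that do not come from the interview (a remote operator paid 30 USD per hour and trucks averaging 50 mph), the function below spreads one operator's wage over the vehicles that operator supervises:

```python
def remote_labor_cost_per_mile(hourly_wage_usd: float,
                               vehicles_per_operator: int,
                               avg_speed_mph: float) -> float:
    """Remote-operator labor cost attributed to each vehicle mile.

    One operator's hourly wage is divided across every mile driven in
    that hour by all the vehicles the operator is monitoring.
    """
    if vehicles_per_operator <= 0 or avg_speed_mph <= 0:
        raise ValueError("vehicles_per_operator and avg_speed_mph must be positive")
    return hourly_wage_usd / (vehicles_per_operator * avg_speed_mph)


# Hypothetical inputs, chosen only to illustrate the shape of the curve.
WAGE_USD_PER_HOUR = 30.0
AVG_SPEED_MPH = 50.0

for ratio in (1, 12, 30):
    cost = remote_labor_cost_per_mile(WAGE_USD_PER_HOUR, ratio, AVG_SPEED_MPH)
    print(f"{ratio:>2} vehicles per operator: {cost:.3f} USD per mile")
```

At a ratio of one vehicle per operator, the labor cost per mile is the same as paying a driver the same wage to sit in the cab, so nothing has been saved. The cost only falls once the required reaction time is long enough for one person to watch many vehicles at once.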

Q: I suddenly recall you previously mentioned "foundation to all" — is this what people now call foundation models?

Hou Xiaodi: I think we need to take many steps back first. The concept of world models — nobody actually knows what it means, there's zero industry consensus. But foundation models have clear value; they're the genuinely exciting progress that ChatGPT brought us.

Regarding foundation model characteristics, opinions vary. Some say large parameter count is enough, others say linguistic capability is required — there's truth and falsehood mixed in here. I'll only discuss what I consider the two core, discriminating attributes that foundation models must possess:

First, it must be one model solving multiple tasks — that is, ultra-large-scale multi-task learning, not one model per task category patched together with a pile of models.

Second, it must produce emergent behavior — the ability to transfer learning: after mastering one core task, rapidly completing related other tasks at extremely low cost.

The most intuitive manifestation is dramatically reduced data labeling costs. Labeling cost is an auditable, quantifiable hard metric that directly reveals whether a company has truly touched the threshold of foundation models.

But here's something crucial to note: many people get causality backwards, or mistake correlation for causation. It's achieving a true foundation model that produces low labeling costs — not the reverse. Deliberately suppressing labeling costs doesn't mean you've built a foundation model.

It's like how the top student in class is rarely the most hardworking, but you can't conclude that since top students don't work hardest, none of us should work hard.

Q: My understanding is that on another level, you're saying human labeling isn't something AI companies of the new era should rely on long-term. This is completely opposite to the industry consensus that more data and more labeling makes better models — another counter-consensus from you.

Hou Xiaodi: Every word they say is correct, but there's "better" and then there's "better." I can make money carrying bricks at a construction site every day, but can I get rich doing it? Absolutely not.

More data, more labeling, better model performance: this brings incremental returns that diminish at the margin. The more you invest later on, the slower the improvement. This "getting a little better every day" narrative is deeply misleading. The real gap is an era-level gap.

Foundation models are the dominant theme of the 5 to 10 years following 2020. Whether a company has entered the foundation model era is visible from its labeling volume. Only upon entering this era can you feel productivity-revolution-level improvement. Those still advocating stacking data and labeling have certainly never seen the view from a higher dimension.

Q: What are your current monthly labeling costs?

Hou Xiaodi: 5,000 RMB.

Q: What level is that?

Hou Xiaodi: An absurd level.

Q: All L4 companies now claim to have self-developed simulation platforms. What do you think genuinely useful simulation looks like?

Hou Xiaodi: First, simulation and labeling are fundamentally different things, not to be conflated. I don't know others' specific situations. But I can say clearly: many companies' understanding of simulation has completely gone off the rails.

Leading with "our simulation renders photorealistically" — that's pure distraction. Simulation's value has nothing to do with looking like photos. That might matter in year one; by year two or three it's worthless.

What truly matters is whether you can accurately simulate vehicle-to-vehicle interaction behavior, whether you can reconstruct real traffic scenarios — in plain terms, whether you can reconstruct the crime scene.

Q: What's the core value of simulation? What necessary connection does it have with fully driverless commercial profitability?

Hou Xiaodi: The connection between simulation and fully driverless commercialization is extremely direct. Autonomous driving software-hardware systems iterate continuously. Every new version must be re-validated for safety. If changing one line of code required driving 10,000 miles for verification, the company would have gone bankrupt long ago.

Here's the core contradiction: the system must iterate rapidly to improve, but every iteration must guarantee absolute safety. Simulation is the key to resolving this contradiction — a good simulation system, with one button press, can run through all traffic scenarios an ordinary person wouldn't encounter in their entire lifetime, within hours. Only this enables rapid system iteration while ensuring safety.

So don't listen to anyone bragging about their simulation. If a company hasn't even achieved routine fully driverless operations, its simulation definitely doesn't work. As for claims that simulation data covers all long-tail scenarios, those come with an asterisk: "the company reserves the right of final interpretation."
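The release gate described in this answer, where every new software version is replayed against a library of scenarios before it goes on the road, can be sketched in a few lines. This is a toy illustration only: the `Scenario` fields, the braking-distance check, the 2-metre margin, and the two "versions" are all hypothetical and say nothing about how Bot Auto's simulator actually works.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Scenario:
    """A traffic situation reduced to a toy car-following case."""
    name: str
    gap_m: float              # initial distance to the vehicle ahead
    closing_speed_mps: float  # rate at which that distance is shrinking


# A "software version" here is just a policy that picks a braking
# deceleration (m/s^2) for a scenario. Hypothetical interface.
Policy = Callable[[Scenario], float]


def remaining_gap_m(scenario: Scenario, decel_mps2: float) -> float:
    """Gap left once the closing speed has been braked down to zero."""
    if decel_mps2 <= 0:
        return float("-inf")  # never stops closing: treat as a collision
    braking_distance = scenario.closing_speed_mps ** 2 / (2 * decel_mps2)
    return scenario.gap_m - braking_distance


def failed_scenarios(policy: Policy, library: Sequence[Scenario],
                     min_gap_m: float = 2.0) -> List[str]:
    """Replay every scenario and return the names that violate the margin."""
    return [s.name for s in library
            if remaining_gap_m(s, policy(s)) < min_gap_m]


LIBRARY = [
    Scenario("slow_lead_vehicle", gap_m=60.0, closing_speed_mps=10.0),
    Scenario("sudden_cut_in", gap_m=25.0, closing_speed_mps=12.0),
    Scenario("stopped_traffic", gap_m=120.0, closing_speed_mps=27.0),
]


def version_a(_: Scenario) -> float:
    return 2.0  # always brakes gently


def version_b(s: Scenario) -> float:
    return 6.0 if s.gap_m < 40.0 else 3.5  # brakes harder when the gap is short


print(failed_scenarios(version_a, LIBRARY))  # → ['sudden_cut_in', 'stopped_traffic']
print(failed_scenarios(version_b, LIBRARY))  # → []
```

A version ships only if the list of failures is empty. The value of the gate depends entirely on how faithfully the library reproduces real interactions, which is the point made above about reconstructing the scene rather than rendering it photorealistically.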

Q: Capital has migrated like seasonal birds chasing embodied intelligence, yet you're still holding down the fort with autonomous trucks. Is it because you believe the physical breakthrough will first emerge in trucks, not robots or Robotaxi?

Hou Xiaodi: Robotaxi may have breakthroughs in the future, but its core problem isn't technology — it's business. It requires extremely high vehicle density. Look at Waymo: over 2,000 vehicles deployed in the Bay Area, costs already astronomical, yet passengers still wait 15 minutes.

To reach operational density competitive with traditional taxis requires an enormous fleet. Which returns us to the core question: who buys these vehicles?

The aviation industry faced the same who-buys-planes dilemma early on. Autonomous trucks and Robotaxi face identical challenges now. This is especially true of capital investment in Robotaxi passenger vehicles: there is currently no mature financial instrument for it, nor an industry mechanism, consensus, and financial system of the kind that 1970s venture capital had to support rapid development.

But I believe that as first-generation pioneers of Physical AI, this is our responsibility. We need to take this from zero to one, using actual results to show finance, insurance, and adjacent industries how it's done. When they see our data, see that we're genuinely profitable, and that this profitability is replicable, consensus will naturally form and capital will accelerate value creation.

Q: The outside world has long worshipped Aurora as the global autonomous driving benchmark. How do you demystify it?

Hou Xiaodi: First, Aurora's passenger vehicles exist only in PowerPoint. Have you seen a single Aurora passenger vehicle running on roads? The answer is no.

They merely claim technical transferability without any evidence to prove it. I recall them saying 70% of their passenger car and truck codebases are identical. Well, humans and chimpanzees share 99% of their DNA too. True technical commonality isn't talked about; it's demonstrated through actual operations. Only the wearer knows whether the shoe fits.

Q: Do you see any company in today's autonomous driving industry that, like DeepSeek, can achieve breakthrough results with minimal resources?

Hou Xiaodi: Using minimal resources to do something truly impressive — currently I only see our company doing this.

There may be others I'm unaware of, but from publicly disclosed results, there should be no second. I also hope more such companies emerge, to pull the industry's narrative back onto the right track. After all, a restaurant's narrative dispute isn't with another restaurant — it's with supermarkets and wet markets. Once you grasp this, many things become clear.

Q: Xie Saining called you his idol, and many deeply admire your persistence through the near-death autonomous driving industry all these years. Where does that almost stubborn drive come from?

Hou Xiaodi: Being called an idol feels pretty good. The core motivation is that I see a success with extremely high certainty, one that is almost inevitable. Autonomous driving is one of the extremely rare, almost unique exceptions I've encountered in my lifetime: solve three steps in sequence (fully driverless technical safety, unit economics, scalability) and success is inevitable.

If you asked me to do embodied intelligence, to replicate the next ChatGPT moment, I'd look in the mirror and know the probability of success is too low. That kind of thing depends too much on luck. Autonomous driving is different. I've fallen into just about every pitfall this industry has, and I'm confident I can walk through every step.

I've founded many companies. I know too well how fatal uncontrollable uncertainty is. I don't want to pour in effort and still not be in control of my own fate. Autonomous driving gives me that sense of control. As long as I do what needs to be done, success is guaranteed.

Q: In the past year, what was your costliest business judgment, or the one most worth reflecting on?

Hou Xiaodi: Actually, there are no dramatic, earth-shattering mistakes. We're reflecting and reviewing every moment. Our decisions, whether correct or incorrect, are all converging toward the right direction.

There's only one true error: arrogance. It's knowing you're wrong and not changing, being unable to examine your own decision-making mechanism with equanimity. Fortunately I didn't commit this error last year. Every decision I make now, I can sleep well after making it.

Q: What team weakness do you most want to address now? You previously said you need true believers more than purely the most technically skilled people — why?

Hou Xiaodi: The biggest gap in our team has just been filled. We just recruited a very important person, whom I can't announce yet. He comes from the traditional trucking industry and will be in charge of commercial operations. The trucking industry runs deep — there are long-established unwritten rules and complex problems that we tech people simply don't understand. We need someone who truly knows this industry inside and out, with deep accumulated experience, to help us execute on the ground.

Q: If you could say one thing to the autonomous driving industry ten years from now, what would it be?

Hou Xiaodi: I hope that in ten years, autonomous driving won't need me anymore.

The only reason I matter now is because too many people are doing this wrong. I'm not talking about aesthetic differences in approach — I'm talking about basic business logic that doesn't even hold together. How does losing six dollars for every dollar you make become a "great future"?

I've never believed that losing money makes for a good business. When these wrong voices get loud and drown everything out, that's when I have a reason to fight: I want to establish fair competitive standards so everyone competes on healthy metrics; I want to call out the people who deserve to be called out, and support the people who deserve support.

I'm neither the fastest coder nor the most elegant at deriving formulas. Once the industry returns to rationality in ten years, once everyone is on the right track and bringing their own strengths to bear, my work will be done.