A Conversation with Momenta's Cao Xudong: Beyond Moore's Law for Autonomous Driving | Z Circle
The hardest choice is not changing your original choice.
Z Circle is a column about people.
Momenta founder Cao Xudong graduated from Tsinghua University and previously worked at Microsoft Research Asia and SenseTime, accumulating more than ten years of R&D and management experience in the AI industry.
In 2016, Cao founded Momenta. That same year, ZhenFund invested in Momenta's angel round. Today, nearly 100,000 vehicles on Chinese roads run Momenta's autonomous driving software.
In a recent in-depth interview, Cao shared how Momenta persevered through the cycles of the intelligent driving industry, his views on the end-to-end large model trend, and his outlook for the future of autonomous driving. Below is the full interview.

This article is republished with permission from LatePost (ID: postlate)
Author: Cheng Manqi
Editors: Huang Junjie, Song Wei
By the first half of 2024, nearly 100,000 vehicles on Chinese roads were running Momenta's autonomous driving software. After setting a destination, these cars can change lanes, overtake, navigate intersections, and avoid e-bikes and pedestrians on highways and most urban roads by themselves. They come from SAIC, BYD, GAC... and all run on NVIDIA's autonomous driving chips.
Going forward, dozens of models from more than ten brands, including GM, Toyota, and Mercedes-Benz, will also carry Momenta's software, making it the company with the most orders for advanced intelligent driving to date. Among suppliers, only Huawei and Momenta have already delivered comparable high-end solutions. Among automakers that develop their intelligent driving systems in-house, only NIO, XPeng, and Li Auto have done so.
Not long ago, reports emerged that Momenta is planning a US IPO. If successful, it would become China's first publicly traded autonomous driving company.
Momenta has experienced its ups and downs. In 2018, it became China's first autonomous driving unicorn. But during the Chinese autonomous driving funding boom of 2019–2020, while Pony.ai, WeRide, and others raised successive rounds, Momenta announced no funding progress. Its R&D director even left to join NIO.
Then in 2021, Momenta raised $1 billion in a single year—three-quarters of its total historical funding—with investors including SAIC, Toyota, Daimler, GM, and Bosch, comprising major Chinese and foreign automakers and suppliers.
The company's core strategy since founding has not changed, but the industry's coordinates have shifted.
In 2016, Cao founded Momenta at age 30. Before that, he studied physics as an undergraduate at Tsinghua, dropped out of a direct PhD program, and crossed over to Microsoft Research Asia to study AI. In 2014, he joined SenseTime, then a newly founded star AI company. By the time he founded Momenta, he had risen to R&D director at SenseTime.
The first wave of autonomous driving companies generally took Google's Waymo as their anchor, aiming to build Robotaxis: outfit cars with lidar, conduct road tests, and develop fully driverless solutions to disrupt the entire automotive industry.
But when Cao founded Momenta, he never believed that testing with hundreds or thousands of vehicles and having engineers write code could produce full autonomy. He wanted to simultaneously build limited intelligent driving features for automakers to obtain "100 billion kilometers of real-world road data," then feed that data back into developing full driverlessness.
This path was initially seen as lacking imagination. "Toyota is a supplier to Pony [Pony.ai would modify vehicles for testing], while Momenta's goal is to be a supplier to Toyota," one investor who passed on Momenta once explained the difference between the two approaches, believing the ceiling for mass-production suppliers fell far short of Robotaxi.
Beyond pursuing "mass-production" autonomous driving, the keywords repeatedly appearing in Momenta's early fundraising decks and Cao's early interviews were "data-driven": finding ways to build intelligent driving systems primarily through deep learning and other data-dependent methods, whereas traditional systems required programming extensive rules and could not directly learn how to drive from driving data.
"Now it's industry consensus, but back then truly no one knew how to make the entire autonomous driving pipeline data-driven," Cao said. He admitted he didn't fully know the answer himself at the time, but judged it a necessary condition for achieving full autonomy, because under traditional rule-based methods, once the system encountered a corner case not covered by rules, it wouldn't know what to do—an ever-present safety challenge.
"His original consideration wasn't to start his own company," said Yuan Liu, Momenta's angel investor and partner at ZhenFund. Before founding Momenta, Cao had also reached out to Google and Apple, but found these companies weren't seriously considering that achieving full autonomy would require massive amounts of vehicles and data. "He didn't start a company for the sake of starting a company—he genuinely believed this was the optimal way to build driverless technology."
The non-mainstream path of mass production plus data-driven development made Momenta's progress appear slow at times. While a cohort of Robotaxi companies competed on testing zone size, fleet numbers, and ever-fewer disengagements, Momenta—focused primarily on pre-development work with automakers—had few demonstrable results to show.
The turning point came in 2020: Tesla's massive sales success, with its market cap surpassing Toyota and Volkswagen combined, led more automakers to view advanced intelligent driving as an inevitable direction—benefiting mass-production suppliers in finding customers. In March 2021, SAIC invested in Momenta and procured its solution for its EV brand IM Motors. That same year, Momenta completed its massive funding round.
The industry's primary metric for evaluating intelligent driving companies was no longer PhD count, papers, disengagements, or funding and valuation, but rather automaker customer count, model order volume, and delivered product reputation. A number of originally Robotaxi-focused companies also attempted to enter the mass-production autonomous driving market, but made slow progress. Momenta's competitors shifted from startups to major corporations—Huawei's Intelligent Automotive Solution unit, DJI Automotive, and SenseTime's Auto business.
The gradual rise of end-to-end large model intelligent driving trends in recent years has given data-driven development a more concrete implementation path.
In 2021, Tesla proposed using Transformer architecture models for perception at its AI Day, seen as the embryonic form of "end-to-end." But Tesla didn't elaborate on how the decision-making component of the intelligent driving system should "think" about driving using deep learning methods—other companies had to find their own paths.

Autonomous driving has three modules: perception, planning and decision-making, and control. Perception "sees," decision-making "thinks" about how to drive, and the control module executes driving actions. End-to-end technology uses one large model to accomplish the entire process from perception to decision-making.
Momenta began pre-research on deep learning-based decision-making modules in 2020. By early 2022, with mass-production of highway NOA imminent by year-end, a choice lay before them: use deep learning for the decision-making module in their mass-production solution, or stick with traditional rule-based methods? Momenta ultimately chose to prioritize deep learning methods, with rule-based methods as a safety backstop.
In 2024, Tesla FSD V12 launched, with perception through decision-making becoming one continuous large model, making "end-to-end" mainstream. According to multiple media reports, Tesla FSD may launch in China this year.
End-to-end large models are essentially an advanced form of data-driven development: sensor data from cameras and other inputs goes in one end, and vehicle trajectory comes out the other, telling the system how the car should drive.
After making that decision in early 2022, Momenta mass-produced a highway NOA using deep learning for decision-making by year-end. They later merged their vision model and decision model into one continuous end-to-end large model, making Momenta among the first Chinese companies to mass-produce an end-to-end solution. Currently, some owners of SAIC's IM Motors, BYD's Denza, and GAC's Hyper have already gained access to its end-to-end mapless NOA.
Changes and competition in the intelligent driving industry continue. Tesla remains the most favored: building its own cars, self-developing intelligent driving systems and chips. An unanswered question persists: how much space will automakers with in-house development ultimately squeeze out for suppliers?
Companies entering intelligent driving from the chip angle, such as NVIDIA and Horizon Robotics, have also complicated the battlefield, now respectively serving Mercedes-Benz and Volkswagen, two important automakers.
Faster multi-model delivery is Momenta's current advantage. The new energy vehicle price war has now lasted a year and a half, with cost pressures transmitting to the intelligent driving tier—automakers paying little or even no R&D fees has become common, and introducing multiple suppliers for the same model to compare prices and compete has become standard practice among Chinese automakers. Without high delivery efficiency, the more customers an intelligent driving company has, the more it loses.
"Momenta's strength lies in multi-platform adaptability, being able to simultaneously support multiple hardware platforms with rapid deployment and iteration," said one intelligent driving evaluator.
Momenta now delivers dozens of models simultaneously with a team of over 1,300 people. Huawei's intelligent driving team, including outsourced personnel, totals over 4,000 people serving roughly ten models.
Facing new competition, Cao believes that globally there may ultimately be only 3–4 intelligent driving Tier 1 suppliers, because this is an industry with high barriers and strong scale effects—the performance evolution brought by end-to-end technology will continue to strengthen leading companies' advantages.
Cao summarizes this as intelligent driving's Moore's Law: software experience improves 10–100x every two years; hardware BOM (bill of materials) cost halves every two years.
He believes that the surviving leading companies will likely be Tier 1 suppliers that vertically integrate intelligent driving software and computing platforms—only this structure makes it possible to surpass intelligent driving's Moore's Law in cost reduction and performance improvement. Momenta currently does not control computing platforms, which is its weakness and also its opportunity for the next phase.
Below is the conversation with Momenta founder and CEO Cao Xudong.
01
"FSD entering China is a turning point where good money drives out bad"
Q: A little over a year ago, your biggest concern was whether high-end intelligent driving solutions were good enough to help automakers sell cars. Now, launching premium intelligent driving solutions with NOA functionality has become consensus among automakers. How did this change happen?
Cao Xudong: The key is the improvement in intelligent driving experience—tenfold in two years.
Last year was an important inflection point. Early this year I went to the US to test-ride FSD v11 and Waymo. At that time, v11's experience still had a huge gap with Waymo. But the latest v12 is already very close to Waymo. Tesla FSD evolved at least tenfold in that half-year.
Q: What about your own evolution speed?
Cao Xudong: Our urban NOA solution has improved 10x from early this year to now. Basically, if there's a road, it can drive; if there's navigation, it can drive.
Q: Any performance that impressed you?
Cao Xudong: Around Qingming Festival this year, I was passing through a small town one night when someone was burning paper money by the roadside, taking up half a lane. I told the safety driver we'd probably need to take over. But as the car approached the fire, it swerved around it. We tried several more times — it dodged every time. I even filmed it.
The capabilities of end-to-end large models sometimes exceed imagination. Burning paper piles were never defined in our perception models.
Q: Within China, including automakers with in-house development, who sits in the first tier of end-to-end advanced autonomous driving? Who's the rival you care about most?
Cao Xudong: Huawei. We've had clients evaluate us head-to-head — in Shanghai and Shenzhen, Huawei's two strongest cities, we're competitive with each having strengths. But in other cities like Beijing, Baoding, Hangzhou, and Guangzhou, our product experience and generalization capability are better than Huawei's.
Q: You have won the most vehicle-model nominations among advanced autonomous driving suppliers, but you do not have the highest number of vehicles in service. QCraft's highway NOA solution rolled out to 400,000 vehicles this year, surpassing both you and Huawei.
Cao Xudong: That number changes fast. To judge an autonomous driving company's market position, the most important factor is post-mass-production product experience, followed by client quality, then client quantity. Because any high-quality client wants to give their users the best product — cooperation expands from one model to three, five, ten models, and volume follows.
Q: Elon Musk says FSD could enter China by year-end. What happens then?
Cao Xudong: It's somewhat like when Tesla was brought into Shanghai — good money drives out bad. Before that, Chinese EV quality was uneven. But after Tesla came in, only automakers that could match Tesla's experience could survive and thrive.
FSD coming to China will intensify competition, but it will at least first raise perceived value, consumer reputation, and awareness to the point where people are genuinely willing to pay.
If you drive prices down low before experience and value are sufficiently established, and then have to sacrifice value to hit those low prices, consumers end up spending a few thousand yuan for a solution that doesn't work well. They'll conclude autonomous driving is a scam. Even cheap, it won't sell sustainably.
Q: FSD is also licensing externally — Musk says they'll sign their first automaker client this year. What changes would that bring?
Cao Xudong: If you were a major automaker, would you choose FSD? No — the commercial risk is too great.
02
"There might be 20 global top-tier automakers, but only 3 or 4 autonomous driving companies"
Q: You previously said that considering R&D and operational investment, an autonomous driving company needs to serve 10 million vehicles annually to live relatively well. What would achieving 10 million actually mean? (Over the past 3 years, global annual passenger vehicle sales have been between 55–65 million.)
Cao Xudong: At least global top two. Autonomous driving will likely end up with three or four companies globally, maybe two or three in China. The pattern could be 7:2:1 — first place takes 70% of the market.
Because this is a sector with strong moats and strong scale effects. Such fields tend toward high concentration, like chips.
And consumer demand for autonomous driving is uniform: safety, peace of mind, efficiency, intelligence. There's only good and better. Cars, by contrast, have strong fashion and interior-design attributes — to each their own — so concentration won't be as high. There might be 20 global top-tier automakers.
Q: What form do you think the autonomous driving companies that survive to the endgame might take? Why?
Cao Xudong: If the endgame thesis holds, you'll see the remaining two or three or four companies ultimately pursuing vertical integration.
I've previously described the Moore's Law of autonomous driving: software experience improves 10 to 100x every two years; hardware BOM cost halves every two years. BOM cost was 15,000–20,000 yuan two years ago, now it's 7,000–10,000, and it'll reach four to five thousand in the future.
Vertical integration is required to achieve this. Our goal is to surpass autonomous driving's Moore's Law.
Of course, not everything needs to be built in-house. Sensors and domain controllers, for example — the industry already does these well. The optimal approach is identifying the best one or two companies for deep collaboration.
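As a rough check on the halving rule Cao describes, the short Python sketch below projects the quoted starting range forward under strict halving every two years. The function name `project_bom` and the assumption of a smooth exponential decline are illustrative and do not come from the interview.

```python
def project_bom(cost_yuan: float, years: float, halving_period_years: float = 2.0) -> float:
    """Project hardware BOM cost, assuming it halves every `halving_period_years`."""
    return cost_yuan * 0.5 ** (years / halving_period_years)


# Range quoted in the interview for "two years ago" (yuan).
start_low, start_high = 15_000, 20_000

for years in (0, 2, 4):
    low = project_bom(start_low, years)
    high = project_bom(start_high, years)
    print(f"after {years} years: {low:,.0f} to {high:,.0f} yuan")
```

Under strict halving, 15,000 to 20,000 yuan becomes 7,500 to 10,000 after two years and 3,750 to 5,000 after four, which is in line with the 7,000 to 10,000 and four to five thousand figures Cao cites.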
Q: Corresponding to your 7:2:1 endgame, leading autonomous driving companies will need even greater volume, serving more price segments. You've mainly served mid-to-high-end models. What's next?
Cao Xudong: From the second half of this year to the first half of next year, our solution could reach mass production on 150,000-yuan vehicles. Besides highway NOA, it will include advanced features like urban memory pilot. This is how the automotive industry works: new technology always debuts on expensive cars first, then moves down a price tier once products mature and costs drop.
Q: Your past solutions were mainly based on NVIDIA chips; this year you also started working with Qualcomm. Is this primarily to bring advanced solutions to cheaper vehicles?
Cao Xudong: That's one reason. Another is Qualcomm's excellent power efficiency, enabling deployment on gasoline vehicles and better suitability for hybrids.
03
"Building end-to-end isn't hard; raising its floor is"
Q: The industry is all talking about end-to-end large models now. How do you understand and implement end-to-end?
Cao Xudong: The broad direction is right, but beneath it there may be 10,000 paths, of which only 10 actually work.
We started experimenting with end-to-end, or deep learning approaches, as early as 2020. For the perception module, this became relatively mature by 2020–2021 — good companies in the industry roughly knew how to do it.
The hard part was using deep learning for planning and control, that is, decision-making and path planning. At the time, some believed planning and control simply weren't suited to deep learning; there were even extreme voices saying anyone who advocated deep learning for planning and control was a fraud. Because deep learning methods are less interpretable, some conflated uninterpretability with unsafety, but these are two different things.
Q: Using end-to-end for planning and control was a non-consensus bet back then?
Cao Xudong: Yes. A major decision we made was in early 2022 — we needed to mass-produce highway NOA by end of 2022, and faced a choice: in one year, would we deploy deep learning-based planning and control, or traditional rule-based planning and control?
We had both at the time. Deep learning offered high ceiling, low floor — stunning in some scenarios, but sometimes making bizarre errors. Traditional methods had low ceiling, relatively high floor. So the former occupied a wide range, the latter a narrow range. A wide range means opportunity, but also risk.
We ultimately decided to go with deep learning planning and control — a fairly risky decision, because we didn't yet fully know how to solve its very low floor problem, and we only had one year; moreover, autonomous driving has extremely high safety requirements — even solving 99% of floor issues isn't enough for product release. If the car occasionally behaves strangely, users immediately stop using it.
Q: How did you later solve the low-floor problem of deep learning methods?
Cao Xudong: It's a system. First, we used a "functional scenario tree" to identify problems. We mapped out hundreds of fine-grained scenarios and evaluated them to learn which scenarios deep learning handled acceptably and which might have unexpected issues.
Then we analyzed whether it was an algorithm problem or data problem. If algorithmic, we'd modify model architecture, run experiments, check if the problem was solved.
If data-related, we'd examine whether bad driving behaviors had been introduced. Because if you feed human driving data in, the model learns it all — it struggles to distinguish good from bad. The low floor of end-to-end stems from this.
This requires building a data pipeline, knowing how to create, mine, and clean good data. We have a dedicated team for this, with an entire data production line for creating, mining, and cleaning.
Finally you need an evaluation system to judge whether improvements genuinely raised performance.
By end of 2023, we'd merged the deep learning planning and control with the perception module into an end-to-end large model. It's like LEGO — assembling it isn't hard. The hard part is continuing to solve the low-floor problem. The system I described carries through this process.
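The first step Cao describes, using a scenario tree to find where the model is weak, can be pictured with a small Python sketch. Everything here is hypothetical and for illustration only: the `ScenarioNode` class, the scenario names, the pass rates, and the threshold are invented, and the interview does not say how Momenta's functional scenario tree is actually built or scored.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ScenarioNode:
    """One node in a hypothetical scenario tree; only leaves carry a pass rate."""
    name: str
    pass_rate: Optional[float] = None
    children: List["ScenarioNode"] = field(default_factory=list)


def failing_leaves(node: ScenarioNode, threshold: float,
                   path: Tuple[str, ...] = ()) -> List[Tuple[str, float]]:
    """Depth-first walk returning (path, pass_rate) for leaves below the threshold."""
    here = path + (node.name,)
    if not node.children:
        if node.pass_rate is not None and node.pass_rate < threshold:
            return [(" / ".join(here), node.pass_rate)]
        return []
    found: List[Tuple[str, float]] = []
    for child in node.children:
        found.extend(failing_leaves(child, threshold, here))
    return found


# Invented example tree and numbers.
tree = ScenarioNode("highway NOA", children=[
    ScenarioNode("lane change", children=[
        ScenarioNode("dense traffic", pass_rate=0.999),
        ScenarioNode("construction zone", pass_rate=0.95),
    ]),
    ScenarioNode("ramp merge", pass_rate=0.9995),
])

failing = failing_leaves(tree, threshold=0.999)
for scenario_path, rate in failing:
    print(f"needs triage: {scenario_path} (pass rate {rate})")
```

Each flagged leaf would then go through the later steps in Cao's description: deciding whether the cause is the algorithm or the data, fixing it, and re-evaluating. Those steps depend on engineering judgment and are not modeled here.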
Q: End-to-end mapless autonomous driving solutions are still rapidly evolving, with many technical choices ahead. How do you increase the probability of consistently making correct technical judgments?
Cao Xudong: Our company has a culture called "low-cost, short-cycle trial and error." Before exploring, first clarify the core hypothesis, extract it for testing — this shortens cycles and reduces costs.
If we'd jumped straight into end-to-end large models, each training run might cost over a million dollars. Instead, we first tackled the least mature piece of end-to-end — deep learning planning and control — which cost far less money and time. And conclusions and methods from this process could transfer to larger models.
Our company culture has two tenets: customer value-centricity, and low-cost short-cycle trial and error.
04
"You can never achieve full driverless with 1,000 vehicles"
Q: Momenta was hard to categorize in its early years. Unlike established Tier 1s like Bosch, Continental, and Valeo, you didn't do lane-keeping, adaptive cruise control, and other L2 assist features — your goal was full driverless. But unlike contemporaries like Pony.ai and WeRide that tested on retrofitted vehicles, you chose to do mass-production advanced driver assistance for automakers. Why?
Cao Xudong: Because we believed full driverless was impossible without mass production. Full driverless requires solving one-in-ten-thousand, one-in-a-hundred-thousand edge cases — you can never achieve full driverless with 1,000 vehicles.
The path we saw was later summarized as "one flywheel, two legs." The flywheel is the data flywheel: using data-driven methods to discover and solve edge cases, encompassing algorithms, the entire data pipeline, and an automated toolchain.
The two legs are mass-production autonomous driving and full driverless — these two products share the same architecture. First run the full driverless system on test vehicles; once relatively mature, push to mass production; then mass-production data feeds back to develop the next-generation full driverless product. The two legs coordinate.
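The scale argument behind "you can never achieve full driverless with 1,000 vehicles" can be made concrete with a back-of-the-envelope Python sketch. The 100 billion kilometer target and the fleet sizes appear elsewhere in this article; the annual mileage per vehicle is an assumption chosen purely for illustration, and the sketch further assumes every kilometer driven yields usable data.

```python
TARGET_KM = 100_000_000_000          # "100 billion kilometers" cited in the article
KM_PER_VEHICLE_PER_YEAR = 15_000     # assumption for illustration, not from the article


def years_to_target(fleet_size: int) -> float:
    """Years for a fleet to accumulate TARGET_KM at the assumed annual mileage."""
    return TARGET_KM / (fleet_size * KM_PER_VEHICLE_PER_YEAR)


# Test fleet, roughly today's installed base, and the 10-million-vehicle goal.
for fleet in (1_000, 100_000, 10_000_000):
    print(f"{fleet:>10,} vehicles: {years_to_target(fleet):,.1f} years")
```

With the assumed 15,000 km per vehicle per year, a 1,000-vehicle fleet would need several thousand years to reach the target, while a fleet of 10 million would need less than one.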
Q: Years ago, another autonomous driving company founder told you that trying to do both might mean "chasing two rabbits, catching neither."
Cao Xudong: It is indeed harder. Other Robotaxi companies can use massive compute and numerous high-performance sensors, but we must work on mass-production chips, with mass-production sensors, and our Robotaxi software architecture must be compatible with mass-production software architecture — all constraints.
Short-term, this increases difficulty. But long-term, the advantage lies in the synergy of technology flow and data flow between the two legs, so this is still the right choice.
Q: Taking your logic to its extreme, it seems you should build your own cars.
Cao Xudong: We've discussed it, but we don't think we could create new value there, and it's not our passion. You can tell that Bin (Li Bin) and Li Xiang genuinely love building cars, just like we genuinely love AI.
Q: You were founded in 2016, but didn't really land your first customer until 2020. Why did it take so long?
Cao Xudong: Tesla and XPeng helped us a lot. Before 2020, when you went to talk to customers about advanced features, it was hard to move orders forward. Most people didn't think much of Tesla back then, and its autonomous driving approach was somewhat radical; more automakers preferred to roll out autonomous driving gradually.
In 2020, Tesla's sales and market cap soared, and many automakers began investing more heavily in intelligence. Before that, we might be talking to a director or VP at a car company; after that, it was presidents and CEOs coming to the table.
Q: Some people think suppliers choosing big clients is a bet. Take air suspension, for example: Konghui bet on Li Auto, Baolong on XPeng, and Konghui ended up becoming China's market share leader in air suspension. Your first major customer was SAIC's IM Motors, which has sold fewer than 100,000 units cumulatively since its 2021 launch. Was that luck or a gamble?
Cao Xudong: Sometimes it really is fate. But companies that started mass-producing high-end solutions later wouldn't have had more freedom to choose customers either. For the same customer today, if Momenta goes head-to-head with a new player, Momenta's odds of winning are definitely higher.
05
"Once you've done something two or three times, you must use tools to do it"
Q: You were the earliest autonomous driving company to start mass production deliveries, and now you're also the one delivering on the most models simultaneously. Mass production delivery is a hurdle for B2B tech companies. How many stages have you gone through on delivery?
Cao Xudong: The first stage was IM Motors' delivery, where everything was a first. The second stage was BYD's delivery: a new customer, a new model, and we still hit the 0-to-1 part. Now we're in the third stage: whether it's an existing or new customer, it's all 1-to-N.
Q: What's the difference between 0-to-1 and 1-to-N?
Cao Xudong: The most direct difference is human efficiency. For IM Motors' delivery, we had over 400 people working for a year and a half, counting R&D and delivery combined. In the second stage, it was roughly several dozen people for about half a year. Now, delivering a new model for an existing customer takes just a few people for half a year. We're currently delivering on several dozen models simultaneously.
Q: Delivering several dozen models at once — even Huawei hasn't reached that state. How was this achieved?
Cao Xudong: Through a strong mainline product and a rapid adaptation process. The mainline is a product and solution that continuously accumulates value. You absolutely must hold to one mainline without branching off.
Q: Holding to one mainline doesn't sound that hard?
Cao Xudong: The reality is that people on the front lines constantly face various short-term pressures, which easily leads to short-term behavior. Say there are 10 ways to solve a problem, and only one is on the mainline, but the other branches might be easier and less work in the short term, so someone might choose a branch. This creates multiple branches across different customers and models, and once you open a branch, you have to continuously develop and maintain it. Everyone gets dragged into a war of attrition, firefighting on every branch without doing any of them well.
When you look back at why a team collapsed over a period of time, it's often because they chose the easier, more short-term branch instead of the mainline, the one with the higher long-term ceiling and greater potential for accumulation.
Q: You as a supplier want long-term accumulation on a mainline, but automakers want faster, more differentiated services and solutions. How do you balance both?
Cao Xudong: This requires a set of processes and tools that can rapidly adapt to different customer needs on a unified architecture. We summarize them as the Three Treasures: M-Framework, M-Adapter, and M-Box.

Momenta process schematic, where MF's Mpilot refers to mass-production autonomous driving solutions and MSD refers to fully driverless solutions.
M-Framework is a software algorithm architecture that can be applied to different vehicle models, sensors, and chips, and also helps frontline people determine whether something is on the mainline. M-Adapter is a standardized and automated process for adapting to new vehicle models; M-Box is a hardware development kit.
Adapting autonomous driving systems to different vehicle models is a process of continuous optimization on engineering prototypes. During mass production, the hardware base of engineering prototypes is quite unstable — cameras, domain controllers, and other components can have various issues.
So you need tools like M-Adapter to efficiently inspect prototype status and identify problems. Otherwise, once a problem flows from upstream to downstream, the cost and difficulty of discovering and solving it could be 100 times greater. And when suppliers are modifying parts, we can use hardware from M-Box to replace problematic components, ensuring parallel development progress. There are thousands of correlated signal components in vehicle development — if even one fails, nothing works.
Q: This process sounds very mature now. But after 2021, you once faced considerable delivery pressure, used a lot of people, and Momenta became known as a company with very high work intensity.
Cao Xudong: There's a process to this. You must first have the people who can develop tools do the delivery, the hard, grinding work, so they can understand the delivery process and its pain points. Otherwise, developing tools would be shooting in the dark.
Q: You've said you hope people don't see themselves as tools, but learn to use tools. But this goes against human nature, because when tools are immature, being a tool takes less brainpower and is easier. How do you get people to actually use tools?
Cao Xudong: Whoever does well in this area gets more rewards and opportunities: money, recognition, being held up as a model. At the same time, select people with stronger capabilities in this area as leaders, and have them guide others.
Once you've done something two or three times, you must use tools to do it — you can't keep using people. Everyone needs to know that using tools to solve problems is what the company encourages.
Q: Actually, the process-oriented and tool-using mindset you're describing is fairly routine for automotive suppliers that produce standard parts. Why is it such a massive undertaking for intelligent driving companies?
Cao Xudong: Autonomous driving algorithms and technology are iterating rapidly. Establishing standardized, process-oriented delivery systems for an object that's continuously evolving is much more difficult.
Q: One mainline, Three Treasures — this doesn't sound "intelligent" at all. It sounds more like a slogan from an assembly line.
Cao Xudong: This all came from practice. I remember Eason Chan has a lyric: "You should live like an advertisement."
Many management practices should also be like an advertisement — only then will people really remember and understand them, and they'll enter people's minds.
The ideas of data-driven algorithms, and mass-production assisted driving running in parallel with fully driverless driving, existed before the company was founded. We were working in this direction in the early years too. But by late 2018, if you asked around from top to bottom and left to right, you'd find only 20% of people were working in this direction — 80% were still in a chaotic state. Later we summarized this strategy as "one flywheel, two legs," and we even wore it on our bodies.
Q: You're wearing it today.
Cao Xudong: From 2021 to 2023, we also wore the Three Treasures on our bodies. (On the back of the culture shirt.)
We invested especially heavily in the Three Treasures during these three years. Now they're relatively mature, so recently we changed it to "Save a million lives in ten years." As we serve more and more vehicle models, safety has become more important.

Front of Momenta's culture shirt ("one flywheel, two legs") and back (currently "save a million lives in ten years"). Some Momenta engineers joke that they work at "Flywheel Company."
06
"The hardest choice is not changing your original choice"
Q: Large language models have been booming from last year to now. Do you ever feel like you've stepped away from the center of the storm?
Xudong Cao: I've experienced similar situations several times. When large models were hottest last year, multiple investors suggested we work on large models, since we have a lot of GPUs. I said we would definitely do autonomous driving large models, but we'd sit out other application scenarios for now.
Q: Before the Dragon Boat Festival, in an autonomous driving engineer group chat, people were discussing which companies gave out zongzi, and someone said "First rule out MMT (Momenta)." Did you actually give them out? What do you think of your industry image?
Xudong Cao: No. This is probably just a meme people like to play with. Our one-liner to frontline employees is "More money, high human efficiency." It used to be "High human efficiency, more money," but we changed it — because high human efficiency is what I want, and more money is what employees want. We need to guarantee what employees want first. For frontline employees, I won't talk about passion or paint big pictures. Expecting them to fully understand and identify with the company's strategy, mission, and vision is wishful thinking.
Q: Where do you spend most of your energy now?
Xudong Cao: Technology and product. You'll find that technology and product have especially strong leverage effects — when the product is good, all customers are happy; when the product has problems, all customers are unhappy.
Q: You posted on Moments specifically about leverage not long ago.
Xudong Cao: Yes. When I first started my company, I read Andy Grove's High Output Management. I forgot almost everything in it, but one concept stuck with me: "managerial leverage."
You must do things with leverage effects; if you can do two things, and one has leverage of one while the other has leverage of a hundred, focus on the one with leverage of a hundred.
Q: What is organizational leverage?
Xudong Cao: Continuously hiring better people and selecting better people. The ultimate expression of company culture is who rises and who falls, who stays and who leaves.
Q: As CEO, what have you improved on most in the past year or so?
Xudong Cao: It's still what I just mentioned about leverage effects. Not only do I myself need to invest in things with leverage effects, but I also need to direct the energy of the people who know the situation best and bear the greatest responsibility toward things with leverage effects.
Q: How is this achieved?
Xudong Cao: There are many methods. For example, we have something called "Implement Three Grasps." Grasp people and tasks, grasp alignment, grasp progress.
Grasping people and tasks: first, you need to position things well and give them a good name that reflects their positioning; at the same time, clearly identify who is most relevant to this task. Grasping alignment means aligning on the direction and principles of the task, the general approach, each person's responsibilities, and the timeline.
Grasping progress means communicating frequently about what matters. This doesn't happen through weekly reports or weekly meetings; it's more about daily conversations. You'll find that people often neglect important things not because they don't know they're important, but because important things tend to be vague, and people don't know how to do them. So of course they default to what they do know. In reality, the less clear something is, the more you need to share information. You need to keep important topics top of mind for the right people: talk about them every day, think about them every day, and good actions will eventually emerge.
Q: You've been an entrepreneur for nearly eight years. What's been the hardest decision you've had to make?
Cao Xudong: Not changing my original choice — insisting on using the data flywheel to build autonomous driving, and establishing data-driven algorithms. This seems like industry consensus now, but back then, truly no one knew how to use data to drive the full autonomous driving workflow. I can say now that we've achieved full workflow data-driven development.
Q: For Momenta to survive as a top player in the intelligent driving industry, what are the biggest pitfalls and risks ahead?
Cao Xudong: We still need to surpass Moore's Law in intelligent driving, and we need to overcome the policy barriers to going global. The rise of Chinese industry must involve selling China's good products worldwide, as Japan, South Korea, Germany, and the US all did before.
Q: Beyond autonomous driving, what other technologies excite you right now? You mentioned being very interested in humanoid robots over a year ago — have you explored that?
Cao Xudong: Not yet, because autonomous driving is an enormous market and a tremendous opportunity. Right now, we're absolutely focused on becoming number one in China and number one globally. We need to stay focused.
When I was in college over a decade ago, I read a book called The Future of Artificial Intelligence. Back then, I was genuinely electrified reading it — every cell in my body was excited. A few years ago, I picked it up again and found that the book had fallen far behind my understanding of intelligence. This showed me how many steps the entire industry had advanced beyond the pioneers of the early 2000s. Only looking back did I feel the full weight of that.
General AI will definitely happen in our lifetimes, and we will certainly be among its important contributors. If autonomous driving takes ten years, this next thing could take twenty or thirty. I'll still be in my fifties in twenty years.