A Conversation with Xudong Cao: Eight Years of Entrepreneurship and Why He Chose the "One Flywheel, Two Legs" Strategy | Z Talk

ZhenFund · March 4, 2025

When making decisions, always let customer value be your guide.

Z Talk is ZhenFund's channel for sharing perspectives.

Momenta founder Xudong Cao graduated from Tsinghua University and previously conducted AI research at Microsoft Research Asia and SenseTime, among other companies, bringing extensive R&D and management experience. In 2016, the 30-year-old Cao resigned from SenseTime to found Momenta, dedicated to building a "data-driven autonomous driving" solution.

As Momenta's first-round investor, Yuan Liu, partner at ZhenFund, recalls, "Autonomous driving is an irreversible trend that will inevitably be realized sooner or later." Since its angel investment, ZhenFund has witnessed Momenta's rise. By the end of 2023, Momenta had secured 10 mass-production design wins covering 30 vehicle models.

In a recent in-depth interview, Cao shared Momenta's perseverance through the cycles of the intelligent driving industry. He stated that the company's ultimate goal is achieving scaled L4 autonomous driving, which requires mass production of L2 systems to accumulate data and experience. He also pushed back on the industry's description of Momenta as "the earliest and most successful company to transition from L4 to L2," calling it a misunderstanding.

Cao also reflected that over eight years of entrepreneurship, he has transformed from a technical specialist into a business manager, and the company's cultural orientation has undergone tremendous change—"from a loosely structured research institute to a customer-value-centered team that can fight hard battles and win them."

Below is the full interview.

Interviewed by | China Entrepreneur reporters Wentong Wang and Yafei Ren

Written by | China Entrepreneur reporter Wentong Wang

Edited by | Jiying Ma

Cover photo by | Pan Deng

In front of a wall covered with photos from "Town Hall" events (Momenta's all-hands meetings), Momenta founder and CEO Xudong Cao, preparing for the interview, quickly removed his jacket to reveal a black T-shirt underneath, printed with Momenta's data-driven flywheel logo and the company slogan—"TEN YEARS SAVE ONE MILLION LIVES."

In recent years, this outfit has virtually become a second skin.

Momenta's photo wall, displaying photos from Town Hall team-building events

In 2024, Cao earned the title "Iron Throne King" on Umetrip: 123 flights totaling nearly 170,000 kilometers, surpassing 99.98% of Umetrip users.

This also reflects the company's pace of development. In 2024, Momenta completed the mass production 1-to-10 phase, simultaneously mass-producing up to 10 vehicle models at its peak; entering 2025, Momenta became the first autonomous driving company in the industry to enter the mass production 10-to-100 phase.

In the autonomous driving industry, Momenta is regarded as the most successful company in pushing L2 mass production, but Cao believes that "the earliest and most successful company to transition from L4 to L2" is a misunderstanding of Momenta within the industry. "Our company has always had a two-legged strategy: one leg is mass production autonomous driving (consumer-facing products including traditional L2 and L2++ capable of city NOA functions), and the other leg is fully driverless. The mass production autonomous driving leg stepped out first and may have moved further ahead, but fully driverless has always been progressing too. Our strategy was never to do L4 first and then abandon L4 for L2. That doesn't match our company's actual development."

Recently, Momenta has been seeking benchmark cities overseas for Robotaxi pilots.

In the view of one of Momenta's early investors, this company's temperament differs from other autonomous driving firms.

As founder, Cao lacks a Baidu background or overseas study experience. In 2016, the then 30-year-old Cao resigned from SenseTime to found Momenta.

At the time, autonomous driving was an investment hotspot. Numerous startups were keen to build demo vehicles for investors to try, immersed in an optimistic atmosphere in which large-scale deployment seemed just around the corner.

Momenta was the exception. For two years after this investor's investment, he never rode in a Momenta demo vehicle. In his view, Momenta's goal was crystal clear: first serve automakers to refine technology and accumulate data, then achieve scaled L4. "They were extremely focused on building the data flywheel, almost resembling a data company," the investor said.

As Momenta's helmsman, Cao, who studied at Tsinghua, still maintains habits from his Tsinghua mountaineering team days: set goals, collaborate as a team, progress steadily. When he feels pressure, his approach is to rush to the front lines to solve problems.

If there's any change over eight years of entrepreneurship, it's his transformation from technical person to business manager, and the tremendous shift in the company's overall cultural orientation—"from a loosely structured research institute to a customer-value-centered team that can fight hard battles and win them."

He doesn't deny experiencing torment and confusion, but believes this is the inevitable path to building a great company.

According to ResearchInChina data, from January to October 2024, Momenta held 60.1% market share in the city NOA third-party intelligent driving market, leading Huawei (Huawei Hi mode), Baidu, and others.

When it comes to competitors, Cao identifies Tesla. How to win in competition? "Further develop the practical experience and corporate culture accumulated through fighting and winning hard battles, and on this foundation, innovate more—and dare to innovate," he said.

In early January, China Entrepreneur had an in-depth conversation with Cao, who shared his eight-year entrepreneurial persistence, costs, and personal growth, along with his industry observations.

Below are highlights:

  • Public tolerance for Robotaxi safety isn't that high. Only when the safety ceiling is sufficiently high can scale expand.

  • Achieving scaled L4 through owned fleets is extremely difficult; only through mass production autonomous driving, selling 10 million vehicles, is it possible.

  • For repetitive tasks, standardization, defined processes, and automation are essential.

  • When an organization introduces new concepts, there will always be 20% who actively embrace them. If you can unite this 20% to build benchmarks and provide corresponding value returns, others will follow.

  • Original intention matters. You don't start a company to seek fame and success; you do it simply because you genuinely love the work.

  • The most important thing for a company is creating value for customers. When making choices, always be customer-value-oriented.

  • A vague culture and vague strategy become concrete actions, these actions convert to results, and these results become employee returns—this is certainty.

01

On competing with Waymo: "Don't you think this is incredibly exciting?"

Q: What goals did you set for Momenta in 2024?

Xudong Cao: Reach for the sky, cover the earth.

"Reach for the sky" refers to technology. For mass production autonomous driving in 2024, that mainly meant mapless city NOA, and we did reach an industry-leading level there.

"Cover the earth" means more customers and more vehicle models in mass production. This demands a very strong capability: using one main thread to deliver and adapt, with high quality and speed, for different customers and different vehicle models. This is extremely difficult, and it can't be solved by throwing people at it. It requires a strong system that automatically handles much of the delivery and adaptation work. Building out this delivery system is crucial.

Beyond "reach for the sky, cover the earth," we also have the flywheel system, which may be the most important piece of system building behind everything. Internally we call it "build system, get results." In the 0-to-1 phase, you get results first and then build the system; in the 1-to-n phase, you build the system first and get better results by improving it.

Q: Some autonomous driving companies firmly pursue the L4 path, while Momenta is better known for its L2 business—why this choice?

Xudong Cao: Our ultimate goal is achieving scaled L4, which requires data-driven automated problem-solving. Our calculations show we need 100 billion kilometers of data, equivalent to what 10 million passenger vehicles accumulate in one year. Therefore we determined that achieving scaled L4 through owned fleets is extremely difficult; only through mass production autonomous driving, selling 10 million vehicles, is it possible.
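The arithmetic behind this answer can be checked with a short sketch. The 100 billion kilometer target and the 10 million vehicle fleet come from the interview; the 1,000-vehicle owned fleet and all function and variable names are illustrative assumptions, not figures or tools from Momenta.

```python
# Figures quoted in the interview.
TARGET_KM = 100_000_000_000          # 100 billion kilometers of driving data
MASS_PRODUCTION_FLEET = 10_000_000   # 10 million passenger vehicles


def implied_km_per_vehicle_year(target_km: float, fleet_size: int) -> float:
    """Annual mileage per vehicle implied by reaching target_km in one year."""
    return target_km / fleet_size


def years_to_reach(target_km: float, fleet_size: int, km_per_vehicle_year: float) -> float:
    """Years a fleet needs to accumulate target_km at a fixed annual mileage."""
    return target_km / (fleet_size * km_per_vehicle_year)


km_per_year = implied_km_per_vehicle_year(TARGET_KM, MASS_PRODUCTION_FLEET)

# ASSUMPTION for illustration only: an owned test fleet of 1,000 vehicles
# driving the same annual mileage as a mass-production vehicle.
OWNED_FLEET = 1_000
owned_fleet_years = years_to_reach(TARGET_KM, OWNED_FLEET, km_per_year)

print(f"Implied mileage: {km_per_year:,.0f} km per vehicle per year")
print(f"Owned fleet of {OWNED_FLEET:,} vehicles: {owned_fleet_years:,.0f} years")
```

Under these assumptions, the quoted figures imply about 10,000 km per vehicle per year, and a 1,000-vehicle fleet at that mileage would need on the order of 10,000 years to reach the same total, which is the gap the answer points to.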

Q: What problems did you encounter in the transition from L4 to L2?

Xudong Cao: First, let me clarify again: our company has always had a two-legged strategy. One leg is mass production autonomous driving, and the other is fully driverless. The mass production autonomous driving leg stepped out first and may have moved further ahead; fully driverless has been progressing all along as our second step, and it has always been part of our strategy. Our strategy was never to do L4 first and then abandon L4 for L2. That doesn't match our company's actual development.

I don't know why, but this narrative took hold in the industry and spread especially widely. Many people say we're the earliest company to transition from L4 to L2, and also the most successful at it. Actually that's not the case. From day one, our company believed that mass production autonomous driving is the inevitable path to scaled L4.

Mass production autonomous driving is difficult in many respects. First, the problems you encounter in scaled mass production differ from those you encounter with several hundred or 1,000 vehicles. A fleet of that size runs into a limited set of problems, but scaled mass production runs into all kinds of long-tail problems, and the number of problems to solve grows by orders of magnitude.

Second, mass production vehicles are cost-sensitive, so the sensors and computing units you can use are under significant cost constraints. You need to "dance" within these constraints to create good product experiences and good technology. This is a real test of a team's fundamentals.

Third, collaborating commercially with automaker customers is a major challenge. Many people at our company come from the internet sector, and their ways of thinking and working habits differ from those of the automotive industry. How do you minimize potential problems and maximize the value created? These are challenges in the commercialization process, and we've solved them well.

Q: How do you prove the L4 business model can work?

Xudong Cao: I think the L4 Robotaxi business model is relatively clear—whether Baidu or Waymo, they've already fully validated this model.

As for why the business can't scale quickly right now, I think the bottleneck is mainly the safety of the technology. If the technology were already very mature, with very high safety, I believe it could scale quickly, because the business model is profitable, and very profitable.

The public's tolerance for Robotaxi safety isn't that high. With human taxi drivers, you might see one serious or fatal accident per year across a fleet of 1,000 vehicles. But if a Robotaxi company had one serious or fatal accident per year across 1,000 vehicles, the company might have to shut down.

So the core issue is still technology and safety. Only when the safety ceiling is high enough can you scale meaningfully.

Q: What are Momenta's next priorities for its L4 business?

Xudong Cao: We have three critical goals for 2025. First, achieve fully driverless operation. Second, turn gross profit positive on a per-vehicle basis. Third, expand overseas.

Right now, much of the industry has negative gross margins, and the fundamental reason is that the vehicles are too expensive — maybe one or two million RMB, or at best several hundred thousand. With those numbers, your gross margin is negative, and the larger you scale, the more money you lose.

Our company isn't one that burns cash to build demos and raise funding. We're a very pragmatic company. If we're going to scale, we need to get per-vehicle gross margin positive first, then expand from there.

For our third goal, beyond operating in China's benchmark cities, we'll likely select at least one benchmark city globally to pilot our Robotaxi. Our overseas expansion also represents Chinese technology going from China to the world.

Q: Will you be competing directly with Waymo overseas? How do you capture market share, and what's your advantage?

Xudong Cao: We will. Don't you think that's incredibly exciting?

Compared to Waymo, we have two very important advantages. First, we leverage mass-production sensors and compute hardware, so our per-vehicle cost is significantly lower than Waymo's.

Second, our technical approach. We're running a mapless solution, a large driving model solution, which lets us rapidly adapt to different cities and countries.

Of course, Waymo is very strong too. Its advantage is more than a decade of accumulated experience in Robotaxi technology and operations, which is something we need to continue learning from and catching up on.

02

Setting Benchmarks on the Right Philosophy

Q: Why did you come up with the "one flywheel, two legs" strategy?

Xudong Cao: In the second half of 2018, the company was expanding very quickly. We grew to 400 people, but operational efficiency wasn't meeting expectations. When I talked with people on the front lines, I discovered a huge gap between the company's core philosophy and what frontline employees actually understood, which led to misaligned actions. At the end of 2018, I decided to express our strategy in something as vivid as an advertising slogan.

Q: Compared to other founders, you seem to place unusual emphasis on company "slogans." Why?

Xudong Cao: I think distilling and summarizing is a good habit, and it's permeated much of our work. For example, if we create a group chat at our company, it has to have a name — no name, no group. Because the name defines the group's purpose, and we require the name to be concise. You should know what the group does just by reading its name.

Another example: when writing documents, one document should have one theme. Don't make it a hodgepodge of four or five topics. And the document title should accurately convey what the document is about.

Based on our experience, to make important things clear, with well-defined responsibilities, and move them forward quickly, you have to give fuzzy things a name first. Even if the name isn't perfect, we can revise it. Once it has a name, everyone has a shared reference point and will discuss it regularly. But if something remains nameless, in a chaotic state — this person describes it one way, that person another — it's very hard to clarify and advance.

I've found that human civilization has advanced through naming things. The Inuit, for example, named seven different colors of snow to facilitate their daily lives — some colors suitable for building houses, others for boiling water or preserving food.

Q: Did you learn this after starting the company?

Xudong Cao: I learned it continuously through the entrepreneurial process, and it may also relate to my personality. I took the MBTI test recently — I'm an INTP, and the corresponding profession is "logician." Also, I believe a crucial capability for executives is connecting theory with practice, putting summarized experience to the test in reality. Experience that helps people win battles and overcome tough challenges is good experience.

Q: In 2018, the company also shifted from a research-oriented organization to one focused on customers and products. How did that transformation happen?

Xudong Cao: These are two dimensions — one technical, one product positioning — and the shift manifested in personnel composition and company culture.

We were founded in 2016. By the first half of 2018, the company still resembled a research institute somewhat. At the end of 2018, we proposed a new company culture: oriented around customer value. This became our standard for evaluating employees. If what you built helped create customer value, you'd get more resources, a larger team, and better odds of promotion.

Q: What did that period cost you?

Xudong Cao: The cost was significant turbulence. As people left for other companies, we were nicknamed the "Whampoa Military Academy" of autonomous driving. The transformation started in the second half of 2018 and lasted roughly half a year to a year. There was a lot of ideological conflict during this time. You'd see people clashing in meetings, some leaving in anger, others quietly persevering.

Q: As founder, what did you do to build internal consensus?

Xudong Cao: To be honest, I wasn't very experienced at that point, and I didn't handle it particularly well.

I think the one thing I got right was this: despite the turbulence, resistance, and various dissenting voices, I insisted that the company's value orientation be centered on customer value. If you could create value for customers and users, you'd get more resources and room to grow. If you weren't creating value, no matter how smart, hardworking, or innovative you were — sorry, your efforts and innovations weren't in the right direction, so you'd get fewer resources and less upward mobility. I think persisting in this principle was important.

Of course, looking back, there were definitely better ways to handle it.

What I could have done better was to set benchmarks on the right philosophy. When a new philosophy is introduced in an organization, there are always about 20% of people who actively embrace it. Unite that 20%, create benchmarks across the company (whether benchmark products or benchmark technologies), and give those people good rewards. Then the middle 60% will see and follow, and eventually the bottom 20% will follow too. This is very important.

At that time, I didn't really understand this. I wanted to push everyone in that direction at once, essentially dragging or pushing people along regardless of whether they understood it. That created a lot more resistance and pain.

Xudong Cao. Photo: Pan Deng

Q: What specific benchmarks did you set at that time?

Xudong Cao: Let me give a product benchmark example. We had an R&D team in Beijing, and for easier product development and road testing, we also established headquarters in Suzhou. Suzhou is in the Yangtze River Delta, closer to customers. At that time, some colleagues who had stronger conviction about creating customer value came to Suzhou with me, and together we developed the first product prototype for mass-production autonomous driving, Mpilot. We were incredibly energized. Everyone saw what the first leg of our "one flywheel, two legs" — mass-production autonomous driving — looked like in prototype form.

I think this was very strong positive motivation for colleagues who embraced the cultural philosophy.

Once benchmarks are established, innovation stops being scattered in all directions and becomes more focused on products and user value.

Q: What impact did that round of adjustment have on team cohesion and confidence?

Xudong Cao: It absolutely strengthened them. This is why I believe any great company goes through transformations like this more than once. For a company to reach a new level, it needs to go through a transformation like this. The more times you go through it, the stronger the organizational cohesion becomes, the more the team becomes one that can fight hard battles and win, and the company enters a new phase.


03

Every Employee Is an Architect

Q: It's been about four years since the 2020 adjustment. What other changes has the company undergone?

Xudong Cao: Take our commercialization as an example. The main thing is that mass-production autonomous driving has gone from 0 to 1, 1 to 10, and this year we're entering 10 to 100. We've now secured more than 70 mass-production design wins, with several dozen mass-production vehicles in simultaneous development. We're probably the only company in the industry to have completed 0 to 1 and 1 to 10 and entered 10 to 100 — able to develop several dozen vehicles in parallel while guaranteeing quality and timelines.

Q: What does 1 to 10 mean? How was it accomplished?

Xudong Cao: 1 to 10 is actually straightforward — mass-producing 10 vehicle models, and not doing them sequentially but in parallel. Parallel execution creates enormous difficulty, because every automaker has different requirements, every vehicle has different hardware configurations, every vehicle has its own differences.

Q: How does Momenta manage to mass-produce so many models simultaneously?

Xudong Cao: We have one main thread and three treasures. The three treasures are Momenta Framework (a tool ensuring consistency in underlying algorithm frameworks between L2 and L4 products), Momenta Adaptor (an adaptation tool designed for different OEM requirements), and Momenta Box (a development kit based on multiple mass-production projects, aimed at solving the problem of software needing to run on unstable hardware and deliver on time).

During the 0 to 1 mass-production phase, we already anticipated the challenges of 1 to 10 and 10 to 100, so we developed the "three treasures" in parallel during 0 to 1 to prepare for mass production.

We've also achieved one main thread, converting different customer demands and pressures into drivers for our main thread's development.

So during mass-production development, we didn't simply adapt our existing main thread to different vehicle models and call it done. When different models ran into issues, or when different customers came to us with good requirements, we fed that back into the main thread and solved each mass-production problem by evolving the main thread itself. This gives you real economies of scale: the more customers and models you have, the faster your main thread develops. But the moment you approach mass-production delivery with an adaptation mindset, you'll find the opposite happens — more customers and models actually slow down your main thread development. Because more customers and models scatter your resources, leaving less for main thread development.

The concept sounds simple, but execution is extremely difficult. Getting different customers and different models to all feed into the main thread places enormous demands on architecture — that's what our Framework is for.

Then there's Adaptor and Box — both incredibly useful in mass-production development. Adaptor can automatically adapt the main thread to different customer requirements and different vehicle hardware configurations with virtually no human involvement.
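The contrast between evolving one main thread and adapting it per vehicle can be pictured with a small sketch. This is a purely hypothetical illustration of the idea as described in the interview, not Momenta's Framework, Adaptor, or Box: every class, field, and function name here (`VehicleConfig`, `MainThread`, `adapt`, the customer and model labels, the sensor counts) is an assumption invented for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VehicleConfig:
    """Hypothetical description of one customer's vehicle model."""
    customer: str
    model: str
    cameras: int
    has_lidar: bool


@dataclass
class MainThread:
    """One shared code line: every fix lands here and benefits every model."""
    version: int = 1
    fixes: list[str] = field(default_factory=list)

    def absorb(self, issue: str) -> None:
        # A problem found on any single model is solved in the shared main
        # thread rather than patched in a per-customer fork.
        self.fixes.append(issue)
        self.version += 1


def adapt(main: MainThread, cfg: VehicleConfig) -> dict:
    """Derive a per-vehicle build from the main thread plus a config, with no hand edits."""
    return {
        "target": f"{cfg.customer}/{cfg.model}",
        "main_version": main.version,
        "sensor_inputs": cfg.cameras + (1 if cfg.has_lidar else 0),
        "fixes_included": tuple(main.fixes),
    }


main = MainThread()
fleet = [
    VehicleConfig("oem_a", "sedan", cameras=7, has_lidar=False),
    VehicleConfig("oem_b", "suv", cameras=11, has_lidar=True),
]

# An issue reported on one model is fed back into the main thread...
main.absorb("issue reported on oem_a/sedan")

# ...and every model's build is regenerated from the same improved main thread.
builds = [adapt(main, cfg) for cfg in fleet]
for build in builds:
    print(build["target"], build["main_version"], build["fixes_included"])
```

In this toy version, a fix triggered by one customer reaches every model because each build is generated from the shared main thread and a config. With hand-maintained per-customer forks, the same fix would have to be ported to each fork separately, which is the resource scattering the answer describes.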

We want every employee in the company to be an architect, not someone solving problems manually. First you solve one or two small problems, develop deep understanding of them, then you organize a team to architect a set of tools, and use those tools to automate the solution.

Anything repetitive must be standardized, turned into a defined process, and automated.

Q: What was the biggest challenge during mass production?

Xudong Cao: I remember our first product going into mass production was pretty tough. It happened to be during the pandemic. One of our company's core values is customer-centricity. To ensure we could deliver on mass production, I went to Shanghai with many of our VPs and employees. We lived near the customer's offices, getting off work at one or two in the morning on good days, three or four on bad days, fighting shoulder-to-shoulder with the customer.


04

Eight Years of Entrepreneurship: From Loose Research Institute to Battle-Hardened Team

Q: You studied engineering mechanics as an undergraduate at Tsinghua University. Why did you switch to AI?

Xudong Cao: Engineering mechanics was originally called the Department of Mathematics and Mechanics, and I applied because of my strong interest in math and physics. Later, in my sophomore and junior years, I encountered statistics, learning how to extract knowledge from data. At the time, statistics was just a branch of AI. By my first year of PhD studies, I found my personal ambition shifting from pure physics toward AI. I believed this was where my true passion would lie for decades to come. So I chose to drop out and joined Microsoft Research, where I worked for five or six years before starting my own company.

Q: Was it an easy decision to drop out?

Xudong Cao: Honestly, no. There was internal struggle. At that point, AI hadn't seen any technological breakthroughs. Jobs weren't necessarily easy to find, and there were very few products or industry applications possible. But I found myself feeling fulfilled every day, feeling my inner self growing stronger and my life becoming richer, things others couldn't see. I think this was crucial, and an important reason that sustained my choice.

Photo: Pan Deng

Q: Why did you choose Microsoft Research? How did this experience help with entrepreneurship?

Xudong Cao: At the time I believed working was the better choice, because Microsoft Research combined research with products; it wasn't pure theory. This aligned with my passion. I discovered I'm not someone interested in pure academia or publishing papers. My research style is problem-oriented and product-oriented: I prefer solving real problems and creating real value.

This experience was also hugely helpful for entrepreneurship: how to build products, how to do engineering, how to do R&D with a product orientation, and how to use data-driven approaches to build something that truly solves user problems and can scale.

Q: You came from a technical background. How did you transition into a management role in entrepreneurship?

Xudong Cao: The transformation was cumulative. Take management capability: I don't think mine was very good when I first started, and it's probably still just okay now. You have to step on some landmines, make mistakes, go through pain, reflect, and then grow. It's that kind of process.

But I think original intention matters. The motivation for entrepreneurship comes from whether you genuinely love what you're doing. Not from the promise of fame and success, not from financial returns — simply that you inherently enjoy it, that doing it brings you joy and growth. I think this is important.

Another crucial thing is creating value for customers and users. When making many choices — option A or B, support or oppose — always center on customer value. As long as you persist in this, both individual capabilities and organizational capabilities will develop in this direction.

Q: Since starting the company, what's the biggest mistake you feel you've made?

Xudong Cao: It definitely has to do with people. On strategic direction, we made no major errors. The fact that the company has reached this point today owes much to our industry insight and strategy. The direction wasn't completely correct, but it was broadly correct, without too many detours.

But with people and organization, we stepped on landmines. I think whenever the company achieved good results, it had a lot to do with finding the right people and using them well. Whenever we experienced great pain or suffered significant losses, it also had a lot to do with choosing the wrong people or using them incorrectly.

Q: What lessons did you learn?

Xudong Cao: Be more cautious in hiring and using people. For example, have more interview rounds during recruitment. Take it slow with delegation: grant authority after someone has won battles. Granting too much authority before someone has been validated in the market may look like trust, but too much authority or responsibility actually becomes an excessive challenge and burden that hinders that person's growth. Also, listen widely: heed suggestions and feedback from many people, and judge from multiple dimensions.

Q: In the entrepreneurial process, how do you find certainty amid uncertainty?

Xudong Cao: You must seize certainty to respond to uncertainty.

What is certainty? Centering on customer and user value — this is definitely certainty. When you face many challenges and don't know how to solve them, focus on doing technology well, doing products well, and creating more value for users.

For example, in the short term we didn't know the specific technical path to achieve scale, but we knew we needed data-driven approaches and massive amounts of data — these were certainties beneath the uncertainty. You need to build your system on these certainties. With this system in place, you can then explore different paths. This doesn't guarantee success, but it greatly increases the probability.

Q: How do you build internal consensus through this process?

Xudong Cao: First, it's very hard to reach such consensus internally. You must leave more uncertainty with the executive team, and more certainty with frontline employees. You can't expose all uncertainty to frontline employees; that would be disastrous. The reason executives are executives is that they need to constantly isolate this uncertainty, filter out certainty, and deliver that certainty to frontline employees. This is an important part of the value that executives create.

Q: When did you feel particularly conflicted or lost?

Xudong Cao: I think probably the second half of 2018 through the first half of 2019. During that period, there was indeed a tremendous transformation in the company's cultural orientation, from a relatively loose research institute to a customer-value-centered team that can fight hard battles and win them.

This transformation was enormous, and the resistance in the process was also very great. Compounding this, my management methods at the time weren't mature — I was somewhat forcefully wrenching things from one direction to another, and there were many dissenting voices. Actually, during transformation, even if your direction is right, if your methods aren't mature or are incorrect, there will still be many failures.

When you experience many failures and hear many dissenting voices, you'll definitely feel lost. But I think I was lost only to a degree. Looking back now, even though I was lost, even though I experienced failures and faced much opposition, I still insisted that our organization transform from a loose research institute into a customer-value-centered team that can fight hard battles and win them. This persistence was important.

Q: What did you struggle with most?

Xudong Cao: I think failure itself was okay. What was more difficult was that during the transformation, some people left and some held strong opposing views, and many of these people were ones I had recruited or who had been with the company since its founding. You could feel that these were all very talented people, and yet there were clashes. Some people even left. Internally, it was definitely quite agonizing.

Q: How do you relieve pressure when it's high?

Xudong Cao: Rush to the front lines and get to work. This really works. Whether you're doing research or solving problems at the front lines, on one hand it gives you a deeper understanding of where the anxiety truly stems from, and on the other hand you can solve some concrete problems and build some benchmarks to relieve your anxiety.

At our company, the requirement for all executives is that they must go to the front lines to see and to do. It's not that you have to be at the front lines for everything, but you must identify what the key problems are. You don't necessarily have to solve the problems personally (you can have the team solve them), but you must have an accurate grasp of the key points and of conditions on the front lines.

As an executive, if you don't understand frontline conditions, especially key frontline conditions, you'll be looked down upon at our company.
