Born Global: Their Opportunities and Challenges | 5Y View x PingCAP x Zilliz

5Y Capital · December 7, 2023

Where Does China's Opportunity in Global Software Lie?

For founders building foundational software, going global isn't just a strategy — it's closer to a calling. In this conversation, two pioneers exploring the global path — Dongxu Huang, co-founder & CTO of PingCAP, and Charles Xie, founder & CEO of Zilliz — discuss their journeys with 5Y Capital partner Kai Liu.

The two founders have stumbled, pivoted, accumulated experience, and learned hard lessons. At the time of this conversation, both were actively expanding their global operations from Silicon Valley. As they emphasize, the key to global strategy is simple: "The core team and founders must be on the ground."

This is an adventure with a grander vision and a harder road. Both PingCAP and Zilliz "decided to build a global company from day one." They share many insights in this discussion, and hopefully some will resonate with you :)

Guests for this conversation:

Dongxu Huang, Co-founder & CTO, PingCAP

Charles Xie, Founder & CEO, Zilliz

Kai Liu, Partner, 5Y Capital

01 From Day One, Deciding to Build a Global Company

Kai Liu: This year, many enterprise software founders have been converging on the topic of globalization. Coincidentally, both of you happen to be expanding in Silicon Valley right now. Dongxu, Charles — please introduce your companies.

Dongxu Huang: I'm the co-founder and CTO of PingCAP. We're a distributed database company, part of the data infrastructure stack, focused primarily on online transaction processing and real-time analytics. In recent years we've been actively expanding into overseas enterprise markets.

Charles Xie: I'm Charles Xie, founder and CEO of Zilliz, a vector database company. Our long-term mission is to build an unstructured data processing platform and contribute to the democratization of AI data infrastructure. We started working on vector databases in 2018 — among the earliest explorers globally. Today Zilliz has 7 million downloads and installations worldwide, with over 10,000 enterprise users.

Kai Liu: Pretty much every company is talking about globalization these days. Is "going overseas" (出海) the same thing as globalization? What differences are there in strategy and execution? Are you two testing the waters, or going all in?

Charles Xie: Going overseas and globalization both ultimately target global markets, but from a company's starting point there are distinctions. In my narrower definition, "going overseas" refers more to companies that have already achieved success in China and are now expanding globally — setting sail, so to speak. On the other hand, an increasing number of new-generation Chinese companies may be founded anywhere in the world, with customer acquisition, product focus, and commercialization paths that are inherently global from the start.

So the starting points differ. For us, from day one — when we were still a tiny team in a small Shanghai office — we decided to build a global company serving global users. Of course, this path has involved tremendous hardship and plenty of mistakes.

Dongxu Huang: My understanding is similar to Charles's. The key is the company's DNA and the founder's self-perception — whether you decide from day one that this will be a global company.

I've actually never liked the term "going overseas." For enterprise software or foundational software companies, the market, product, and team management approaches are completely different in each country. "Going overseas" implies leaving from one place while keeping your roots in China. We often see well-known large companies whose overseas operations feel awkward. I once visited one company's overseas R&D center — the moment I walked in, except for the receptionist, everyone was speaking Chinese. It felt no different from being in China. That mindset makes it hard to do overseas business well. If you truly want to globalize, it's essentially about localization abroad — you need to become a local company there.

02 Don't Hesitate, Be Braver

Kai Liu: Charles mentioned that globalization has cost everyone significant tuition. Where have you paid tuition, and have you found a playbook yet? Who do you see as the benchmark for Chinese software globalization, and what can be learned from them?

Dongxu Huang: The first tuition payment, I think, was staying in China too long instead of getting out earlier to build teams on the ground and understand local markets firsthand. PingCAP was founded 8 years ago. Though we aspired to be global from day one, we only ramped up overseas investment after our product had matured and we'd gained some commercial traction domestically, so the subsequent transition was quite painful.

When you want to go global but, as a founder, you're still physically based in China, your energy and resources will naturally flow toward what's familiar around you. Unless you shift this mindset, your product investment and talent structure will end up very different from what a global company needs, and you'll waste considerable time.

I don't have a playbook, but there's a good metric. For enterprise service companies operating overseas, especially in infrastructure like ours, there's an important metric called the Rule of 40. Strong performance there means you're on track. As for reference companies, we used to learn from Huawei frequently — among Chinese companies overseas, Huawei's approach feels genuinely international: boldly hiring locally, following local rules of the game, while integrating the best aspects of Chinese collectivism.

Now my mindset is: I'm a local. The best practices to study are your local peers — see how these companies operate. We're learning every day.
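For readers unfamiliar with the metric, the Rule of 40 that Dongxu mentions is usually stated as: revenue growth rate plus profit margin, both in percent, should reach at least 40. The Python sketch below is only an illustration. The function names and sample figures are assumptions, not PingCAP data, and the choice of margin (for example EBITDA or free cash flow) varies by analyst.

```python
def rule_of_40_score(revenue_growth_pct: float, profit_margin_pct: float) -> float:
    """Sum of year-over-year revenue growth and profit margin, both in percent."""
    return revenue_growth_pct + profit_margin_pct


def passes_rule_of_40(revenue_growth_pct: float, profit_margin_pct: float) -> bool:
    """True when growth plus margin reaches the conventional 40-point threshold."""
    return rule_of_40_score(revenue_growth_pct, profit_margin_pct) >= 40.0


# Hypothetical companies: (revenue growth %, profit margin %)
fast_grower = (60.0, -15.0)   # growing quickly while still losing money
slow_and_thin = (20.0, 10.0)  # profitable, but growth and margin are both modest

print(passes_rule_of_40(*fast_grower))    # score is 45
print(passes_rule_of_40(*slow_and_thin))  # score is 30
```

The metric trades growth against profitability: a company burning cash can still score well if it grows fast enough, which matches the growth-first, commercialization-later staging described in this conversation.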

Charles Xie: I believe Chinese enterprise software globalization has only just begun. Everyone is still exploring; there may not yet be a company that has completed a full cycle. In consumer software, I'd define TikTok as a "going overseas" company. In globalization, Webull has done relatively well — it was targeting overseas markets from day one.

I think we're all still stumbling through the exploration phase without a playbook, but there is a "book of don'ts." As Dongxu said, first: go all in. The core team and founders must devote most of their energy and time to the primary market. If you yourself aren't in that market, you won't have the right intuition, and the entire team's read on the market and customers will likely be off.

Second, make a clear choice about where your globalization focus lies — the faster this choice, the better, and once made, don't hesitate. Looking back, if we'd made our globalization choices more boldly, our current position would likely be stronger.

Also, respect the way things objectively develop. When I returned from the US to start my company in 2017, we made basically every mistake imaginable, from company formation to hiring to establishing systems and rules. With these lessons, we hoped to make fewer errors entering the US market. But I found that despite having spent over 10 years in the US previously, when it came to actually building a company there, we repeated all the mistakes we had made in China. For example, the criteria for judging whether a sales or marketing person fits your needs are completely different in China and the US. So don't fear mistakes, don't hesitate, charge to the front. Some mistakes you will inevitably have to make again in each new market; it's unavoidable.

Kai Liu: There's one crucial point that closely resembles early-stage investing: finding conviction. Both of you mentioned wanting to build a global company from day one. How was this conviction continuously built? Having gone through a period of globalization, how do you build the team and company, and how do you get employees to understand this conviction?

Dongxu Huang: Building foundational software is extremely slow work. From writing the first line of code to having a product you can actually sell, several years may pass — this is true whether in China or the US. So you need to focus on different things at different stages.

I tend to look at data. Early on, our targets centered on engineers and the community ecosystem. The logic is straightforward: our ultimate users aren't CEOs or CIOs, but developers and engineers. So we capture engineers' needs and gradually combine top-down and bottom-up approaches to build a foundation for scale. Only with this foundation can you discuss business models and everything that follows.

So early on, growth matters more. Mid-to-late stage, you must focus on how to commercialize. Observing the North American developer tools space over the past two years, my sense is that many pre-Series B companies have been extremely successful. Everyone seems to have discovered the key to developer tools: delight developers. Their growth is genuinely impressive.

PingCAP has actually moved further ahead. Now we need to figure out profitability, so we're facing a transition more painful than people might imagine. I believe these currently very cool companies will face the same thing when they need to commercialize — pursuing profit and customers remains highly challenging. However, the depth of customers and market in the US is also tremendous.

Charles Xie: As I mentioned, our company had to be global from day one. The reason is simple: I asked our team members if they wanted to be the global leader in this space. A commercial company aspiring to lead its field must capture the largest market share. Look at the financial reports and industry analyses of several global foundational software leaders and the picture is clear: this sector generates tens of billions of dollars a year worldwide, while China accounts for less than 5%. North America, the largest market, takes 35-40%. Europe takes roughly 15-20% or more, Japan about 10-15%, and other markets combined perhaps another 10-15%. So even if you become #1 in China with 40% market share, 40% × 5% equals just 2% globally.

Working backward from the end: if you want to build a global leader, globalization is a necessary condition. But not sufficient — there's still luck and external factors, the right timing, place, and people. Starting vector databases five years earlier might have made you a martyr, and your global dream would have collapsed. But the more necessary conditions we can satisfy, the closer we likely get to success.
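Charles's market-share arithmetic can be written out as a short calculation. The sketch below uses the regional weights he quotes (under 5% for China, 35-40% for North America). The function name and the 40% North American share are hypothetical, chosen only to show the contrast.

```python
def global_share_pct(regional_share_pct: float, region_weight_pct: float) -> float:
    """Share of the worldwide market implied by a share of one region.

    regional_share_pct: the company's market share inside the region, in percent.
    region_weight_pct:  the region's share of the worldwide market, in percent.
    """
    return regional_share_pct * region_weight_pct / 100


# Figures from the conversation: #1 in China with 40% share, China at 5% of the world.
china_leader = global_share_pct(40, 5)

# Hypothetical contrast: the same 40% share in North America, using the low end (35%).
north_america_leader = global_share_pct(40, 35)

print(f"40% of China         -> {china_leader:.0f}% of the global market")
print(f"40% of North America -> {north_america_leader:.0f}% of the global market")
```

Under these assumptions, the same regional dominance is worth seven times more global share in North America than in China, which is the gap behind the "necessary condition" argument.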

Kai Liu: Excellently put. Find fields where you can strike oil, rather than endlessly toiling in barren fields. The end-in-mind strategic approach is crucial.

The current wave of AI has attracted massive attention in both Silicon Valley and China. Is AI an accelerant or a destructive challenge for you? Facing both globalization and AI transformation simultaneously, how should these dual challenges be addressed?

Dongxu Huang: My judgment is this: AI will have a hard time disrupting databases. AI and the large language models we use now are fundamentally still games of data. However data is stored and processed, databases will always exist — so there's some job security there.

I think no company should assume this doesn't concern them; it absolutely does, massively. The key is how to use this technology to arm yourself, as a means to improve your product. Over the past half year or so, we've spun up a dedicated team using Gen AI capabilities to improve our algorithms and products.

There's also an interesting phenomenon: in Silicon Valley right now, if you don't present yourself as AI-related, people may not even be interested in talking to you. However, while everyone seems to be discussing AI, looking at what vendors actually display at trade shows and the topics they discuss, most remain focused on go-to-market and product — not much about AI, let alone complete transformation into AI companies. But AI is an excellent tool, and you need to prepare for the future.

Charles Xie: This wave of AIGC and large language models has brought vector databases directly into the view of millions of developers, and we're direct beneficiaries. Previously we were basically only in the toolbox of the top 1% of AI engineers globally. But over the past year, the largest population of engineers (application engineers, web developers, mobile app developers, and product managers) has learned that building LLM applications may require vector databases. In the six months since commercializing, we've seen user demand grow rapidly. We set a sales target at the beginning of the year; three months later, we boldly doubled it. So the AIGC explosion has been favorable for us. But long-term, I believe the AI data infrastructure space still faces significant challenges.

The biggest challenge is that over the past half year, many developers encountering vector databases may have only used them superficially without truly understanding what vector databases mean. Long-term, they don't yet understand the value of vector databases — one could even say they severely underestimate it.

In my view, whether ChatGPT or other large models, the essence is combining countless small models from the natural language processing field into one super-large model that exhibits emergent intelligence. Using one large model to more efficiently solve complex problems that previously required dozens of small models — essentially, this provides a unified text and semantic representation in natural language processing.

In AI, there's not just text and NLP, but also computer vision, autonomous driving, and more. Additionally, increasingly many recommendation systems are driven by deep learning and AI; synthetic biology and AI drug discovery also have many problems requiring AI algorithms. I believe in coming years, these fields will produce more intelligent, complex, and capable foundation models. Since Zilliz launched its vector database in 2018, there have been countless applications in computer vision, NLP, recommendation systems, risk control, AI drug discovery, and other fields. If foundation models emerge across these domains, vector data, vector semantic representation, and unstructured data processing as a whole can reach a new level.
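For readers less familiar with the term, the core operation of a vector database is finding the stored vectors closest to a query vector. The sketch below is a brute-force illustration in plain Python. It is not the Milvus API, the embeddings are made-up toy values, and production systems typically rely on approximate indexes rather than scanning every vector.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query: list[float], items: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the keys of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in items.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [key for _, key in scored[:k]]


# Toy 3-dimensional "embeddings" (assumed values; real ones come from a model
# and usually have hundreds or thousands of dimensions).
EMBEDDINGS = {
    "cat photo": [0.9, 0.1, 0.0],
    "dog photo": [0.8, 0.3, 0.1],
    "invoice scan": [0.0, 0.2, 0.9],
}

print(top_k([1.0, 0.0, 0.0], EMBEDDINGS, k=2))
```

Scanning every vector costs time proportional to the collection size, which is why the scalability concerns raised later in the conversation become central once collections reach billions of vectors.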

Dongxu Huang: I have a question for Charles. From a technical standpoint, vector database companies have captured a wave of attention, but you face two choices. First, go deeper to solve more fundamental problems. For instance, scalability at larger data scales — ultimately still a database problem: storing affordably, accessing cost-effectively, running fast. The second path goes upward, toward applications and becoming more of a solution. Which direction do you think vector database companies will take?

Charles Xie: Dongxu raises a fundamental question in the database field. We started building vector databases in 2018, open-sourced in 2019, and by 2020 had nearly 1,000 users globally with a thriving community ecosystem. To quickly validate the market, we built a relatively thin, lightweight vector database — Milvus 1.0. It was small but beautiful, with basic functionality and less code, yet already attracted many users.

But in the second half of 2020, through continuous refinement, we discovered a pattern: users' unstructured vector data was growing extremely fast. That is because 80% of the world's data is unstructured: images, video, audio, and so on. With AI, this data can be mined for value, and data volume surged dramatically. So at the end of 2020, we made a crucial decision: rebuild our already successful and popular Milvus 1.0 vector database product from scratch. We started from the first line of code, determined to build a production-grade database system, one hardened like a martial artist trained through the Shaolin Yijin Jing.

We jumped into this pit, and it took two years to climb out. In retrospect it was a brilliant decision, but at the time many disagreed, and there was considerable internal skepticism. Version 1.0 had been in development for a year and a half, user numbers kept growing, feedback was positive, and then the CEO says we need to rebuild the product in a different programming language. Everyone found it hard to understand. The decision was genuinely difficult, and we went on to spend over two years stabilizing 2.0. Now nearly 80% of our 2.0 users are from overseas, and data scale can expand from hundreds of millions to tens of billions of vectors. So when building database products, never be satisfied with tens of thousands of lines of code. In my view, without millions of lines of code, you can't call it a database system; call it a demo. This is a necessary condition. Today, looking at the entire market, we're cloud-native, fully distributed, and horizontally scalable, delivering higher performance and better cost-effectiveness in the cloud.

Returning to Dongxu's question: starting in the first half of 2023, we also saw that this wave of AIGC technology provides more accessible tools for emerging AI developers. Many developers come from product management, mobile development, or web development backgrounds — they may not know much about databases, have no database operations experience, let alone vector database experience. So we're also moving closer to application developers, providing them with better APIs.

Kai Liu: Thank you both for sharing, and thank you for breaking trail for Chinese enterprise software going global. I hope you'll return to share more experiences and lessons.

5Y Capital seeks out, supports, and inspires lonely entrepreneurs, backing them with everything from moral support to day-to-day operational help. We believe that once others begin to believe in the person they used to call crazy, the world becomes refreshingly different.

BEIJING · SHANGHAI · SHENZHEN · HONG KONG

WWW.5YCAP.COM