AI Entrepreneurship Goes Global: Open Source, Opportunities, and Challenges | 5Y Capital Tavern Vol.23 [Podcast]

5Y Capital · December 3, 2024

How a new generation of AI entrepreneurs can find opportunities and navigate challenges between open source and global influence.

In this episode of 5Y Tavern, two rising AI founders — Luyu Zhang of Dify.AI and Dayong Li of ChatTTS — share their entrepreneurial journeys and industry insights. They discuss how a new generation of AI entrepreneurs can find opportunities in open source and globalization, and how they've navigated the challenges of building their companies. We hope their experiences offer you some inspiration too :)



【Host】

Yaopeng Xing, Vice President at 5Y Capital

【Guests】

Dayong Li, Founder of ChatTTS

Luyu Zhang, Founder of Dify.AI

Yaopeng Xing: It's a great honor to have two rising founders in the AI Infra space with us — Luyu Zhang of Dify.AI and Dayong Li of ChatTTS. Could you both briefly introduce yourselves?

Luyu Zhang: Hi everyone, I'm Luyu from Dify.AI. Dify.AI is an open-source, enterprise-focused global platform for AI application development and operations, which we call LLMOps. We were probably among the earliest teams globally to use this term.

When OpenAI began widely offering GPT model APIs in November 2022, we recognized a new variable. It signaled a new paradigm for AI application development and operations, along with fresh technical challenges. These challenges weren't just for traditional developers, but also for people without technical backgrounds — things like working with RAG, Agents, Tools, and so on. We also saw new opportunities in how applications get defined, and how data is collected from the real world to improve them. So we decided to build this.

We founded the company last March. In just over a year, as an open-source project, we've gained 51,000 stars on GitHub, making us one of the top five open-source projects in China. Our open-source Dify community edition has over one million installations globally. And as a very early-stage company, we've already successfully served more than 30 Fortune 500 companies and reached profitability. So across the past year or so, even among this wave of AI companies, we've been quite fortunate.

Dayong Li: We're ChatTTS, focused on speech synthesis and audio generation. Our first open-source release, the ChatTTS library, gained over 30,000 stars in just one month. We continue to develop speech synthesis technology with higher fidelity, more human-like quality, and greater expressiveness. When we released ChatTTS, we noticed that existing TTS solutions didn't really offer human-like interactive capabilities. So we focused on providing more lifelike, natural elements such as filler words and emotional expression. That's our main direction.

Yaopeng Xing: If you pay attention to many AI applications, you'll often find Dify and ChatTTS behind the scenes. Today they're building the innovation engine for AI applications, accelerating developers' innovation across many fields. But I know there must have been plenty of skepticism in the early days of building and growing your companies. Could you each share what the biggest doubts you faced were? What did investors ask you most often?

Luyu Zhang: The first question was clearly about competition with large companies — especially facing a talent-dense, well-funded company like OpenAI, how could an emerging open-source middleware company avoid getting swallowed up? That was the concern everyone raised.

Of course, at this point we can say we've overcome that, but it wasn't clear at the time. I think it depends on two things. First, whether the founding team and our investors could see asymmetric and non-consensus information. Through extensive contact with enterprises and developers, we found many unmet needs, such as multimodal models, neutrality, and RAG data pipeline orchestration. Second, whether you believe in history. In software design, many historical patterns repeat, especially when a revolution in technology and interaction occurs. What happened historically, such as the evolution of desktop and mobile operating systems, may reappear at this moment. The key is whether you can see that.

The second challenge was about growth. We call Dify's growth and business model PLG 3.0, or PLG plus open source. Under this model, the path runs from product-market fit to massive growth and market coverage, then to a kind of technical monopoly, and finally to monetization. This is actually a second-order or third-order process, which means that at the earliest stage growth doesn't look linear. Understanding this probably requires investors with patience and a capacity for systems thinking. Another challenge is the current geopolitical climate, but I see that as a minor hurdle, not a reason to stop.

Dayong Li: For us, the initial question was whether ChatTTS as a foundational capability would be replaced by larger companies. For example, when big tech builds interactive products, they'll certainly need TTS capabilities. Many investors asked us: if a large company decides to do TTS themselves, what would you do? We initially thought we had some technical lead, but that advantage would quickly narrow once big tech poured in massive resources.

But now we have a better answer. Large companies tend to focus on natural interaction and text-based information delivery, while emotional interaction and stronger expressive needs are often overlooked. Big tech probably won't prioritize or address the demands of more discerning creators. If you haven't actually worked in audio, or don't have a professional team with experience in voice acting or sound effects, you might not even recognize these needs. If big tech entered directly, they'd need to invest much more time and effort to understand users' real needs. So we've spent considerable effort finding the right talent and talking to gaming industry professionals to understand their most urgent needs. We want to invest more in these areas and spend more time serving them.

Yaopeng Xing: What Luyu just shared was very enlightening. I'd like to follow up with you, Luyu. You mentioned information asymmetry. When a breakthrough AI technology emerges, how do you discover and identify these disruptive signals, and how did they drive you to start this company?

Luyu Zhang: I've always believed entrepreneurial opportunities come from asymmetric information. Asymmetric information is hard to obtain in the public domain, especially from a top-down macro perspective, but it's everywhere in micro domains. This information may exist in your own mind, among people around you, and in those you serve. Personally, I think the first question in entrepreneurship is figuring out which group of people you're serving — who are your target users — then immersing yourself among them and constantly communicating. That's the best way to gain asymmetric information.

Yaopeng Xing: Both of your business models are based on open-source foundational models. As we know, open source has formed a very powerful ecosystem network in Silicon Valley over decades. Compared to American open-source companies, what opportunities and challenges do Chinese companies face in doing open source?

Dayong Li: Much of today's model validation is essentially a centralized task, meaning you need to collect large amounts of data and validate and analyze it on a specific platform. This differs from traditional open-source communities where validation happened in every small component.

So my understanding of some open-source approaches, like LLaMA, is that they open-source their foundational models to attract feedback from many independent developers. These developers might experiment with small amounts of data or data from small companies — this is actually a great opportunity to discover market needs.

At the same time, previous experiments show that models have emergent capabilities. When many tasks are concentrated in one model, that model becomes more powerful, and the gain is more than simple linear stacking. So I personally believe that, going forward, open-source model training will work like this: companies will first release a foundational model for users to try at small scale. Then, as the company that owns the model, you can observe where people are investing the most effort, integrate similar data, and retrain a more powerful model. This creates a virtuous cycle and better serves user needs.

Luyu Zhang: Open source is a very important strategic decision for Dify. It helps us solve several problems. First, it facilitates smooth global expansion and rapid market share growth. Second, open source makes users feel we're trustworthy. When facing many enterprises, we can establish our credibility without complex proof processes. This has laid good groundwork for Dify's rapid spread in Japan and several other markets today. You may know that Japan is a very traditional market where operating in a trust-based society is relatively complex.

We believe that in the Infra space, most of the world's important open-source products are borderless — they're an international network. I believe if you look at the top ten developer tools or open-source projects globally, in many cases you might not know which country produced them. Of course, Chinese teams are somewhat special, and we've made proper arrangements in our corporate structure.

Regarding China versus US open source, I think there are several advantages to doing open source in China. First, our user base is very large. Look beyond Dify at LangChain, for example: it's an American company, yet Chinese users account for 40% of its user base. That's a very high proportion, and it reflects that China is at the global forefront in generative AI application development. That's China's user-base advantage.

Second, from what I've seen, Chinese enterprises are relatively pragmatic in AI R&D budgeting and production deployment, and they are willing to invest money and talent. Meanwhile, China itself has its own models and upstream and downstream ecosystems. In terms of application deployment, Chinese teams overall move faster.

We certainly have some disadvantages. For example, in building upstream-downstream partnerships with North American and local traditional open-source companies, the information friction and resistance we face may be somewhat greater. But I think as our own team gets stronger, this shouldn't be too hard to overcome.

Yaopeng Xing: Luyu shared a lot about his practices across global markets, so I'll follow up with Dayong: What positive commercial and product feedback has your open-source community brought you?

Dayong Li: From a commercial perspective, after we open-sourced, many enterprises proactively reached out to us. This serves as a form of marketing, and it also helped us discover some business opportunities we hadn't previously noticed. To give a simple example, customer service companies might need our TTS to be highly accurate, even if that means sacrificing some expressiveness in certain cases. We can then do additional training and adjustment of the model for their needs, better serving these B2B enterprises.

Luyu Zhang: I believe open source can accelerate our becoming some kind of global standard — that's our highest goal. In practical terms, we've gained massive user feedback. We have over 600 contributors globally, with twenty to thirty business leads coming in daily from different companies across various countries. For example, in Dubai we have almost zero customer acquisition costs — that's hard to imagine in traditional B2B or B2C businesses. Zero CAC means your gross margins can be very high. Your team doesn't need a large sales force; you just need to clearly articulate your product's value proposition to customers.

Yaopeng Xing: Luyu, many people know you're a young serial entrepreneur, and they're curious about your new company and new journey. What have been the important growth moments and transformations in your entrepreneurial experience?

Luyu Zhang: Yes, I've been involved in many startups. Dify is my second time leading a company myself. I think the most important thing is bravely taking the first step, which is also the hardest part. Taking that step doesn't just mean starting a company; it also means setting a high standard from day one: building a global enterprise. Second, an important transformation for me is that I must pass my own capabilities on to the team, so that everyone becomes a super-individual. One super-individual alone is useless; we need to build a relatively decentralized organization. I've been reading a book recently called Reinventing Organizations, which proposes a concept called the "teal organization" that roughly captures this idea. I'm very focused on whether such an organization can maintain its current innovative capacity after achieving scale and globalization.

Yaopeng Xing: Dayong, this is your first time founding a company. Could you share your growth takeaways from this experience?

Dayong Li: Yes, I was previously an algorithm engineer, so this is my first startup. In this process, there are at least a few experiences that left deep impressions. The first is team building. We found that when building a team, finding people passionate about audio matters more than finding people with impressive titles. These passionate people can identify where big tech falls short, or realize that big tech hasn't given them opportunities to build these things. Big tech may need your models to be more stable, with zero errors allowed, whereas when starting up, we don't face such constraints.

Second, I think collaboration is very important. Currently in the speech model space, we haven't yet seen an open-sourced large model with broad application. On one hand we're developing our own models; on the other we're doing enterprise deployments. Without reference cases, we collaborate with some image processing companies and former NLP companies — they have better data, share some already-parsed data with us, and we empower them with our technical capabilities. Through this collaboration, we can all move faster.

5Y Capital seeks, supports, and inspires lone entrepreneurs, offering support that ranges from the spiritual to every aspect of operations. We believe that if the "crazy" you in others' eyes starts to be believed in, the world will become a different place.

BEIJING · SHANGHAI · SHENZHEN · HONG KONG

WWW.5YCAP.COM