A Conversation with Pollo AI's Chenbiao Zhu: A "Grassroots" Founder Without Big Tech or Overseas Pedigree
"The game has entered the second half," said Abiao. He wants Pollo AI to become the CapCut of the AI era.
This article is republished with authorization from LatePost (ID: postlate); author: Shen Yuan, editor: Cheng Manqi.
"Through pure insight into opportunities and rapid execution, they built a top-tier product." That's how Yuan Liu, partner at ZhenFund, describes Pollo AI. The original Pollo AI only offered API access to Keling AI's video generation model during its closed beta. Soon, it evolved into a POE-style platform aggregating nearly every publicly available model, providing users with image generation, video generation, and avatar creation services.
As an aggregation platform without self-developed models, Pollo AI's early growth was impressive — within just seven months of launch, its monthly active users exceeded 4 million. In Chenbiao Zhu's own words, this came from reviewing hundreds of overseas products and paying millions in tuition.
Competition in the video generation space is intensifying. At the end of September, OpenAI launched a new model alongside the Sora mobile app, a consumer-facing application that lets ordinary users quickly create and share AI videos.
Will something like Sora become the future of generative video products — an "AI TikTok"? Zhu doesn't think so. In his view, professional creative workflows and distinctive user mindshare matter more. SEO-driven growth has already plateaued, and Zhu needs to find Pollo AI's next growth curve.

The SEO Ceiling Has Been Reached
Q: As a generative video model aggregation platform, Pollo AI grew very quickly early on — over 4 million MAU in 7 months. Some say Pollo AI's SEO strategy was impressive. Was that your main growth driver?
Chenbiao Zhu: It was one of the reasons. To summarize Pollo AI's early rapid growth: product plus marketing, plus good timing. Of course, we can start with SEO. I come from an SEO background. Before 2023, every product I worked on relied primarily on SEO, so this was familiar territory.
Q: Why didn't your previous products achieve this kind of scale?
Chenbiao Zhu: We've built over twenty products. Some were fairly successful, others failed completely. But SEO has its own ceiling. When algorithm rules change, traffic can be cut in half or drop to zero — that's very unfavorable for a business. We tried many ways to break free from the constraint of relying on SEO as a single traffic channel, but none succeeded before Pollo AI.
Q: Let's finish talking about SEO first. From when you started doing SEO at Wondershare until now, how has your SEO methodology evolved?
Chenbiao Zhu: I think one major evolution has been integrating SEO into the product itself — that's a fairly significant shift.
There aren't many people in the market who manage both marketing (including SEO) and product, so in many products the two are disconnected. Since I do both, I can combine them organically. Building growth into the product is a well-known approach, and it prompts me to think by analogy: beyond SEO, what other ways can we embed growth into the product?
Another evolution is broader SEO thinking — for example, you should optimize and do content marketing around a specific audience rather than around a specific feature. If a product's target users are students, then theoretically all the information and content that student demographic needs should have corresponding landing pages.
Q: You've mentioned in other interviews that you created dedicated landing pages for every niche need the product addresses.
Chenbiao Zhu: Yes. But overall, the SEO methodology is largely similar. Here I have to thank my former employer, Wondershare, which really is the Whampoa Military Academy of overseas marketing. Wondershare has a complete methodology. It treats content creation, backlink building, influencer marketing, even URL structure as parts of a complete closed loop. It teaches you to fit these together like puzzle pieces and see the full picture of marketing. You don't see this elsewhere.
Q: What's the difference between building AI products and the overseas-market software you built before?
Chenbiao Zhu: I don't think there's any fundamental difference. Especially for the application layer, it's basically just a different engine underneath, an additional API. The biggest difference is that you need to assume the underlying engine will keep evolving, and avoid building things that model advances might make obsolete.
For example, when GPT-4o generates product model images, character consistency still isn't great. We might choose to tweak prompts, using various constraints to ensure consistency. But then Nano Banana comes out, and everything you did was for nothing.
Q: After raising funding, will anything change about Pollo AI's SEO strategy?
Chenbiao Zhu: No. SEO is just one part of the product. Using SEO to drive growth has its bottlenecks. Generally, after about a year, there's relatively less you can do. It's hard for SEO to help us double growth anymore. Going forward, the most important things are product refinement and establishing a distinctive user mindshare.
I think product-driven growth will become more important. I'm now focusing more energy on product and talent recruitment rather than SEO. I desperately need exceptional people to join us — in marketing, product, and technology, we need them all. I'm gradually understanding why Lei Jun said he previously spent 80% of his energy on recruiting, and why Liu Bei visited Zhuge Liang's thatched cottage three times. Recruiting exceptional talent is crucial, and it's not easy.

Lessons from 20 Overseas Products: Timing and Direction Matter Most
Q: Having built 20-plus products, what's your biggest lesson?
Chenbiao Zhu: Having built 20-plus products means I've reviewed at least 500 overseas products. When you've seen enough products, you naturally know what good products look like and what foreigners like. You also know what products and directions are worth pursuing and which aren't.
First, I think timing and product selection are extremely important — knowing what product to build when. There's a saying in e-commerce that product selection determines survival or death. AI software has similarities. What you choose to do matters enormously. Product selection is actually a skill that can be developed through practice. As the saying goes, read three hundred Tang poems and you can chant even if you can't compose.
Q: Let's start with this: what makes for good timing and product selection?
Chenbiao Zhu:
- For example, after ChatGPT launched, building a chatbot wrapper was good timing. But doing it a year later was too late.
- After Stable Diffusion launched, building image sites and image communities was good timing. SeaArt and Liblib emerged from that window. But a year later, everything became an uphill battle.
- After AI video APIs or open-source video models became available, building video-related applications was good timing — Pollo AI, for example. Of course, this is hindsight. At the time, it was accidental; I wasn't being prescient.
To abstract this: when a new technological disruption tears open the old commercial landscape, new opportunities emerge.
For example, after GPT-4o launched, you could easily edit images simply by telling the model what you wanted. Creating model images and product images became much simpler, and it achieved what previous technology couldn't. That's an opportunity. Even if GPT-4o doesn't have perfect consistency, we should believe models will iterate and improve. But if at that moment you cling to your original algorithms and technology to do model images, that's the wrong direction.
Q: Then what's poor product selection?
Chenbiao Zhu: For example, building product images, background removal, or poster tools a year ago would have been poor selection. There's no differentiation, and incumbents already occupy significant user mindshare. Users generally won't choose you unless you have massive differentiation or a higher-dimensional product.
I've also noticed many Chinese companies going global can't distinguish between tools and products. Many overseas products are just piles of features — you don't know what core problem they solve. There's no workflow stringing things together, just inorganic arrangements of functions. More specifically: Word to PDF, Video Download, Video Converter — these are tools, straightforward and direct. But Adobe PDF Editor is a product. It contains many plugin tools that help users better accomplish something, like editing a PDF document.
I think we should build more products, not tools.
Q: What time windows did Pollo AI capture? How?
Chenbiao Zhu: When Keling AI's API was still in closed beta, we used our connections to reach them, integrated the API, and launched. The official API launched about a month after our release. So we moved very early.
But if you asked me to build an AI video product like Pollo AI now, it would definitely be too late. Even with my SEO skills, I couldn't achieve that kind of rapid growth.
The window was very short — roughly between last September and December, when open-source video models and APIs had just emerged. They weren't perfect; the technology wasn't mature enough. But you don't need perfection to build the product.
Q: What led you to position Pollo AI as a POE (model aggregation) platform?
Chenbiao Zhu: It was somewhat accidental. The initial idea was simple: I wanted to understand what the "C" in AIGC represents. It could be text, images, video, and music. We built several products in this space. At that particular moment, video generation emerged, and there wasn't even a public API yet. We integrated Keling AI's closed beta API.
After launch, we immediately got coverage because we neither claimed it was our own model nor disclosed it was an API integration. Some American media even thought it was our own model, which shows there was still some first-mover advantage at play.
Later, as more APIs became available, we decided to build a video version of the POE platform. Canva's acquisition of Leonardo.AI inspired us: multiple models plus related tools plus community — that's the Pollo AI you see today. At different stages, the game being played is different. Product form evolves with the game and with changing understanding. The next form of Pollo AI is also incubating.
Q: Have you considered self-developing models?
Chenbiao Zhu: Not for now. Besides, it's not purely about whether I want to — I don't have that kind of money to play that game right now. Even if we did, we couldn't outcompete ByteDance, Google, or OpenAI. Our capabilities have always been at the application layer.

Becoming the AI Version of CapCut: What Would It Take?
Q: From what you're describing, Pollo AI still seems more tool-oriented than a product that enables a complete creative workflow.
Chenbiao Zhu: This is a transitional state. We're in a phase shifting from tool-dominant to one-stop creative workflow.
If Pollo AI remained just a wrapper, that would certainly be dangerous. But at the end of last year and early this year, it was the optimal solution because user pain points were obvious — they didn't want to switch between various models.
Q: What has Pollo AI already done in this shift toward one-stop creative workflow?
Chenbiao Zhu: We've all had this experience: with many AI products, generated images are in one place, generated videos in another, and if you want to add effects or upscale, that's in a third place. This experience is too fragmented. In Pollo AI, everything you generate is in the same feed, the same waterfall flow. You can do secondary creation on the same page — that's a workflow, not simple tool stacking.
This approach actually came from Jimeng, which I consider the most fully productized offering on the market. I have to admit we learned from and imitated Jimeng in this regard, because I simply couldn't find a better solution. I think the product manager or designer who first proposed this interaction scheme was truly a genius.
Q: What has been the product upgrade direction these past few months?
Chenbiao Zhu: We have some ideas, but they may not be fully mature yet. For example, I think Pollo AI should shift to conversational interaction, a more mainstream format, rather than the current form-based interface, which is relatively inflexible. On another level, I think Agent should be an important module.
Going forward, users won't want to select models or keep "drawing cards" (regenerating until a result looks right). They'll want to get the video directly, skipping the selection step. That's when Agent will emerge.
Previously, the game was aggregation and wrapping, and generated videos were small fragments: 5- or 10-second clips. In the next stage, the game will be short films, say, a 30-second video. That's when Agent will be needed. Our thinking and understanding need to keep pace with the times; otherwise, if we fall behind, our previous advantages will be greatly diminished. Of course, we have other innovations and iterations in development. Give us some more time.
Q: Where does this shift ultimately lead? What's Pollo AI's final form?
Chenbiao Zhu: Pollo AI needs three to six months of productization, while also finding a differentiation point from competitors — a unique mindshare.
I hope Pollo AI ultimately becomes the AI version of CapCut, or the video version of Canva.
Q: Is unique mindshare the same thing as the productization upgrades you mentioned?
Chenbiao Zhu: For our product, it's the same thing.
Q: In your view, which AI products currently have unique mindshare?
Chenbiao Zhu: For example, CapCut means editing; Lovart means design. Their product mindshare is very clear. Also Photoroom for product background images. Though I don't think Photoroom is sufficiently successful or perfect. I think they should build an ecosystem for product images — large enough that users can freely choose what kind of product image to create. That would be the most powerful.

Sora Has Low Retention; Growth Through Effects Isn't Sustainable
Q: Speaking of ecosystems, Sora's mobile app seems to want to become an AI version of TikTok. What's your view on social content ecosystems in generative video?
Chenbiao Zhu: Sora App is a very innovative product. OpenAI is a company that understands both models and products. Sora mobile opened a door for social video: moving from tool to social is a fairly symbolic transition. But viral popularity and retention are different things. Clearly, Sora's retention won't be great, because its business model has massive sustainability problems. While the app is free, its daily token cost is roughly $15 million, which is unsustainable. (Note: According to Forbes, Sora's operating costs are approximately $15 million per day.)
Q: From web to mobile, have you observed any traffic trends?
Chenbiao Zhu: Web tends more toward creation; mobile tends more toward entertainment. For video, better interaction still happens on web, because people work on computers — especially overseas.
Mobile is mainly entertainment. And I'm skeptical about retention rates for users acquired through effects. Effects are not a sustainable, long-term driver: this acquisition method requires constantly finding new cohorts of users.
Q: But everyone's doing this, including Pollo AI itself.
Chenbiao Zhu: Yes, we're doing it too. But from a long-term perspective, it's unsustainable, and we're changing. For example, integrating effects into workflows. Effects drive acquisition, then professional creative workflows retain those new users.



