Opening the "Black Box," Building In-House Models, and a Chat About AI Entrepreneurship and Creation | 5Y Tavern Vol. 22 with Xingyuan Yuan of ColorfulClouds Technology
How can we get AI to create works on par with *The Three-Body Problem*?

For this episode of 5Y Capital's "Tavern," we invited Xingyuan Yuan, founder and CEO of ColorfulClouds Technology. The company has three products — ColorfulClouds Weather, ColorfulClouds Translate, and ColorfulClouds Xiaomeng — all AI-native apps with tens of millions of users.
Two years ago, Xingyuan shared his entrepreneurial journey on this show. Since then, ColorfulClouds has made considerable strides in using AI to improve people's lives, and we've seen many new developments with Xiaomeng. Xingyuan also shared some of ColorfulClouds' explorations in AI — we hope you find it inspiring :)

[Guest]
Xingyuan Yuan — Founder and CEO, ColorfulClouds Technology
[What You'll Hear]
00:55 Starting with computer vision technology, developing China's first high-precision weather forecasting software
03:04 Investing in foundational model architecture research, developing DCFormer — a model outperforming Transformer
04:40 After ChatGPT's emergence, how AI entrepreneurs responded, and the two directions ColorfulClouds pursued
10:40 Going all-in on model architecture research and development — what bottlenecks and difficulties they encountered
19:23 Choosing the harder path, and hearing plenty of skepticism and opposition along the way
20:45 ColorfulClouds' advantage lies in algorithms and innovation — going all the way on that strength
23:04 The practical challenges faced by idealistic entrepreneurs
31:01 AI has unlocked many new capabilities, but retaining users is the critical question
34:06 Is AI-powered story creation a universal need?
36:45 The goal of having AI create works on par with human masterpieces like The Three-Body Problem — how far away is it?
[Excerpts from the podcast]
Xingyuan Yuan: Hi everyone, I'm Xingyuan Yuan. We were working on AI products well before this current wave. Computer vision was dominant back then, so I found an entry point — using computer vision to identify weather satellite and radar images, enabling minute-by-minute, kilometer-scale high-precision forecasting. We were the first to launch high-precision weather forecasting software in China.
In 2017, we were quick to recognize that the main battlefield had shifted from computer vision to text understanding. Text is highly compressed data, and as computing power improved, increasingly complex data could be decoded. The computer science community began working on NLP, so we moved there too and built ColorfulClouds Translate, a machine translation product with millions of monthly active users.
After working on translation, we noticed much of the demand came from people reading novels. I thought, if demand is this strong, why not explore automated novel creation? In 2021, we launched ColorfulClouds Xiaomeng, along with a natural language programming methodology that we patented. By then, we already recognized that NLP technology could be applied to all sorts of things, essentially the same line of thinking that later became ChatGPT.
The last time I was on 5Y Tavern, we had just released Xiaomeng 2.0, a voice conversation scenario with some role-playing content ("Consistent Effort, an Unslackened Life" | 5Y Tavern x Xingyuan Yuan, ColorfulClouds Technology). Later, with products like ChatGPT emerging, we asked ourselves: in an era where everyone is doing AI, how can we still differentiate ourselves? We identified two directions. One was foundational research on model architecture, to build something more performant than Transformer. We eventually developed DCFormer, which achieves double Transformer's performance. This earned us an invitation to speak at an international academic conference in Vienna this year — one of only two Chinese companies invited.
The other direction: we recently released a new model that maximizes large model effectiveness through deep thinking and agent-based workflows, enabling more impressive capabilities. OpenAI applied this to math and coding, where iteration is relatively straightforward. We're applying it to novel creation, using this methodology to make story writing better.
5Y Tavern: You've previously mentioned that after this wave hit, the industry split into three paths. Ordinary youth chose to stack compute and data. Artistic youth chose to build agent workflows, retrieval augmentation, and prompt engineering, treating Transformer as a black box without needing to understand its internals. Then there was the "foolhardy youth" path: opening the black box and studying the internal structure of the Transformer building block. You ultimately chose the third path. What considerations went into that? What pressures and difficulties did you face?
Xingyuan Yuan: The pressure on the company was immense at that time. Originally, only a handful of companies were in this space. We had briefly hit a million downloads in a week, thirty million daily dialogues, and hundreds of millions of views on Bilibili. User-generated content was incredibly vibrant — it felt hugely rewarding. But then everyone piled in. It was like we'd been exploring a narrow mountain trail, only to find an army charging through and leaving us far behind.
We agonized over whether to join the crowd and compete on the same track. In the end, I suppose it was pride: I felt that competing on data wouldn't play to our strengths. We're an algorithm company at our core. If we abandoned our core algorithm work to focus on product optimization or chasing data volume, we would be doing things we aren't good at. I believe what matters is our pursuit of intelligence itself and our deep understanding of how fundamentally hard this problem is. Multimodal technology is relatively achievable, but improving intelligence is far harder. The key is how to maximize intelligence per unit of compute. Furthermore, once intelligence improves, how to effectively stack intelligence across different stages is another core question. Only by achieving this, getting more intelligence from the same electrical input, can you truly succeed in the market.
In story creation, you'll find that relying solely on product polish is a difficult path. Ultimately, it comes down to competing on intelligence. And many products' intelligence levels are genuinely unsatisfying, so user retention is extremely low. Everyone is just trying it out. Even if a hundred million people try it, with 0.1% retention, you're left with just a hundred thousand. Then another filter the following year, and you may have no users left. That is to say, despite all your marketing, having a hundred million users was just a bubble. More and more people are seeing this now, so I believe we need to focus on internal strength — a reflection on our previous pursuit of flashy tricks.
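The retention arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `retained_users` is hypothetical, and applying the same 0.1% rate to the second year's "filter" is an assumption, since the interview gives no figure for it.

```python
def retained_users(trial_users: int, retention_rate: float, periods: int = 1) -> int:
    """Return the users left after applying the same retention rate `periods` times."""
    remaining = float(trial_users)
    for _ in range(periods):
        remaining *= retention_rate
    return round(remaining)


# Figures from the interview: 100 million trial users, 0.1% retention.
after_first_filter = retained_users(100_000_000, 0.001)

# Assumption: the following year's filter applies the same 0.1% rate.
after_second_filter = retained_users(100_000_000, 0.001, periods=2)

print(after_first_filter)
print(after_second_filter)
```

Under that assumption, the first filter leaves a hundred thousand users and the second leaves about a hundred, which is the "no users left" outcome in practical terms.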
5Y Tavern: From when you decided to go all-in on model research and development to achieving truly critical results, how long did it take? Were there particularly difficult or bottleneck moments?
Xingyuan Yuan: We started NLP research very early, dating back to 2016. But the conversational path wasn't viable then. We had tried search-based Q&A, but the end product was still translation. In 2017, we developed ColorfulClouds Translate. The Transformer model didn't exist yet, so we'd always been developing our own models.
Around 2017, Transformer emerged, and "Attention Is All You Need" was published that fall. We had expected a new model roughly every year, so we could simply follow the trend. But from 2017 to 2023, nothing better than Transformer appeared. Google published papers claiming that other models' results were essentially noise and that Transformer still performed best. Even so-called effective improvements only achieved 20-30% gains, which was already considered excellent. It seemed to have become like the 100-meter sprint world record, stuck at the 10-second barrier. Records had kept falling, but after 2017, progress seemed to stall. By 2019, two years on, the record still stood. We began wondering if this model was truly that exceptional. So we launched a research program on model interpretability, exploring the internal mechanics of how it worked and why.
You may now see articles explaining how Transformer works. But when you dig into each question — how it copies and learns patterns, what happens with each connection at each layer, each inference step — you'll find some seemingly simple questions are actually unexplainable. So from 2019 to 2022, we worked on model interpretability while developing Xiaomeng's large model. As research deepened, we found potential areas for improvement.
By 2023, with Transformer established, we faced a choice: follow the trend and build a version, or dismantle the large model's core and continue researching to build something better. We chose to open the black box, fundamentally out of pride. But in practice, we found that while Xiaomeng had succeeded, that model was small — perhaps hundreds of millions or a billion parameters. At larger scales of tens or hundreds of billions, this approach might not work. When scaled, it slowed down. The reason for slower speed: you're using a non-standard model, and much hardware acceleration isn't customized for you. Transformer had been around so long that it enjoyed extensive hardware acceleration support. Did we need to build our own hardware acceleration too? At this point, you realize this isn't merely a research problem but a pure engineering one. Without solving this engineering problem, you can't prove your performance is better — so we had to do it. Along the way, you question whether your good ideas are actually feasible in real-world scenarios. There are practical issues too, like securing enough servers for high compute, and so on.
But the hardest part is the emotional rollercoaster — doubting whether the entire approach is correct. You see others building so many models while you've produced nothing. It's like being a top student who suddenly ranks last in class. You say, don't panic, I'm working on something different that I believe will work, but you can't prove it. Even today, you still can't. But eventually, we found that at hundred-billion parameter scale, our model performed better than at ten-billion scale, and performance gains increased with parameter size. You realize this can actually succeed. That earned us the ICML presentation opportunity, high scores from the academic community, and some genuine recognition.
But even with this recognition, you're still an unknown company in this race. What people want isn't just a good model architecture. They want you to release an actually usable product that they can try and genuinely find good. The training efficiency and intelligence levels that come from model architecture are things no one cares about. Our goal isn't to build the world's best usable large model, but to create the best model architecture, so our products differ from mainstream ones. You still feel a kind of loneliness, but it's fine. You just have to rely on conviction and keep going. There are highlight moments and difficult moments. That's our journey.
5Y Tavern: You chose the harder path. Along the way, did you encounter opposition?
Xingyuan Yuan: Constantly. Internally, team members questioned why we were developing large models when we could just use others'. Meta released LLaMA 3.1, 3.2, with over 500 billion parameters — why were we spending so much money, was it really meaningful? Some even suggested we abandon R&D entirely and become an AI product company, using others' algorithms directly. There were many other voices of incomprehension too — why not release a GPT-3.5 equivalent? You've invested so much time without producing something similar, is it a capability issue? These doubts were always present.
5Y Tavern: But you didn't change your thinking because of these doubts. ColorfulClouds' three products themselves accumulated many users and created genuine value, yet you remained committed to very foundational research, very firm on this path.
Xingyuan Yuan: I think you need to understand what kind of person you are, amplify your strengths, and make your weaknesses less important. My strength, for example, is algorithms and innovation — our entire company excels at this. Our products have high intelligence levels, with relatively simple products and business models. So I just need to do the technology well, put it on the app store, charge monthly — simplifying the marketing and business model aspects I'm less good at, while maximizing my research capabilities.
We focus on going all-out on one path, becoming more intelligent than others, which ultimately amplifies your capabilities while giving you life experience. You get to do what you love while receiving societal recognition — that's a happy state. In that situation, there's no need to worry about what others think, and ultimately you can achieve the greatest success. Of course, we haven't yet, but I'm deeply convinced — if we can provide the highest quality product in a given domain, we'll win users there, and keep snowballing. That's a good strategy.
5Y Tavern: Being able to invest significant time and energy into something you truly love and believe in is a blessing, but we imagine it's also very difficult. For ColorfulClouds, what do you see as the major practical challenges at this stage?
Xingyuan Yuan: There are many practical difficulties. In a fiercely competitive environment, talent you've carefully cultivated may be poached by other companies. Fundraising is another. The chip issue hangs over us like the sword of Damocles: if we lose access to compute, the company could collapse. Then there's revenue. As the company grows and there are more people to manage, it's unclear whether our efficiency has actually improved. Also, during the pandemic, when no one went out, ColorfulClouds Weather usage dropped and revenue was cut in half. That was very stressful.
5Y Tavern: Has Steven (Yunfeng Shi) discussed these issues with Xingyuan?
Yunfeng Shi: Running a company is always challenging, especially with ColorfulClouds on the front lines these past few years. But I think Xingyuan has strong survival instincts. As he said, he has strengths and weaknesses, and often he can use his innovation and algorithm strengths to compensate for challenges in weaker areas. For ColorfulClouds, if you can use your strengths to create a new category with capabilities others don't possess, and rely on that alone to reach a positive commercial cycle, that may be the ideal path.
Xingyuan's products always explode in popularity at launch, whether Weather, Translate, or Xiaomeng. Translate is a typical tool product; commercially speaking, it doesn't seem to have an obvious moat. But at launch, it gained a million users in a month. And Xiaomeng finding a relatively unique direction in story creation today is also quite good.
Xingyuan Yuan: If product development is a process from 1 to 100, there's a point between 59 and 60 — at 59, user perception is zero; at 60, it suddenly jumps to 60. Users feel a dramatic improvement and enthusiastically spread it organically. Why did ColorfulClouds Translate get a million users? It was the first Chinese-English neural network simultaneous interpretation software. Previous translation quality might have been 20 or 30; it was 60. But in terms of user perception, other products were zero, and Translate suddenly hit 60 — flipping that switch.
Actually, both Translate and Xiaomeng made some mistakes. Their impact at the "switch" moment was stunning, but retention wasn't as good. The demand exists, but it's not something users keep coming back to. Low retention is a problem the entire industry currently faces, and we should learn from it. The solution is deep user understanding, even becoming the user yourself: asking why they stopped using the product and identifying where the problems occur. Sometimes we find these are problems we can solve, like intelligence levels, which is exactly our strength, so I'm quite confident.
Yunfeng Shi: Xingyuan just raised a critical issue, one that pervades the AI industry today. AI has unlocked many new capabilities, attracting masses of people. Acquiring new users isn't the problem. The real challenge is retaining them for long-term, stable product use. However, today most AI products' closed loops can't be completed by one person. Operations or product managers may identify problems through user conversations but not know an algorithm exists to solve them. Most algorithm experts may be unaware of user problems. In my view, this is ColorfulClouds' greatest advantage — having colleagues like Xingyuan who can directly ask users, find root causes, and solve them at the algorithm level.
I think the upside of so many people entering the industry is that solutions are being offered, and the smartest minds have once again converged on AI. I hope more people like Xingyuan emerge.
5Y Tavern: For users, is AI-powered story creation a universal need? Or is it more for a specific subset, like current writers?
Xingyuan Yuan: I believe this will become a universal need in the future. In the past, you could only read so many books; now you can read web novels, which gives you more choices but is still not personalized enough. Everyone has stories that touch their hearts, likely closely connected to their life experiences. From products like Character AI and Replika, we see people's strong desire to have their own creative worlds and to interact with others in them. Genshin Impact already exists, so why do people still want to chat with Genshin characters in Character AI or Xiaomeng? Because they want a character that's uniquely theirs, with unique interactive experiences. It's like when parents told us bedtime stories as children, made just for you. When you grew up, wouldn't you still want that? It's not that you don't want it; it's that parents lack the time or creativity, which deprives you of that experience. Now AI is like a new engine that can help recreate that experience.
5Y Tavern: You mentioned ColorfulClouds' goal is to create works on par with human masterpieces like The Three-Body Problem, not more mediocre content. How far do you think you are from this goal, or how much have you achieved? What are the remaining hurdles?
Xingyuan Yuan: First, we divide creation into two phases: inspiration and execution. For inspiration, I think we still have a long way to go. Inspiration is the process of extensive searching and filtering to discover valuable information. For execution, story creation splits into two schools: the architect school and the gardener school. Architect-school creators plan stories like designing a house — the final blueprint is set from the first brick. The Three-Body Problem is the quintessential architect work; Cixin Liu had the entire story's trajectory conceived from the opening, including the eventual dark forest theory.
Gardener-school creators are more like people cultivating a garden: they shape characters and let them develop naturally, without fully foreseeing the garden's final form. These concepts come from George R.R. Martin, author of A Game of Thrones, who said he tended toward the gardener approach when writing it, placing complex characters and letting them grow naturally into a complex novel.
In fact, AI performs exceptionally well at gardener-style writing. It can simulate how each character behaves in various states, even surpassing humans in this regard. For example, if you start with over 100 characters with complex relationships that evolve as the story develops, you can immerse yourself, play a role, and observe what that character experiences — a fascinating experience.
I believe through this gardener-style approach, AI could absolutely reach or even surpass human masters' levels. After all, AI's compute can easily handle complex relationships among 1,000 parallel characters. Its current deficiencies are insufficient intelligence and excessive cost. I believe within about half a year, I can solve all these problems and fully present a gardener-style work.
5Y Tavern: Though many can write science fiction, The Three-Body Problem remains unique. Is this related to certain qualities of the writer himself — whether so-called talent or his distinctive style — that enabled such creation? Can AI replicate this human quality?
Xingyuan Yuan: I think the most scarce thing is coming up with the initial inspiration. "I want three suns" — and what happens in a world with three suns? That's what matters most.
Novel creation divides into two parts: inspiration and technique. Technique concerns character development and story arc management, including climaxes and twists — things most authors can achieve, not so mysterious. Cixin Liu scored highly on exams to enter North China Electric Power University, then was assigned to a power station. There, he mostly monitored data to ensure smooth operation. Since most of the time everything was fine, he wrote during idle nights. He first tried using computers to write poetry, but found the results unsatisfying, so decided to do it himself. He started with short stories, progressed to medium-length The Wandering Earth, then Ball Lightning, and finally The Three-Body Problem. He too went through a learning and growth process, continuously interacting with society and proposing ideas. Many ideas in The Three-Body Problem appeared in his earlier works — a culmination of his creative career. So to call it something machines can never achieve, or the holy grail of human civilization, may be overstating it.
I believe AI can undergo a similar process, starting from short stories and gradually progressing to medium and long forms — a journey of growth. And Chinese science fiction isn't only Cixin Liu; there are Zhao Haihong, Wang Jinkang, Hao Jingfang, magazines, author interviews, exchange events — organizations and activities collectively advancing science fiction literature. AI literature's development may follow a similar process of continuous advancement.
5Y Tavern: What are ColorfulClouds' next plans or goals?
Xingyuan Yuan: First, I want to focus on gardener-style writing, striving to make AI not just a novelty product but a sustainably running tool — a challenge the entire industry faces, and one we're working to solve. Story creation is universal demand, yet insufficiently met. We're now trying to address it with methods we excel at. Within the next year or two, I hope our product can attract at least a million users, becoming a community product, a vibrant content platform where users share their works, experience AI creation, and platform and authors share revenue together, forming a healthy community ecosystem.
In the future, if we do well in story creation, we can explore architect-style writing methods and inspiration creation. Even if we spend our whole lives focused on just this one thing, that would be wonderful, because it's a process of constantly creating new worlds. So we don't need to consider concepts like lightspeed spacecraft or time machines for now — we should focus on doing this work well first. And once story creation is perfected, time machines become unnecessary, because you can travel through worlds of your own creation, returning to any era you wish.
5Y Tavern: When did your passion for story creation first begin?
Xingyuan Yuan: I've been reading science fiction since childhood. In high school, I tried writing some campus documentary fiction and science fiction, submitting to Science Fiction World. In 2007, I attended the World Science Fiction Convention in Chengdu as a college student, exchanging with Cixin Liu.
This was an unrealized dream from my school days. In 2019, I began collaborating with China Literature on machine translation projects. To understand what I was translating, I started reading web novels again and discovered many stories were genuinely excellent. People may have once considered reading web novels decadent, but now we find web literature has become a calling card of Chinese culture. And I was surprised to find that using these Chinese web novels as AI training data worked better than English data, because Chinese web novels are vast in quantity and much longer in length — quite interesting.
So I began reading novels while writing novels, while also developing an AI novel engine. I personally debugged and created at least 10 million words of content. If each creation is like a time travel, I later counted — I traveled over 1,700 times, each to explore how AI writes and how to adjust the model. Through this process, I never felt bored, or rather overcame my own boredom. I believe this is something truly valuable, not just a flash in the pan — I've determined it's something I can do for my entire life.




