HeyGen isn't for humans.

葬AI · June 1, 2026

A reckless lurch backward into history

"Reversing Full Speed into the Past"

Most people haven't heard of HeyGen. In one sentence: HeyGen is the Manus of an alternate timeline—the video agent version.

Both of HeyGen's founders are Chinese, and they too triggered a mass employee migration. But HeyGen's globalization has been relatively successful so far. Its only real connection to China's internet is basically that viral AI video of Taylor Swift speaking Chinese.

It may seem unremarkable now, but back in 2023, when the output of AI video tools basically amounted to sleep-talking, this clip, with its smooth motion and perfectly synced lips, genuinely freaked everyone out.

So they doubled down on the deepfake track. Visit their website and it's nothing but digital human products. They basically don't do anything else.

Now this business has pushed HeyGen's ARR past $100 million. Terrifying.

I tried it out and made a video of a Russian beauty who comes to China and asks for no bride price. The UX and final product were decent.

I suspect HeyGen is pretty popular over in the "industrial parks." A lot of folks in Jiangxi have definitely been burned by this software.

But I'm not writing about it today because I suddenly got excited about digital humans. It's because HeyGen recently claimed to have developed some kind of open-source video product where you "just write code" to make videos—HyperFrames.

I thought, isn't writing code harder than editing video? What's the logic behind using something harder to accomplish something relatively easier?

The product emphasizes that it's for AI agents, not humans

I looked into it. This "write code to make videos" thing is actually a newly hyped concept from the past couple months called Vibe Motion.

The logic behind the Vibe Motion hype goes like this:

AI video relies on diffusion, feeling its way across the river. The randomness isn't just a total gacha-fest; it's also computationally expensive.

But AI is naturally good at writing front-end code, and front-end can achieve lots of motion effects. So just have AI make motion effects and videos by writing HTML code.

And the relationship between HTML code and motion effects is deterministic—easy to modify, no more daily gacha pulls.

A beautiful vision.
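To make the "deterministic" claim concrete, here is a minimal sketch in Python. It is not HyperFrames' or Remotion's actual API; `render_frame` and `ease_out_cubic` are hypothetical helpers I made up for illustration. The point is only that each frame's HTML is a pure function of the frame number, so the same input always gives the same picture, and a tweak to one number changes exactly one thing.

```python
from html import escape


def ease_out_cubic(t: float) -> float:
    """Map linear progress t in [0, 1] to eased progress in [0, 1]."""
    t = min(max(t, 0.0), 1.0)  # clamp so frames past the end hold the final pose
    return 1.0 - (1.0 - t) ** 3


def render_frame(frame: int, fps: int = 30, text: str = "Hello") -> str:
    """Return the HTML for one frame of a one-second slide-in title.

    Hypothetical helper: position and opacity depend only on the frame
    index, so there is no randomness anywhere in the pipeline.
    """
    progress = ease_out_cubic(frame / fps)
    offset_px = round(40 * (1.0 - progress))  # slides up from 40px below
    opacity = round(progress, 3)
    style = f"opacity:{opacity};transform:translateY({offset_px}px)"
    return f'<h1 style="{style}">{escape(text)}</h1>'


if __name__ == "__main__":
    # First, middle, and last frame of the one-second animation.
    for frame in (0, 15, 30):
        print(frame, render_frame(frame))
```

Rendering frame 15 twice gives byte-identical HTML, which is the whole pitch: no re-rolling. A real tool would then screenshot each frame in a headless browser and stitch the images into a video; that part is left out here.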

Actually, there was already an open-source project called Remotion doing this back in 2021. The main difference is HyperFrames writes HTML while Remotion is React-based, but the concept has been around.

The reason this got reheated in the past couple months is probably because Higgsfield—a famous US AI video unicorn (think America's TapNow)—launched a product called Higgsfield Vibe Motion, commercializing what open-source community folks had been doing for the public good. HeyGen saw this and got hungry, rushing to follow suit.

So I tested these Vibe Motion products (mainly HyperFrames) by making a few videos. I ultimately found their self-positioning quite clear: they genuinely aren't for human use.

First, to use HyperFrames, you need an existing AI agent like Codex or Claude Code. (I used Claude Code.)

Then following the official instructions, you input a series of commands to have your AI agent equip itself with the HyperFrames Skill.

After installation, you can start generating videos. I tried the usual routine—asked it to make a promotional video for Doubao.

The result scared me.

Actually the fonts, motion effects, and color scheme were all decent. But the TTS voiceover was like the whispers of some ancient god.

I thought Doubao had been possessed by Sun Xiaochuan and was speaking in reverse.

I asked about it—apparently it was a bug, though I didn't understand the explanation either.

Unclear whether non-functional Chinese voiceover represents another effort in HeyGen's globalization

To test its claimed accuracy, I batch-sent instructions through Claude Code's dialog box, having HyperFrames modify each shot and switch the voiceover to English. The result:

I gotta say, it did follow the requirements. Though the added sound effects all felt rather perfunctory.

Beyond conversing with an AI agent to modify videos, HyperFrames offers another modification method: they provide a Studio where you can select elements to modify fonts, colors, sizes, and motion effects.

The Studio also has a built-in "Ask agent" feature that packages detailed element information into a prompt, making it easier to communicate requirements to your AI Agent.

The Studio's left side is literally the front-end page—if you know code, you can directly modify it.

Sounds great. But in reality this Studio is extremely unstable. After a few edits it black-screens or errors out, and they're all code issues I don't understand.

What I'm saying is, if I could understand these, would I be using your product to do generative AI video work?

Later I sent the same requirements to Remotion and Higgsfield Vibe Motion.

Remotion result:

Higgsfield Vibe Motion result:

Remotion has better aesthetics; Higgsfield Vibe Motion has a friendlier interface. But the outputs are roughly the same. Based on my limited front-end knowledge, the motion effects they can achieve are interchangeable.

Then I wanted to test dynamic chart generation. Previously I'd have to go to pirated sites to download After Effects templates.

So I had HyperFrames make a video explaining Beijing housing price trends.

The result:

The voice and visuals match up, and the statistical charts animate smoothly.

But it has zero entertainment value—just feels like a dynamic PowerPoint.

Looking at this video, I suddenly understood why HeyGen is doing Vibe Motion:

First, building a proper video model from scratch costs too much. Vibe Motion is essentially just writing a Skill—no training needed—and fills a product gap.

Second, HeyGen already does digital humans. When digital humans do talking-head videos, whether explaining knowledge or analyzing events, having a dynamic PowerPoint in the background actually makes sense.

So I sent my previous article "The AI Hype Bible: Shocking Premiere" to Claude, generated a script, used HeyGen to generate a digital human, then used HyperFrames to generate video for the background.

Disclaimer: The video in the bottom-right corner is me—movements and voice are directly generated by HeyGen, which even simulated my voice.

This product feels perfect for Yu Hao. With this, his homepage wouldn't need to be all identical faces as thumbnails.

But actually it's not that suitable for Yu Hao either. My Claude Plus plan's compute basically only lasts for 2-3 videos per 5-hour window. In 5 hours, Yu Hao could record 500 videos with his camera on.

Then I thought, since it's writing HTML, theoretically it can draw vector graphics, theoretically it can do simple animation.

So I had Claude write a detailed script and had HyperFrames generate a South Park-style story about the "Sam's Club Top Scorer." Main plot: a Sam's Club opens in Shandong, and someone races to be first through the door and claim the title of "Sam's Club Top Scorer," only to discover that everyone else was taking the civil service exam that day. The person breaks down.

Based on a true recent event

The result:

Actually, as a purely HTML-generated video, the effect is already quite good.

But we have to admit: when judged within the category of "video," the conclusion is—this is garbage.

That includes the earlier promotional video and the data journalism short: they're quite competent as HTML motion effects, but as videos they aren't fit to be served.

Ultimately, Vibe Motion products in this wave are simply not what the market needs from video agent products.

Vibe Motion developers say: we used AI to make text move. By definition, moving images are video. So we're AI video.

Isn't this pure self-indulgence?

At the end of the day, go scroll Douyin or TikTok—how many viral videos bear any resemblance to what you're generating?

From another angle, who's watching short videos to see screen recordings of your HTML-generated dynamic web pages?

Right? Like, you can't go around announcing that you made a film and then reveal it's actually a clip of Arrival of a Train. Same logic.

Products need to keep up with the times, or they're reversing into the past—with terrible consequences.

Moreover, the logic behind inventing Vibe Motion is itself problematic:

"Since AI is good at writing HTML, let's have AI make videos by writing HTML"—this doesn't start from user needs at all. It's from the developer's perspective, completely bass-ackwards.

Picking a development path because it's convenient rather than useful is the lazy way out for a product manager.

By this logic, you might as well say large language models are great at answering questions in text, so we should develop a product that screen-records chatbot dialog boxes. And since this involves AI, it's also an AI video agent.

Very funny.

Finally, as a human, using various Vibe Motion products—especially HyperFrames—I genuinely felt this wasn't made for humans.

Set aside the endless bugs. In multiple scenarios, such as downloading software and exporting videos, I was required to open my Mac's Terminal and type in commands to get anything done.

And it frequently lectured me on coding knowledge in six different languages.

You technical folks may find this routine, but for me it was genuinely psychologically terrifying—recalling that night I followed an online guide to deploy a lobster, which ended in a black screen and dead computer, forcing me to go to the Apple Store for repairs 🤡

HyperFrames is quite sly, preemptively stating their product is for Agents. The official site is full of code, documentation full of incomprehensible jargon. The central message seems to be: noobs get lost, we only welcome Detroit: Become Human characters and geeks.

Seriously, including HyperFrames, many AI products today are somewhat like cosplaying Japanese sushi masters—mystifying the entire dining experience by having apprentices wash dishes and steam rice for ten years before starting. Diners like us can't understand. Asking makes us look like uncultured rubes, subject to mockery by those in the know. Either way, it's our fault.

Honestly, as consumers of AI products, we shouldn't have to endure this much. If you developers can't make products that a three-year-old can understand, just delete your GitHub accounts and go back to farming. Thanks ❤️

By the way, you can subscribe to funeralai.substack.com 💓—readable by any actual human.

(This article's cover image was generated by ChatGPT; the text is purely human-written.)