You call the new DeepSeek R1 a minor update? R2 is coming.

葬AI·May 28, 2025

The universe cools into an eternal winter as heat death claims all.

"R2, on the way"

With zero warning, DeepSeek today dropped a surprise update to its strongest reasoning model, R1.

It's been four months since R1's release.

Now, top AI companies are pushing major version updates every quarter. Advanced models like Gemini 2.5 Pro have already surpassed DeepSeek R1 across the board.

The strongest open-source model, Qwen 3, also launched at the end of April with performance roughly on par with R1.

DeepSeek R1's biggest advantage is being open source — both Tencent and Baidu have integrated it. Its biggest contribution was being the first to publicly validate that chain-of-thought reasoning and reinforcement learning represent a new technical path for boosting large model capabilities.

But R1 has persistent problems: a high hallucination rate, a tendency to fabricate data, and the need for massive data integration to keep hallucinations in check. Its writing style is erratic, prone to piling on flashy concepts like "quantum mechanics," making it poorly suited for long-form writing.

I prepared four tests across writing and coding to see how the new DeepSeek R1 compares to the old version, and how it stacks up against other mainstream models.

Major Writing Improvements

For the first test, I had both versions of R1 write an ultra-short Three-Body Problem fanfiction piece. The difference was obvious: the new DeepSeek R1's style has become more plainspoken, cautious, and accessible. Honestly, I could barely understand what the old version was trying to say.

Prompt: Write an ultra-short Three-Body Problem fanfiction about Cheng Xin's life after the universe's Big Crunch, in one paragraph, no more than 150 words.

New DeepSeek R1

The universe had frozen into eternal night through heat death. Cheng Xin huddled on a lone rock at the edge of a white hole.

At this end of the world, she futilely gathered the final specks of stellar debris that the white hole ejected — blue-shifted into dim, ghostly points of light. Just as she had once carefully sealed and dispatched Yun Tianming's brain slice to the stars, now she cupped these fragments of light to her chest, as if trying to reignite the universe's frozen heart with their faint warmth.

Old DeepSeek R1

Cheng Xin curled within the initial crystalline lattice of the new universe, still clutching the fish-shaped vacuum decay detector. The Returner's mathematical template flickered across her retinas as the parameters of a universe from 13.7 billion years ago recombined through supermembrane oscillations, while her memory banks drifted with countless blue planets swallowed by dual-vector foils. When the first rays of genesis radiation pierced her translucent body, she suddenly swapped the values of the strong and weak nuclear forces — newborn stars bloomed into rose nebulae across expanding spacetime. This would be the first universe to permit regret. At the tail of the detector, where curvature ripples lingered, three pale golden sophons quietly grew.

The ashes of the universe crystallized in her palm, as if answering a wait that would never end.

The contrast is stark. The new R1 opens with "The universe had frozen into eternal night through heat death. Cheng Xin huddled on a lone rock at the edge of a white hole" — clean and forceful, free of the old version's concept-stacking habit.

In the second test, I had the new R1, Claude, and other models rewrite the first article from the Burying AI newsletter, comparing just the opening paragraphs. On concision alone, the new DeepSeek R1 ranked first — its language was the most refined, actively stripping away nearly all redundant information.

Prompt: Great, now write the full article following this outline. Use fluent, natural language. No bullet points. Preserve my spoken voice, keep some of my original phrasing. Don't overuse adjectives and adverbs. Express meaning through simple verbs and nouns as much as possible.

New DeepSeek R1

Let's cut to the chase: those AI products flooding the market — "one sentence generates an article," "one sentence generates a video," "one sentence generates a podcast" — mostly have no real value. They're just manufacturing internet garbage. The reason is simple: they completely bypass the most central, most difficult part of content creation — information input.

Claude Sonnet 4

Stop building "one sentence generates an article, one sentence generates a video, one sentence generates a podcast" products. They have no value. They're just creating internet garbage. The reason is simple: these products don't solve the information input problem at all. If you only input one sentence, how can you generate high-quality content? Where does the information come from?

Gemini 2.5 Pro

There are many AI products on the market now claiming to "generate an article in one sentence, generate a video in one sentence, generate a podcast in one sentence." Honestly, in my view these products have no value. They're just creating internet garbage. The reason is simple: they completely fail to solve the information input problem. If a user only inputs one sentence, how can the AI generate high-quality content? Where does the information come from?

Qwen 3-235B

I've been thinking about something lately. All those AI products claiming "one sentence generates an article, podcast, or video" actually have no value. They're just creating internet garbage. Why do I say this? Because these products don't solve the information input problem at all. If a user only inputs one sentence, where can the AI extract enough unique information to generate high-quality content? It can only scrape from existing generic internet material, ultimately stitching together a pile of clichés.

But DeepSeek R1 has a serious problem: it didn't follow my instruction to "use fluent, natural language, no bullet points." The entire article was packed with bullet points — almost unusable.

Programming Aesthetic Gets an Upgrade

The first coding test had several models build a Big Mac 3D model. Claude Sonnet 4 clearly performed best, with the new DeepSeek R1 placing second. The new R1's 3D 🍔 even surpassed Gemini 2.5 Pro Preview, which had launched just 22 days earlier.

Prompt: Use Python to create a Big Mac 3D model, ensure it can run directly.

The final test was the classic ball physics simulation: 20 balls dropping one by one into a rotating hexagonal container, never passing through the walls.

The new DeepSeek R1 achieved high completion — the balls indeed didn't clip through the container. Its main issue was that all balls dropped at once rather than appearing sequentially. R1 even outperformed Claude Sonnet 4, whose generated balls mostly passed through the container walls.

Gemini 2.5 Pro performed best on this test with no obvious issues.

Prompt: Using Python, simulate 20 small balls bouncing inside a rotating hexagonal container. The 20 balls should be of different colors, continuously appearing and falling from the center of the rotating hexagonal container, but they should never pass through the container walls. Consider the effects of gravity and friction on the balls' motion, and implement realistic collision and rebound effects between the balls and between the balls and the rotating boundaries.

Each Model's Character

After months of heavy use across models, here's how they actually feel:

Gemini 2.5 Pro is like a STEM student with excellent logic but poor aesthetic sense. Claude Sonnet 4 is the class president excelling in both arts and sciences — top-tier at writing and coding, stable and instruction-following.

Qwen 3 doesn't clearly surpass DeepSeek R1 in any single area, but its overall capability is more stable than the old version. Dumber, but more stable. Its prose is more cautious and plain, more suited for actual work.

There's a popular verdict on old DeepSeek R1: smart but won't work. The new DeepSeek R1 is smarter, but even less obedient.

Over the past two months, I've used Gemini 2.5 Pro the most — nearly free, with context up to 2 million tokens, handling all my writing needs. But after Claude Sonnet 4 launched, I returned to Claude. It simply has the strongest writing ability.

Qwen 3 surprised me. Though lacking standout features, its clean and plain prose sometimes outperforms even Claude 3.7.

After this minor DeepSeek R1 update, word is that DeepSeek R2 is coming soon.

This makes sense. It's been four months since DeepSeek R1's release. Following the quarterly major-version cadence of top AI companies, R2 truly is on the way.

(Images in this article generated by ChatGPT-4o, with writing assistance from Claude Sonnet 4.)