Transformer
The Transformer is the deep-learning architecture that has served as the de facto foundation of modern AI since its introduction in 2017. Its core mechanism is attention, which lets a model selectively weight the importance of different positions in a sequence when producing an output, in effect deciding "which word to look at" when processing language. In the view of Moonshot AI founder Zhilin Yang, what made the Transformer uniquely powerful was its near-limitless scalability: unlike recurrent or convolutional networks, which tend to plateau, the Transformer keeps improving as parameters and compute are added, enabling the emergence of general-purpose learning.
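As a rough illustration of the "which word to look at" idea, the sketch below implements scaled dot-product attention for a single query in plain Python. The function names (`softmax`, `attention`) and the toy vectors are assumptions made for this example. They are not taken from any of the sources listed under Coverage, and the sketch leaves out the learned projections, multiple heads, and masking found in real models.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query vector.

    query:  list of d floats
    keys:   list of n vectors, each with d floats
    values: list of n vectors of equal length
    Returns (output_vector, weights).
    """
    d = len(query)
    # Similarity of the query to each position, scaled by sqrt(d).
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    # Normalize the similarities into weights that sum to 1.
    weights = softmax(scores)
    # The output is the weighted average of the value vectors.
    dim_v = len(values[0])
    output = [
        sum(w * v[j] for w, v in zip(weights, values))
        for j in range(dim_v)
    ]
    return output, weights

# Toy example (hypothetical numbers): three positions in a sequence.
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
query = [0.0, 4.0]  # points in the same direction as the second key

output, weights = attention(query, keys, values)
```

Here the second position receives the largest weight because its key is most similar to the query, so the output is dominated by the second value vector.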
The architecture has since become the target of fundamental critique and overhaul. Researchers note that the standard Transformer lacks read-write memory and recursive mechanisms, which makes it brittle on complex, long-horizon reasoning tasks where intermediate states must be tracked. Others have zeroed in on its "egalitarian" residual connections, in which every layer's output is weighted equally, as an efficiency bottleneck in very deep models, since early, important signals get diluted by later noise. In response, the field is actively exploring alternatives and patches: state-space models such as Mamba, linear attention variants, and stack-based mechanisms that promise better state management. Yunqi Capital's reporting also flags Google's "Nested Learning" paradigm, which introduces multi-timescale updates inside Transformer-like blocks to mimic neural plasticity.
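The dilution concern about equally weighted residual connections can be illustrated with a toy calculation. The sketch below is only an arithmetic illustration under a strong, hypothetical assumption: every layer adds a scalar output of the same magnitude to the residual stream. The helper names `residual_stream` and `first_layer_share` are invented for this example and do not come from the cited critiques.

```python
def residual_stream(x, layer_outputs):
    """Unweighted residual accumulation: h = x + sum of all layer outputs."""
    h = x
    for out in layer_outputs:
        h = h + out  # every layer is added with coefficient 1
    return h

def first_layer_share(depth):
    """Fraction of the final stream contributed by layer 1.

    Assumes (hypothetically) an input of 0.0 and that each of the
    `depth` layers contributes an output of magnitude 1.0.
    """
    outputs = [1.0] * depth
    return outputs[0] / residual_stream(0.0, outputs)

# Share of the first layer's signal at increasing depths.
shares = {depth: first_layer_share(depth) for depth in (4, 32, 128)}
```

Under this assumption the first layer's share is 1/depth, so it shrinks steadily as layers are stacked. Real networks are not this uniform, but the calculation shows why an unweighted sum gives early layers no way to stay prominent.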
The name has taken on brand resonance beyond the technical layer. Yunqi Capital, for instance, explicitly named its youth-focused AI founder program "Y Transformers" as a dual nod to the architecture and to "transformation," signaling its belief that the next generation of entrepreneurs will reshape the technology rather than merely adopt it.
AI-generated — may contain errors, please verify.
Coverage
Yunqi Capital · Google Explores a New "Continuous Learning" Paradigm: Nested Learning, an AI "Perpetual Motion Machine"? | Yunqi Tech π
Can AI's "Amnesia" Be Cured?
5Y Capital · Opening the "Black Box," Building In-House Models, and a Chat About AI Entrepreneurship and Creation | 5Y Pub Vol. 22 with Yuan Xingyuan of ColorfulClouds Technology
How can we get AI to create works on par with *The Three-Body Problem*?
Waves · Finally, Someone's Building a 3D Virtual Girlfriend | WAVES
Love is all you need.
Waves · A Scientist's Attempt to Become a Better CEO | WAVES
Starting another company.
ZhenFund · Moonshot AI Founder Zhilin Yang's Latest Take: Deep Reflections on OpenAI's o1 Paradigm Shift | Z Talk
The Next Phase of Foundation Models: A New Paradigm?
ZhenFund · Ten-Thousand-Word Conversation with Scale AI Founder Alex Wang: Why Data, Not Compute, Is the Biggest Bottleneck for Large Models | Z Talk
We've exhausted all the easily accessible data.
5Y Capital · Thinking is a mechanical process, AI are going to do it | 5Y View
Game out the trajectory of AI with Nat Friedman and Daniel Gross.
5Y Capital · Heaven's Feel: The Root of AI and Gaming | 5Y View
Any sufficiently advanced technology is indistinguishable from magic.
Gaorong Ventures · The Next Generation of Productivity Tools Is Here: How Should Entrepreneurs Embrace the AIGC Wave? | Ronghui Dialogue
"Every entrepreneur, developer, and creator should be paying attention to advances in AIGC."
Waves · ChatGPT is the talk of Silicon Valley, but the buzz is theirs.
"Some people laughed. Some people cried. The vast majority fell silent."
5Y Capital · "Consistent effort, a life without slack" | 5Y Capital Tavern Vol. 9 × Yuan Xingyuan of ColorfulClouds Technology
AI makes our lives better.










