OpenAI o1
OpenAI o1 is a reasoning-focused model series released on September 12, 2024 (September 13, Beijing time). It is trained with reinforcement learning to reason before answering, a departure from the training paradigm of the GPT series, and is the first product of OpenAI's "Strawberry" project. Its core innovation is test-time compute scaling: the model spends seconds to minutes reasoning through a problem in a hidden chain of thought, testing approaches and correcting mistakes before it answers. This yields large gains in STEM domains: on a qualifying exam for the International Mathematical Olympiad (IMO), o1 scored 83% versus GPT-4o's 13%. At launch, however, it lacked GPT-4o's browsing, file upload, and image processing capabilities. The release included o1-preview and a smaller o1-mini, with API access initially restricted to tier-5 developers. OpenAI researchers framed this as a paradigm shift from scaling training compute to scaling inference-time reasoning, with Greg Brockman calling it "System 2 thinking" unlocked through chain-of-thought trained with reinforcement learning.
AI-generated — may contain errors, please verify.
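The idea of test-time compute scaling in the summary above can be illustrated with a toy sketch. This is not OpenAI's method, since o1's internals are not public. It swaps in a much simpler technique, majority voting over repeated samples (often called self-consistency), to show how spending more inference compute per question can raise accuracy. The names `noisy_solver`, `answer_with_budget`, and `accuracy`, the 40% per-attempt success rate, and the answer values are all assumptions made up for illustration.

```python
import random
from collections import Counter

CORRECT_ANSWER = 42  # hypothetical ground truth for the toy problem


def noisy_solver(rng, p_correct=0.4):
    """Hypothetical stand-in for one sampled reasoning attempt.

    Returns the right answer with probability p_correct; otherwise
    returns one of several distinct wrong answers, chosen uniformly.
    """
    if rng.random() < p_correct:
        return CORRECT_ANSWER
    return rng.choice([7, 13, 99, 100, 256])


def answer_with_budget(rng, n_attempts):
    """Spend more inference compute: sample n_attempts, take the majority vote."""
    votes = Counter(noisy_solver(rng) for _ in range(n_attempts))
    return votes.most_common(1)[0][0]


def accuracy(n_attempts, trials=2000, seed=0):
    """Fraction of trials in which the voted answer is correct."""
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    hits = sum(
        answer_with_budget(rng, n_attempts) == CORRECT_ANSWER
        for _ in range(trials)
    )
    return hits / trials


if __name__ == "__main__":
    for n in (1, 5, 25):
        print(f"attempts={n:2d}  accuracy={accuracy(n):.3f}")
```

Running the script prints the measured accuracy for budgets of 1, 5, and 25 attempts. In this toy setup accuracy rises with the budget because wrong attempts scatter across several answers while correct attempts agree. A real reasoning model differs in an important way: it explores and revises within a single chain of thought rather than voting over independent samples.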
Coverage
Moonshot AI Releases Visual Reasoning Model k1, Outperforming OpenAI o1 in STEM Benchmarks | Z News
Every pixel deserves deep thought.
ZhenFund
30,000-Word Transcript: A Google DeepMind Researcher on Deconstructing OpenAI o1 and the LLM+RL Paradigm | Z Talk
The most hardcore, no-fluff technical breakdown of o1.
ZhenFund
Yichao Ji, Peak: A Small Step Toward Replicating OpenAI o1 (Steiner Open-Source Model Progress Report) | Z Talk
Since OpenAI released o1, I've been working on reproducing it as a side project in my spare time.
ZhenFund
Moonshot AI Founder Zhilin Yang's Latest Take: Deep Reflections on OpenAI's o1 Paradigm Shift | Z Talk
The Next Phase of Foundation Models: A New Paradigm?
ZhenFund
The Pros and Cons of OpenAI's New o1 Model, Explained | Yunqi Tech π

---

On September 13, Beijing time, OpenAI released its long-awaited new model series, o1-preview and o1-mini, the first fruits of its "Strawberry" project. This represents a significant departure from the GPT series, with o1 built on an entirely new training paradigm.

## What Makes o1 Different?

The core innovation is **test-time compute scaling**: essentially, giving the model more time to "think" before responding. o1 spends seconds to minutes internally reasoning through problems, testing different approaches, and correcting its own mistakes before delivering an answer.

This mimics how humans tackle complex problems: we don't always blurt out the first thing that comes to mind. We pause, consider alternatives, backtrack when stuck. o1 does something analogous, using a **chain-of-thought** process that's hidden from the user.

## Where o1 Shines

**STEM domains, particularly math and coding.** OpenAI's benchmarks show dramatic gains:

- **International Mathematical Olympiad (IMO) qualifying exam**: GPT-4o solved 13% of problems; o1 reached **83%**
- **Codeforces competitive programming**:
Where's the ceiling for large language models?
Yunqi Capital




