Moonshot AI K2 Released, Aiming for New Open-Source Model SOTA

Monolith砺思资本·July 12, 2025·40·0

The world's first open-source, commercially-usable trillion-parameter model

MONOLITH Recommends:

Yesterday, Moonshot AI officially released and open-sourced Kimi K2.

Kimi K2 is a one-trillion-parameter Mixture-of-Experts (MoE) model. Based on benchmark results, it surpasses closed-source models like GPT-4.1 and Claude 4 Opus in autonomous programming, mathematical reasoning, and other performance metrics. It sets new state-of-the-art (SOTA) records among open-source models on three key benchmarks: SWE Bench Verified (programming), Tau2 (agentic tasks), and AceBench (tool use).

Last month, Moonshot AI launched its first AI agent, Kimi-Researcher, which scored 26.9% on Humanity's Last Exam (HLE) — a notoriously difficult benchmark — setting the highest score ever recorded on that test. The foundation model powering it is the very same Kimi K2 released here.

Moonshot AI's mission is to find the optimal way to convert energy into intelligence. On the path of continuously pushing the boundaries of intelligence, there are no smooth roads — only the relentless exploration and persistence of entrepreneurs. We look forward to seeing more emergence and creation in the future.

Kimi has officially released the Kimi K2 model and simultaneously open-sourced it.

Kimi K2 is an MoE-based foundation model with stronger coding capabilities and greater proficiency in general agent tasks, featuring 1 trillion total parameters and 32 billion active parameters.

On benchmarks including SWE Bench Verified, Tau2, and AceBench, Kimi K2 has achieved SOTA results among open-source models, demonstrating leading capabilities in coding, agentic tasks, and mathematical reasoning.

During pre-training, Kimi K2 used the MuonClip optimizer to enable stable and efficient training at the trillion-parameter scale. At a time when high-quality human data has become a bottleneck, it effectively improves token utilization efficiency and finds new scaling headroom.

Other key technical innovations include large-scale agentic tool-use data synthesis and general reinforcement learning with self-evaluation mechanisms. For more details, refer to Kimi's technical blog.

Starting today, visit kimi.com or download the Kimi app to experience the new Kimi K2 model. API services are also live, offering OpenAI- and Anthropic-compatible Chat API interfaces. You can easily switch your existing LLM tools to Kimi K2 and experience its powerful agent and coding capabilities.

Kimi K2 provides a solid foundation for building general agent capabilities, but general agents will require more advanced abilities such as reasoning and visual understanding. Kimi plans to add these capabilities to Kimi K2 in the future.

Kimi hopes that by fully open-sourcing a more powerful model, it can further accelerate the overall progress of AGI research and real-world application.

📈 Model Performance Improvements

Kimi K2 delivers excellent performance on benchmarks across three capability dimensions: agentic coding, tool use, and mathematical reasoning.

Beyond benchmark tests, Kimi K2 also demonstrates stronger generalization and practical utility in real-world scenarios:

Coding Capability Improvements

In front-end development tasks, Kimi K2 excels at generating code with strong design sensibility and visual impact, supporting particle systems, data visualizations, and 3D scenes, with robust graphics capabilities and interactivity.

Below is a 3D mountain canyon landscape generated by Kimi K2, supporting day-night cycles:

Prompt: Create a 3D HTML mountain scene with cliffs, rivers, and day-night lighting. Supports drag/zoom, animated transitions, realistic gradients, and toggleable contour lines...

Here is a particle-effect galaxy generated by Kimi K2:

Prompt: Create a 3D particle galaxy with swirling nebulas, dynamic lighting.

Here is a futures trading system generated by Kimi K2 in one shot. Without specific instructions, Kimi automatically selected TradingView and built a complete futures trading interface:

Prompt: Create a HTML!! an immersive browser-based futures trading simulator with professional-grade UI/UX using modern JavaScript libraries. Focus on real-time visualizations and interactive trading mechanics.

Agent Tool-Use Capability Improvements

Kimi K2 now has stable complex instruction parsing capabilities, automatically breaking down requirements into a series of well-formatted, directly executable ToolCall structures.

You can seamlessly integrate it into agent/coding frameworks such as owl, Cline, and RooCode to complete complex tasks or automated coding.

Agent capabilities are already available via API, with more tool capabilities coming soon to the Kimi platform. First, take a look at real-world demos from Kimi's internal testing environment to experience the power of a model with strong agentic capabilities:

For example, feed Kimi K2 130,000 rows of raw data, and it can analyze how remote work ratios affect salary, identify statistically significant differences, automatically generate statistical charts and regression model interpretations, and produce professional visualizations — violin plots, box plots, scatter plots — in a unified color scheme, compiling everything into a report.

Or, if you're a Coldplay fan, Kimi K2 can plan your year of fandom for you: mapping out concert cities, booking flights and hotels, planning travel itineraries, generating a calendar, summarizing the full schedule in HTML, and emailing it to you.

Stylized Writing Capability Improvements

In rewriting tasks, Kimi K2 can precisely control output style. Whether rewriting academic text in the voice of a middle schooler or mimicking Apple advertising copy, it preserves both the original meaning and the target style, demonstrating strong contextual retention and expressive transfer abilities.

In creative writing tasks, Kimi K2 generates text that pays more attention to detail and emotion, rather than speaking in vague abstractions.

When given a sci-fi writing challenge that once sparked widespread discussion — "What if the real world is actually an AI model?" — Kimi K2 produced a richly plotted, detail-filled science fiction story with passages that genuinely move the reader:

$2

$2

Here is the full text of the work Kimi K2 generated based on this premise:

Additionally, Kimi K2 shows improvements in general knowledge reasoning, mathematics, and planning tasks.

🌍 Open-Sourced at Launch

Kimi has simultaneously open-sourced two versions of the Kimi K2 series:

Kimi-K2-Base: The base pre-trained model without instruction fine-tuning, suitable for research and custom scenarios;
Kimi-K2-Instruct: The general instruction-tuned version (non-reasoning model), delivering excellent performance on most Q&A and agent tasks.

Model weights and fp8 weight files have been open-sourced on Hugging Face 👇

https://huggingface.co/moonshotai/Kimi-K2-Instruct

Additionally, inference engines including vLLM, SGLang, and ktransformers have added support, allowing you to deploy on your own servers for the same experience as the Kimi Open Platform API.

🧙 Technical Exploration

Kimi K2 uses the MuonClip optimizer to robustly support trillion-parameter model training, significantly improving token utilization efficiency. Combined with large-scale agentic data synthesis and general reinforcement learning, the model continues to advance in general intelligence capabilities.

MuonClip Optimizer: Kimi K2 abandoned the traditional Adam optimizer and innovatively adopted the Muon optimizer. To mitigate the problem of attention logits growing too large during large-scale training, Kimi proposed MuonClip and scaled it to the trillion-parameter level, improving training stability and token efficiency. Kimi K2 completed stable training on 15.5 trillion tokens with no loss spikes throughout.
Large-Scale Agentic Tool-Use Data Synthesis: Kimi built a synthetic pipeline capable of generating multi-turn tool-use scenarios at scale, covering hundreds of domains and thousands of tools. High-quality samples were filtered using LLM-based evaluation for training.
General Reinforcement Learning: Kimi K2 applies reinforcement learning not only to verifiable tasks (code, math) but also solves the reward sparsity problem for unverifiable tasks by introducing a self-judging mechanism. By continuously optimizing the critic on verifiable tasks, it improves performance on generalized tasks.

🧪 API and Pricing

Kimi K2 API services are now fully live, supporting contexts up to 128K tokens with greater generality and tool-use capabilities. Pricing is as follows:

Per million input tokens: 4 RMB
Per million output tokens: 16 RMB

Compatible with both OpenAI and Anthropic API formats, and works well with various frameworks. Additionally, the upgraded ToolCall capability strictly guarantees format correctness, making it suitable for complex agent tasks.

🚀 Try It Now

Visit kimi.com or download the Kimi app to start a conversation with the Kimi K2 model right away.