Moonshot AI K2 Released, Setting a New Open-Source SOTA! Perplexity Opens Post-Training, Tops Hugging Face Trending | Gaorong Future

高榕创投·July 14, 2025·19·0

Open Source. Model as Agent.

Last Friday evening, Moonshot AI officially released the Kimi K2 model, simultaneously open-sourcing it.

Kimi K2 is a Mixture-of-Experts (MoE) foundation model with stronger coding capabilities and greater proficiency at general agent tasks, featuring 1 trillion total parameters and 32 billion active parameters.

In benchmark tests including SWE-Bench Verified, Tau-2, and AceBench, Kimi K2 achieved state-of-the-art results among open-source models, demonstrating leading capabilities in coding, agentic tasks, and mathematical reasoning.

The model has garnered positive reception across the industry. Perplexity CEO Aravind Srinivas noted that Kimi performed exceptionally well in internal evaluations, congratulating the team on building such an impressive model and announcing that Perplexity would soon begin post-training with it.

Hugging Face co-founder Thomas Wolf also expressed his admiration, observing that the continuous improvement of open-source models is mounting a serious challenge to the latest closed-weight models, and praising the Moonshot AI team for releasing a series of impressive models over the past several months.

Following its release, Kimi K2 quickly rose to the top of the Hugging Face trending leaderboard.

Just last month, Moonshot AI launched its first AI agent, Kimi-Researcher (Deep Research), which achieved a 26.9% Pass@1 accuracy rate on the challenging AI benchmark Humanity's Last Exam (HLE) — one of the highest known scores to date. The foundation for that agent is precisely the Kimi K2 model released here.

Kimi K2 provides a solid foundation for building general agent capabilities, though truly general agents will require more advanced abilities such as reasoning and visual understanding. Moonshot AI plans to add these capabilities to Kimi K2 in the future.

Moonshot AI stated, "We hope that by fully open-sourcing more powerful models, we can further accelerate the overall progress of AGI research and application deployment." Gaorong Ventures invested in Moonshot AI in 2024, and we look forward to continuing our long-term partnership with the company as it explores the optimal path for converting energy into intelligence.

Starting today, you can experience the new Kimi K2 model by visiting kimi.com or downloading the Kimi App. API services are also live, offering Chat API endpoints compatible with both OpenAI and Anthropic formats — making it easy to switch your existing LLM tools to Kimi K2 and experience its powerful agentic and coding capabilities.

Below are more notable details about Kimi K2, including model performance improvements, groundbreaking technical explorations, open-source release, API access, and pricing.

📈 Model Performance Improvements

Kimi K2 delivers strong results in benchmark tests across three key capability dimensions: agentic coding, tool use, and mathematical reasoning.

Beyond benchmark performance, Kimi K2 also demonstrates stronger generalization and practical utility in real-world scenarios:

Coding Capability Improvements

In front-end development tasks, Kimi K2 excels at generating code with strong design sensibility and visual impact, supporting particle systems, data visualizations, and 3D scenes with robust graphics capabilities and interactivity.

Here is a 3D mountain canyon landscape generated by Kimi K2, complete with day-night cycles:

Here is a particle-effect galaxy generated by Kimi K2:

Here is a futures trading system generated by Kimi K2 in one shot — without specific instructions, Kimi autonomously selected TradingView and built a complete futures trading interface:

Agent Tool Use Capability Improvements

Kimi K2 now features stable complex instruction parsing, automatically decomposing requirements into a series of format-standardized, directly executable ToolCall structures.

You can seamlessly integrate it into agentic and coding frameworks such as OWL, Cline, and RooCode to complete complex tasks or automated coding workflows.

Agent capabilities are already available via API, with more tool features coming soon to the Kimi platform. First, take a look at real-world demos from our internal testing environment to experience the appeal of a model with powerful agentic capabilities:

For example, feed Kimi K2 130,000 lines of raw data, and it can analyze how remote work ratios affect compensation, identify statistically significant differences, automatically generate statistical charts and regression model interpretations, and produce professional visualizations — violin plots, box plots, scatter plots — in a unified color scheme, compiling everything into a report.

Or, if you're a Coldplay fan, Kimi K2 can help plan your year of fandom: booking flights and hotels for concert cities, planning travel itineraries, generating a calendar, summarizing the full itinerary in HTML, and sending it to you via email.

Stylized Writing Capability Improvements

In rewriting tasks, Kimi K2 accurately controls output style — whether adapting scientific text into a middle-schooler's voice or mimicking Apple advertising copy — while preserving both original meaning and expressive style, demonstrating strong context retention and expressive transfer abilities.

In creative writing tasks, Kimi K2 generates text with greater attention to detail and emotion, moving beyond abstract generalizations.

When given a science fiction writing prompt that once sparked widespread discussion — "What if the real world is actually an AI model?" — Kimi K2 produced a richly plotted, meticulously detailed science fiction story with passages that prove genuinely moving:

$2

$2

Here is the full story generated by Kimi K2 based on this premise:

Additionally, Kimi K2 shows improvements in general knowledge reasoning, mathematics, and planning tasks.

🌍 Open-Sourced at Launch

This release simultaneously open-sources two versions of the Kimi K2 series:

Kimi-K2-Base: The base pre-trained model without instruction fine-tuning, suitable for research and custom scenarios;
Kimi-K2-Instruct: The general instruction-tuned version (non-reasoning model), delivering excellent performance on most Q&A and agent tasks.

Model weights and FP8 weight files have been open-sourced on Hugging Face 👇

https://huggingface.co/moonshotai/Kimi-K2-Instruct

Furthermore, inference engines including vLLM, SGLang, and ktransformers have added synchronous support, allowing you to deploy on your own servers for an experience equivalent to the Kimi Open Platform API.

🧙 Technical Explorations

Kimi K2 uses the MuonClip optimizer to stably support trillion-parameter model training, significantly improving token utilization efficiency. Combined with large-scale agentic data synthesis and general reinforcement learning, the model continues to advance in general intelligence capabilities.

MuonClip Optimizer: Kimi K2 abandons the traditional Adam optimizer in favor of the innovative Muon optimizer. To mitigate attention logit inflation issues at large scale, the team proposed MuonClip and scaled it to trillion-parameter scope, improving training stability and token efficiency. Kimi K2 completed smooth training across 15.5 trillion tokens with no loss spikes throughout.
Large-Scale Agentic Tool Use Data Synthesis: The team built a scalable synthetic pipeline for generating multi-turn tool-use scenarios, covering hundreds of domains and thousands of tools. High-quality samples were filtered through LLM-based evaluation for training use.
General Reinforcement Learning: Kimi K2 applies reinforcement learning not only to verifiable tasks (coding, mathematics) but also addresses the reward sparsity problem in non-verifiable tasks through a self-judging mechanism. The critic model is continuously refined through verifiable tasks to improve performance on generalized tasks.

🧪 API and Pricing

Kimi K2 API services are now fully live, supporting context lengths up to 128K with stronger generality and tool-calling capabilities. Pricing is as follows:

Input tokens: 4 RMB per million
Output tokens: 16 RMB per million

Compatible with both OpenAI and Anthropic API formats, and works well with various frameworks. Additionally, the upgraded ToolCall capability strictly guarantees format correctness, making it suitable for complex agent tasks.

🚀 Try It Now

Visit kimi.com or download the Kimi App to start a conversation with the Kimi K2 model right away.

$2*$2*