moonshotai/Kimi-K2-Instruct

Parameters: 1000B
Precision: FP8
Context length: 32768 tokens
License: modified-mit
Overview

Kimi-K2-Instruct: An Agentic MoE Language Model

Kimi-K2-Instruct is a powerful 1-trillion-parameter Mixture-of-Experts (MoE) language model from Moonshot AI that activates 32 billion parameters per forward pass. Pre-trained on 15.5 trillion tokens with the novel MuonClip optimizer, the model is specifically engineered for advanced agentic capabilities, including sophisticated tool use, complex reasoning, and autonomous problem-solving.

Key Capabilities

  • Agentic Intelligence: Designed for robust tool use and autonomous problem-solving workflows.
  • Large-Scale MoE Architecture: Features 1 trillion total parameters with 32 billion activated, enabling high performance while managing computational efficiency.
  • Optimized Training: Leverages the MuonClip Optimizer to ensure stability and performance during large-scale pre-training.
  • Strong Performance: Achieves competitive results across various benchmarks, particularly in coding tasks (e.g., 53.7% on LiveCodeBench v6, 65.8% on SWE-bench Verified Agentic Coding) and tool use (e.g., 70.6% on Tau2 retail).
  • Extensive Context Window: Supports a 128K token context length, facilitating complex and long-form interactions.
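To make the "1 trillion total, 32 billion activated" distinction concrete, here is a minimal sketch of sparse top-k MoE routing. The dimensions, router, and expert weights below are toy placeholders, not Kimi-K2's actual architecture; the point is only that a router selects a few experts per token, so most parameters sit idle on any given forward pass.

```python
import numpy as np

rng = np.random.default_rng(0)

def topk_moe_forward(x, experts_w, router_w, k=2):
    """Toy sparse MoE layer: route a token to its top-k experts and
    mix their outputs by renormalized router probabilities."""
    logits = x @ router_w                      # one logit per expert
    top = np.argsort(logits)[-k:]              # indices of the k best experts
    probs = np.exp(logits[top] - logits[top].max())
    probs /= probs.sum()
    # Only the selected experts' weights are touched -- this is why a
    # model with huge total parameters can run with a small "activated" subset.
    return sum(p * (x @ experts_w[e]) for p, e in zip(probs, top))

d, n_experts = 8, 16                           # toy sizes, not Kimi-K2's
experts_w = rng.standard_normal((n_experts, d, d))
router_w = rng.standard_normal((d, n_experts))
x = rng.standard_normal(d)
y = topk_moe_forward(x, experts_w, router_w, k=2)
```

With k=2 of 16 experts active here, each token touches only a fraction of the layer's weights; Kimi-K2's ratio of 32B activated out of 1000B total reflects the same design at scale.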

Good For

  • Agentic Applications: Ideal for scenarios requiring advanced tool integration and autonomous decision-making.
  • Coding and Development: Excels in code generation and problem-solving, outperforming many models on coding benchmarks.
  • General-Purpose Chat: Provides a robust foundation for instruction-following and conversational AI experiences.
  • Complex Reasoning Tasks: Demonstrates strong capabilities in mathematical and logical reasoning challenges.
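For the agentic and tool-use scenarios above, here is a minimal sketch of a tool-calling request body, assuming an OpenAI-compatible chat completions endpoint. The `get_weather` tool, its schema, and the message content are hypothetical examples for illustration only.

```python
import json

# Illustrative request body for an OpenAI-compatible /chat/completions
# endpoint; the tool name and schema below are hypothetical.
request = {
    "model": "moonshotai/Kimi-K2-Instruct",
    "messages": [
        {"role": "user", "content": "What's the weather in Berlin right now?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "tool_choice": "auto",
}

body = json.dumps(request)
```

When the model decides a tool is needed, the response carries a structured tool call (name plus JSON arguments) rather than free text; the application executes the tool and feeds the result back as a `tool` message for the next turn.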