moonshotai/Kimi-K2.5

Warm
Public
1000B
FP8
32768
Jan 27, 2026
License: other
Hugging Face
Overview

Kimi K2.5 is a multimodal agentic model developed by Moonshot AI, built upon Kimi-K2-Base through continual pretraining on approximately 15 trillion mixed visual and text tokens. It features a Mixture-of-Experts (MoE) architecture with 1 trillion total parameters and 32 billion activated parameters, supporting a 256K context length. The model seamlessly integrates vision and language understanding with advanced agentic capabilities, offering both instant and thinking modes.
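Since the model accepts mixed visual and text input, a request typically interleaves image and text content parts. Below is a minimal sketch of such a request in the widely used OpenAI-compatible chat format; the image URL is a placeholder and the exact parameter surface (e.g., how instant vs. thinking mode is selected) depends on the serving provider, so treat this as an assumption to verify against their docs.

```python
# Hedged sketch: a multimodal chat request body in the common
# OpenAI-compatible format. Model ID matches this card; everything
# else (URL, max_tokens support) is an illustrative assumption.
payload = {
    "model": "moonshotai/Kimi-K2.5",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this chart and summarize the trend."},
                {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
            ],
        }
    ],
    "max_tokens": 1024,
}

# With an OpenAI-compatible client this would be sent roughly as:
#   client.chat.completions.create(**payload)
print(payload["model"])
```

The list-of-parts `content` field is what lets a single user turn carry both the instruction text and the image reference.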

Key Capabilities

  • Native Multimodality: Excels in visual knowledge, cross-modal reasoning, and agentic tool use, pre-trained on vision–language tokens.
  • Coding with Vision: Capable of generating code from visual specifications (e.g., UI designs, video workflows) and orchestrating tools for visual data processing.
  • Agent Swarm: Supports a self-directed, coordinated swarm-style execution scheme that decomposes complex tasks into parallel sub-tasks, each executed by a dynamically instantiated, domain-specific agent.
  • High Performance: Demonstrates strong performance across reasoning, knowledge, image, video, coding, and long-context benchmarks, often comparable to or exceeding other large proprietary models, especially with tool augmentation and agent swarm capabilities.
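The swarm pattern described above (plan, fan out to specialized agents, collect results) can be sketched in a few lines. This is a conceptual illustration only: the agents here are plain functions standing in for model calls, and none of the names reflect Kimi K2.5 internals.

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative stand-ins for domain-specific agents; in practice each
# would be a separate model invocation with its own role prompt.
def research_agent(topic: str) -> str:
    return f"notes on {topic}"

def coding_agent(spec: str) -> str:
    return f"code for {spec}"

def plan(task: str):
    # A real planner would be the model itself deciding the split;
    # here the decomposition is hard-coded for clarity.
    return [
        (research_agent, f"{task}: background"),
        (coding_agent, f"{task}: prototype"),
    ]

def run_swarm(task: str) -> list[str]:
    # Fan the sub-tasks out in parallel, then gather the results in order.
    subtasks = plan(task)
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(agent, arg) for agent, arg in subtasks]
        return [f.result() for f in futures]

results = run_swarm("dashboard")
```

The key property the bullet describes is that sub-tasks run concurrently and each goes to an agent matched to its domain, rather than one agent handling the whole task serially.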

Good for

  • Applications requiring deep visual understanding and reasoning.
  • Automated code generation from visual inputs.
  • Complex task decomposition and execution using agentic workflows.
  • Scenarios demanding long-context processing and multimodal input handling (images, videos, text).
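For the agentic-workflow use cases above, tools are usually exposed to the model via a declared schema. The sketch below uses the common OpenAI-style function-calling format that agentic serving stacks typically consume; the tool itself (`extract_frames`) and its parameters are hypothetical, invented here to illustrate a video-processing tool, so verify the exact schema with your provider.

```python
# Hedged sketch: an OpenAI-style tool declaration. The tool name,
# description, and parameters are made-up examples, not a real API.
tools = [
    {
        "type": "function",
        "function": {
            "name": "extract_frames",  # hypothetical video-workflow tool
            "description": "Extract key frames from a video for visual analysis.",
            "parameters": {
                "type": "object",
                "properties": {
                    "video_url": {"type": "string"},
                    "max_frames": {"type": "integer"},
                },
                "required": ["video_url"],
            },
        },
    }
]
print(tools[0]["function"]["name"])
```

At request time this list would be passed alongside the messages, letting the model decide when to call the tool as part of a larger visual-processing workflow.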