XiaomiMiMo/MiMo-V2-Flash
Text Generation · Concurrency Cost: 4 · Model Size: 310B · Quant: FP8 · Ctx Length: 32k · Published: Dec 16, 2025 · License: MIT · Architecture: Transformer · Open Weights

MiMo-V2-Flash by XiaomiMiMo is a 309B total parameter Mixture-of-Experts (MoE) language model with 15B active parameters, designed for high-speed reasoning and agentic workflows. It features a novel hybrid attention architecture and Multi-Token Prediction (MTP) for efficient inference and long-context handling up to 256k tokens. The model excels in complex reasoning tasks and agentic capabilities, including code generation and web development, achieved through advanced post-training techniques like Multi-Teacher On-Policy Distillation (MOPD) and large-scale agentic RL.
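The "15B active out of 309B total" figure comes from the MoE routing step: each token's router scores all experts but only the top-k run. The sketch below illustrates generic top-k gating in plain Python; the expert count, k value, and scores are made up for illustration and are not MiMo-V2-Flash's actual router configuration.

```python
import math

def top_k_gate(logits, k=2):
    """Select the k highest-scoring experts and softmax-normalize their
    weights -- the routing step of a Mixture-of-Experts layer.
    (Illustrative sketch, not MiMo-V2-Flash's real router.)"""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    exp_scores = [math.exp(logits[i]) for i in top]
    total = sum(exp_scores)
    return {i: e / total for i, e in zip(top, exp_scores)}

# One token's hypothetical router scores over 8 experts: only 2 experts
# execute for this token, which is how an MoE model with 309B total
# parameters can activate only a small fraction of them per token.
weights = top_k_gate([0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3], k=2)
```

The selected experts' outputs would then be combined using these normalized weights; all other experts are skipped entirely for that token.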


Popular Sampler Settings

The three most popular parameter combinations used by Featherless users for this model cover the following sampler settings:

- temperature
- top_p
- top_k
- frequency_penalty
- presence_penalty
- repetition_penalty
- min_p
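As a sketch of how these settings map onto a request, the snippet below builds a chat-completion payload assuming an OpenAI-compatible endpoint. The numeric values are illustrative placeholders, not the published popular configs, and the field names assume the server accepts the extended sampler parameters (`top_k`, `repetition_penalty`, `min_p`) alongside the standard OpenAI ones.

```python
import json

# Hypothetical sampler values -- illustrative only, not Featherless defaults.
payload = {
    "model": "XiaomiMiMo/MiMo-V2-Flash",
    "messages": [{"role": "user", "content": "Explain MoE inference briefly."}],
    "temperature": 0.7,         # randomness of token sampling
    "top_p": 0.95,              # nucleus sampling cutoff
    "top_k": 40,                # restrict sampling to the 40 likeliest tokens
    "frequency_penalty": 0.0,   # penalize tokens by repeat count
    "presence_penalty": 0.0,    # penalize tokens that appeared at all
    "repetition_penalty": 1.1,  # multiplicative penalty on repeats
    "min_p": 0.05,              # drop tokens below 5% of the top token's probability
}

# Serialized body that would be POSTed to the chat completions endpoint.
body = json.dumps(payload)
```

Sending the request itself (URL, auth header) is omitted; only the parameter mapping is the point here.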