google/gemma-4-26B-A4B
Gemma-4-26B-A4B is a 25.2 billion total parameter multimodal Mixture-of-Experts (MoE) model developed by Google DeepMind, part of the Gemma 4 family. It features 3.8 billion active parameters for fast inference and a 256K token context window. This model excels at reasoning, coding, and multimodal understanding, processing text, images, and video inputs to generate text outputs.
Gemma 4 26B A4B: Multimodal MoE for Reasoning and Coding
Google DeepMind's Gemma 4 26B A4B is a 25.2 billion total parameter Mixture-of-Experts (MoE) model, designed for efficient and powerful multimodal AI. Only 3.8 billion parameters are active per token during inference, allowing it to run almost as fast as a 4B-parameter model while delivering performance comparable to larger models. The model supports a substantial 256K token context window and is proficient in over 140 languages.
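The efficiency claim above can be sketched as simple arithmetic: per-token compute scales roughly with the number of active parameters, not the total. A minimal illustration using the figures from this card (the linear-scaling assumption is a simplification):

```python
# Rough per-token compute comparison for the MoE model vs. a
# hypothetical dense model of the same total size.
total_params = 25.2e9   # total parameters (figure from the card)
active_params = 3.8e9   # parameters active per token (figure from the card)

# Fraction of the network exercised per token.
active_fraction = active_params / total_params
print(f"Active fraction: {active_fraction:.1%}")

# Assuming per-token FLOPs scale linearly with active parameters,
# the compute reduction vs. a dense 25.2B model is the inverse ratio.
compute_reduction = total_params / active_params
print(f"~{compute_reduction:.1f}x less per-token compute than a dense 25.2B model")
```

This is why the card compares its speed to a 4B dense model while its quality is compared to much larger ones.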
Key Capabilities
- Multimodal Understanding: Processes text, images (with variable aspect ratio and resolution), and video inputs, generating text outputs. It supports interleaved multimodal input.
- Reasoning: Features configurable thinking modes for step-by-step problem-solving.
- Coding & Agentic Workflows: Achieves significant improvements in coding benchmarks and includes native function-calling support for autonomous agents.
- Native System Prompt Support: Enables structured, controllable conversations.
- Long Context: Handles contexts up to 256K tokens.
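Several of the capabilities above (system prompts, interleaved multimodal input, function calling) come together in a single request. The sketch below builds such a request payload in the OpenAI-compatible chat-completions format that many serving stacks use; whether a given host exposes Gemma 4 this way is an assumption, and the `get_weather` tool is a hypothetical example, not a built-in:

```python
import json

# Sketch of a chat-completions request: a system prompt, interleaved
# text + image user input, and a declared function-calling tool.
# Field names follow the OpenAI-compatible convention (an assumption
# about the serving stack); the model ID comes from this card.
payload = {
    "model": "google/gemma-4-26B-A4B",
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {
            "role": "user",
            "content": [  # interleaved multimodal input
                {"type": "text",
                 "text": "What city is shown here, and what's the weather?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/skyline.jpg"}},
            ],
        },
    ],
    "tools": [  # native function calling: declare a callable tool
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool for illustration
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

print(json.dumps(payload, indent=2))
```

POSTed to a chat-completions endpoint, this payload would yield either a text answer or a `tool_calls` entry naming `get_weather` with arguments for the client to execute and feed back.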
Benchmark Highlights
- MMLU Pro: 82.6%
- AIME 2026 (no tools): 88.3%
- LiveCodeBench v6: 77.1%
- MMMU Pro (Vision): 73.8%
Good For
- Reasoning and complex problem-solving.
- Code generation, completion, and correction.
- Agentic workflows requiring structured tool use.
- Applications requiring multimodal input (text, image, video) for text generation.
- Deployments where fast inference is crucial, balancing performance with efficiency.