unsloth/gemma-4-26B-A4B-it
The unsloth/gemma-4-26B-A4B-it is a 25.2 billion parameter instruction-tuned multimodal Mixture-of-Experts (MoE) model from Google DeepMind, part of the Gemma 4 family, with a 256K token context window. It is optimized for efficient inference by activating only 3.8 billion parameters, making it suitable for tasks requiring strong reasoning, coding, and multimodal understanding across text and image inputs. This model excels in agentic workflows and offers native function-calling support.
Loading preview...
Overview
This model, unsloth/gemma-4-26B-A4B-it, is an instruction-tuned variant of Google DeepMind's Gemma 4 family. It is a 25.2 billion parameter Mixture-of-Experts (MoE) model, but uniquely, it activates only 3.8 billion parameters during inference. This "active parameters" approach allows it to run significantly faster, almost like a 4B model, while retaining the capabilities of a larger model. It features a substantial 256K token context window and supports multimodal inputs, specifically text and images.
Key Capabilities
- Efficient Inference: Achieves fast inference speeds by activating only a subset of its total parameters.
- Multimodality: Processes both text and image inputs, with variable aspect ratio and resolution support.
- Reasoning: Designed with strong reasoning capabilities and configurable thinking modes.
- Extended Context: Supports a 256K token context window for complex, long-context tasks.
- Enhanced Coding & Agentic Capabilities: Shows improved performance in coding benchmarks and includes native function-calling support for autonomous agents.
- Native System Prompt Support: Offers structured and controllable conversations through native
systemrole support.
Good For
- Applications requiring fast inference without sacrificing performance on complex tasks.
- Multimodal applications that combine text and image understanding, such as document parsing, UI understanding, and OCR.
- Agentic workflows and tools that benefit from structured function calling.
- Coding tasks including generation, completion, and correction.
- Use cases demanding long context understanding and advanced reasoning.