google/gemma-4-26B-A4B-it
The google/gemma-4-26B-A4B-it is a 25.2 billion parameter multimodal Mixture-of-Experts (MoE) model developed by Google DeepMind, part of the Gemma 4 family. It processes text and image inputs, generating text outputs, and features a 256K token context window. Optimized for reasoning, coding, and agentic capabilities, this instruction-tuned model is designed for efficient deployment on consumer GPUs and workstations.
Loading preview...
Gemma 4: Multimodal MoE for Advanced Reasoning and Coding
Google DeepMind's Gemma 4 family introduces multimodal models capable of processing text and image inputs (with audio on smaller variants) to generate text outputs. The google/gemma-4-26B-A4B-it is a 25.2 billion parameter instruction-tuned Mixture-of-Experts (MoE) model, featuring 3.8 billion active parameters for fast inference, making it suitable for consumer GPUs and workstations.
Key Capabilities & Advancements
- Multimodality: Handles Text, Image (variable aspect ratio/resolution), and Video inputs, with native audio support on E2B/E4B models.
- Reasoning: Designed with configurable thinking modes for highly capable reasoning.
- Extended Context Window: Supports a 256K token context window.
- Enhanced Coding & Agentic Capabilities: Achieves significant improvements in coding benchmarks and includes native function-calling support.
- Native System Prompt Support: Enables more structured and controllable conversations.
- Multilingual: Pre-trained on over 140 languages, with out-of-the-box support for 35+ languages.
Performance Highlights
The 26B A4B MoE model demonstrates strong performance across various benchmarks, including:
- MMLU Pro: 82.6%
- AIME 2026 no tools: 88.3%
- LiveCodeBench v6: 77.1%
- GPQA Diamond: 82.3%
- MMMU Pro (Vision): 73.8%
Ideal Use Cases
This model is well-suited for:
- Content Creation: Generating creative text formats, marketing copy, and email drafts.
- Conversational AI: Powering chatbots, virtual assistants, and interactive applications.
- Coding: Code generation, completion, and correction.
- Multimodal Understanding: Object detection, document/PDF parsing, UI understanding, chart comprehension, and OCR.
- Agentic Workflows: Leveraging native function-calling for structured tool use.
Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.