Name: unsloth/gemma-4-26B-A4B API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: unsloth

Overview

unsloth/gemma-4-26B-A4B is a 25.2 billion parameter multimodal Mixture-of-Experts (MoE) model from the Gemma 4 family, developed by Google DeepMind. It is designed for efficient deployment, activating only 3.8 billion parameters during inference, making it significantly faster than its total parameter count suggests. This model supports text and image inputs, with a substantial 256K token context window, and is built for frontier-level performance in its size class.

Key Capabilities

Multimodal Understanding: Processes text and image inputs, with variable aspect ratio and resolution support. It can analyze video by processing sequences of frames.
Reasoning: Features configurable thinking modes for step-by-step problem-solving.
Coding & Agentic Capabilities: Achieves notable improvements in coding benchmarks and includes native function-calling support for autonomous agents.
Long Context: Supports a 256K token context window, utilizing a hybrid attention mechanism for efficient processing of long sequences.
Efficient Architecture: As an MoE model, it offers fast inference speeds comparable to a 4B parameter model while leveraging a larger total parameter count for performance.

Good For

Reasoning-intensive tasks: Its design emphasizes strong reasoning capabilities.
Coding and agentic workflows: Enhanced coding benchmarks and function-calling support make it suitable for development and automation.
Multimodal applications: Ideal for tasks requiring both text and image understanding, such as document parsing, visual question answering, and video analysis.
Deployment on consumer GPUs and workstations: Optimized for scalable deployment in environments beyond mobile devices.

Overview

Overview

Key Capabilities

Good For

Full Model Card (README)