Name: unsloth/gemma-4-26B-A4B-it API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: unsloth

Overview

This model, unsloth/gemma-4-26B-A4B-it, is an instruction-tuned variant of Google DeepMind's Gemma 4 family. It is a 25.2 billion parameter Mixture-of-Experts (MoE) model, but uniquely, it activates only 3.8 billion parameters during inference. This "active parameters" approach allows it to run significantly faster, almost like a 4B model, while retaining the capabilities of a larger model. It features a substantial 256K token context window and supports multimodal inputs, specifically text and images.

Key Capabilities

Efficient Inference: Achieves fast inference speeds by activating only a subset of its total parameters.
Multimodality: Processes both text and image inputs, with variable aspect ratio and resolution support.
Reasoning: Designed with strong reasoning capabilities and configurable thinking modes.
Extended Context: Supports a 256K token context window for complex, long-context tasks.
Enhanced Coding & Agentic Capabilities: Shows improved performance in coding benchmarks and includes native function-calling support for autonomous agents.
Native System Prompt Support: Offers structured and controllable conversations through native system role support.

Good For

Applications requiring fast inference without sacrificing performance on complex tasks.
Multimodal applications that combine text and image understanding, such as document parsing, UI understanding, and OCR.
Agentic workflows and tools that benefit from structured function calling.
Coding tasks including generation, completion, and correction.
Use cases demanding long context understanding and advanced reasoning.

Overview

Overview

Key Capabilities

Good For

Full Model Card (README)