google/gemma-4-26B-A4B

Text Generation · Concurrency Cost: 2 · Model Size: 26B · Quant: FP8 · Ctx Length: 32K · Published: Mar 12, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

Gemma-4-26B-A4B is a multimodal Mixture-of-Experts (MoE) model from Google DeepMind, part of the Gemma 4 family, with 25.2 billion total parameters. It uses 3.8 billion active parameters for fast inference and supports a 256K-token context window. The model excels at reasoning, coding, and multimodal understanding, taking text, image, and video inputs and generating text outputs.


Gemma 4 26B A4B: Multimodal MoE for Reasoning and Coding

Google DeepMind's Gemma 4 26B A4B is a Mixture-of-Experts (MoE) model with 25.2 billion total parameters, designed for efficient, powerful multimodal AI. Only 3.8 billion parameters are active during inference, so it runs almost as fast as a 4B-parameter model while delivering performance comparable to much larger models. The model supports a substantial 256K-token context window and is proficient in over 140 languages.
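Since the Quant and Ctx Length values above describe a hosted deployment, the usual way to use the model is through an inference API. Below is a minimal sketch assuming an OpenAI-compatible chat-completions endpoint; the base_url, API key, and prompt are illustrative placeholders, not values from this card.

```python
# Minimal sketch: calling google/gemma-4-26B-A4B through an
# OpenAI-compatible chat-completions endpoint. Assumptions: your
# provider exposes such an endpoint; base_url and api_key are
# placeholders to replace with your provider's values.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-provider.com/v1",  # hypothetical endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="google/gemma-4-26B-A4B",
    messages=[
        # Gemma 4 has native system prompt support, so a system role is usable directly.
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize the trade-offs of MoE models in two sentences."},
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
```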

Key Capabilities

  • Multimodal Understanding: Processes text, images (with variable aspect ratio and resolution), and video inputs, generating text outputs. It supports interleaved multimodal input (see the image-input sketch after this list).
  • Reasoning: Features configurable thinking modes for step-by-step problem-solving.
  • Coding & Agentic Workflows: Achieves significant improvements in coding benchmarks and includes native function-calling support for autonomous agents.
  • Native System Prompt Support: Enhances structured and controllable conversations.
  • Long Context: Handles contexts up to 256K tokens.
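To make the interleaved multimodal input concrete, here is a minimal sketch using the OpenAI-compatible content-parts format. Whether a given provider exposes this model's image input this way is an assumption; the endpoint, key, and image URL are placeholders.

```python
# Minimal sketch: interleaved text + image input using the
# OpenAI-compatible content-parts format. Assumption: the serving
# provider accepts image_url parts for this model. The base_url,
# api_key, and image URL are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-provider.com/v1",  # hypothetical endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="google/gemma-4-26B-A4B",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What does this diagram show?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/diagram.png"},  # placeholder image
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```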

Benchmark Highlights

  • MMLU Pro: 82.6%
  • AIME 2026 (no tools): 88.3%
  • LiveCodeBench v6: 77.1%
  • MMMU Pro (Vision): 73.8%

Good For

  • Reasoning and complex problem-solving.
  • Code generation, completion, and correction.
  • Agentic workflows requiring structured tool use (see the function-calling sketch after this list).
  • Applications requiring multimodal input (text, image, video) for text generation.
  • Deployments where fast inference is crucial, balancing performance with efficiency.
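As a sketch of the structured tool use mentioned above, the following uses the OpenAI-compatible tools parameter. The get_weather tool, endpoint, and key are hypothetical, and the exact function-calling schema supported for this model depends on the provider.

```python
# Minimal sketch: native function calling via the OpenAI-compatible
# "tools" parameter. Assumptions: the provider maps this onto Gemma 4's
# function-calling support; get_weather is a hypothetical tool defined
# here purely for illustration.
import json
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-provider.com/v1",  # hypothetical endpoint
    api_key="YOUR_API_KEY",
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="google/gemma-4-26B-A4B",
    messages=[{"role": "user", "content": "What's the weather in Zurich right now?"}],
    tools=tools,
)

# If the model decided to call the tool, the arguments arrive as JSON text
# rather than free-form prose.
message = response.choices[0].message
if message.tool_calls:
    call = message.tool_calls[0]
    print(call.function.name, json.loads(call.function.arguments))
else:
    print(message.content)
```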