Name: unsloth/gemma-4-26B-A4B-it-qat-q4_0-unquantized API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: unsloth

Model Overview

This model is part of the Gemma 4 family, developed by Google DeepMind, featuring a 26 billion parameter Mixture-of-Experts (MoE) architecture. It is optimized with Quantization-Aware Training (QAT) to reduce memory requirements while preserving quality, making it suitable for efficient deployment.

Key Capabilities

Multimodal Understanding: Processes text and image inputs, with variable aspect ratio and resolution support. Video understanding is also supported by processing frame sequences.
Reasoning: Designed with configurable thinking modes for step-by-step problem-solving.
Extended Context Window: Features a 256K token context window for handling long and complex tasks.
Efficient Architecture: The MoE design activates only 3.8 billion parameters during inference, allowing for faster execution compared to its total parameter count.
Enhanced Coding & Agentic Capabilities: Shows improvements in coding benchmarks and includes native function-calling support for autonomous agents.
Multilingual Support: Pre-trained on over 140 languages with out-of-the-box support for 35+ languages.

Good For

Applications requiring efficient multimodal processing (text and image).
Reasoning-intensive tasks and agentic workflows.
Code generation, completion, and correction.
Deployment on consumer GPUs and workstations where memory efficiency is crucial.

Overview

Model Overview

Key Capabilities

Good For

Full Model Card (README)