google/gemma-4-E2B-it

Text Generation · Concurrency Cost: 1 · Model Size: 2B · Quant: BF16 · Ctx Length: 32k · Published: Mar 2, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights · Warm

Gemma 4 E2B-it is an instruction-tuned multimodal language model from Google DeepMind with 2.3 billion effective parameters. Part of the Gemma 4 family, it accepts text, image, and audio inputs and generates text outputs. Optimized for on-device deployment, it features a 128K context window and strong reasoning, coding, and agentic capabilities.


Model Overview

google/gemma-4-E2B-it is an instruction-tuned variant from the Gemma 4 family of open multimodal models by Google DeepMind. This model is designed for efficient local execution on devices like high-end phones and laptops, featuring 2.3 billion effective parameters and a 128K token context window. It accepts text, image, and audio inputs, generates text outputs, and incorporates Per-Layer Embeddings (PLE) for parameter efficiency.

Key Capabilities

  • Multimodality: Processes text, images (with variable aspect ratio and resolution), and audio inputs (native to E2B/E4B models).
  • Reasoning: Includes a built-in reasoning mode for step-by-step thinking.
  • Long Context: Supports a 128K token context window.
  • Coding & Agentic: Enhanced coding benchmarks and native function-calling support for autonomous agents.
  • Multilingual: Pre-trained on over 140 languages with out-of-the-box support for 35+ languages.
  • On-Device Optimization: Smaller models (E2B, E4B) are specifically optimized for efficient local execution.

When to Use This Model

  • On-device applications: Ideal for deployment on mobile devices and laptops due to its optimized size and efficiency.
  • Multimodal tasks: Excellent for applications requiring understanding and generation based on combined text, image, and audio inputs.
  • Reasoning and coding: Suitable for tasks that benefit from structured reasoning and code generation/completion.
  • Agentic workflows: Supports native function calling, making it a strong candidate for building autonomous agents.
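Since the card highlights native function calling for agentic workflows, here is a minimal sketch of declaring a tool and dispatching a model-issued tool call, using the JSON-Schema style tool definition common to chat APIs. The schema shape, the helper names, and any wiring to this specific model are assumptions, not details taken from this card.

```python
# Hypothetical sketch: a JSON-Schema style tool definition plus a tiny
# dispatcher for tool calls. Whether this exact shape matches Gemma 4's
# chat template is an assumption.
import json

def make_tool(name: str, description: str, parameters: dict) -> dict:
    """Wrap a function signature as a JSON-Schema tool definition."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": parameters,
                "required": list(parameters),
            },
        },
    }

get_weather = make_tool(
    "get_weather",
    "Look up current weather for a city.",
    {"city": {"type": "string", "description": "City name"}},
)

# A model's tool call typically arrives as a name plus JSON-encoded
# arguments; the agent parses the arguments and calls the real function.
def dispatch(tool_call: dict, registry: dict) -> str:
    args = json.loads(tool_call["arguments"])
    return registry[tool_call["name"]](**args)

result = dispatch(
    {"name": "get_weather", "arguments": '{"city": "Oslo"}'},
    {"get_weather": lambda city: f"Sunny in {city}"},
)
```

In a full agent loop, the tool result would be appended to the conversation as a tool message and fed back to the model for the next turn.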