google/gemma-4-31B-it
Gemma 4 31B-it is a 30.7 billion parameter instruction-tuned multimodal language model developed by Google DeepMind. This model handles text and image input, generating text output, and features a 256K token context window. Optimized for reasoning, coding, and agentic workflows, it offers strong performance across various benchmarks including MMLU Pro and LiveCodeBench.
Loading preview...
Gemma 4: Multimodal AI from Google DeepMind
Google DeepMind's Gemma 4 family introduces a suite of open multimodal models, including the 31B-it variant, designed for advanced reasoning, coding, and agentic capabilities. These models process text and image inputs (with audio support on smaller variants) and generate text outputs, featuring an impressive context window of up to 256K tokens and multilingual support across 140+ languages.
Key Capabilities & Advancements
- Multimodality: Processes text, images (with variable aspect ratio and resolution), and video. Smaller E2B and E4B models also natively support audio.
- Reasoning: Designed as highly capable reasoners with configurable thinking modes, allowing step-by-step processing.
- Extended Context: Supports long contexts up to 256K tokens, utilizing a hybrid attention mechanism for efficiency.
- Enhanced Coding & Agentic Features: Achieves significant improvements in coding benchmarks and includes native function-calling support for autonomous agents.
- Native System Prompt Support: Integrates a
systemrole for more structured and controllable conversations.
Performance Highlights
Gemma 4 models demonstrate frontier-level performance across various benchmarks. The 31B model achieves 85.2% on MMLU Pro, 89.2% on AIME 2026 (no tools), and 80.0% on LiveCodeBench v6, showcasing strong capabilities in general reasoning and code generation. Multimodal benchmarks like MMMU Pro also show robust performance at 76.9% for the 31B variant.
Optimized Architectures
The family includes both Dense and Mixture-of-Experts (MoE) architectures. The 31B-it model is a dense variant, while the 26B A4B MoE model offers efficient inference by activating only a 3.8B parameter subset. This diversity allows deployment across a range of environments, from mobile devices to high-end servers.
Intended Usage
- Content Creation: Text generation, chatbots, text summarization, image data extraction.
- Research & Education: NLP and VLM research, language learning tools, knowledge exploration.
- Agentic Workflows: Leveraging function calling for structured tool use.
Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.