google/gemma-4-31B
Gemma 4-31B is a 30.7 billion parameter multimodal language model developed by Google DeepMind, part of the Gemma 4 family. It processes text and image inputs, generating text outputs, and features a 256K token context window. This model is optimized for reasoning, coding, and agentic workflows, offering strong performance in complex tasks.
Loading preview...
Overview
Google DeepMind's Gemma 4 models are a family of open, multimodal models designed for text and image input, with text output. The Gemma 4-31B is a 30.7 billion parameter dense model, while the Gemma 4-26B A4B is a 25.2 billion parameter Mixture-of-Experts (MoE) model with 3.8 billion active parameters, offering faster inference. Both models support a 256K token context window and are multilingual, supporting over 140 languages.
Key Capabilities
- Multimodality: Handles text and image inputs, with variable aspect ratio and resolution support. Smaller E2B/E4B models also support audio and video.
- Reasoning: Designed with configurable thinking modes for step-by-step problem-solving.
- Coding & Agentic Workflows: Enhanced performance in coding benchmarks and native function-calling support for autonomous agents.
- Long Context: Supports up to 256K tokens, utilizing a hybrid attention mechanism for efficiency.
- Native System Prompt Support: Enables more structured and controllable conversations.
Good For
- Complex Reasoning Tasks: Excels in benchmarks like MMLU Pro (85.2%) and AIME 2026 (89.2%).
- Code Generation: Achieves 80.0% on LiveCodeBench v6 and a Codeforces ELO of 2150.
- Multimodal Understanding: Strong performance in MMMLU (88.4%) and MMMU Pro (76.9%) for vision-language tasks.
- Content Creation: Generating creative text, marketing copy, and powering chatbots.
- Research & Development: Serving as a foundation for VLM and NLP research.
Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.