ApocalypseParty/G4-31B-SFT-v3-1-1ep
ApocalypseParty/G4-31B-SFT-v3-1-1ep is a 31 billion parameter instruction-tuned multimodal model from the Google DeepMind Gemma 4 family. It features a 256K token context window and excels at reasoning, coding, and multimodal understanding, processing text and image inputs to generate text outputs. This model is designed for scalable deployment on consumer GPUs and workstations, offering enhanced agentic capabilities and native system prompt support.
Loading preview...
Overview
ApocalypseParty/G4-31B-SFT-v3-1-1ep is a 31 billion parameter instruction-tuned model from the Gemma 4 family, developed by Google DeepMind. This multimodal model handles text and image inputs, generating text outputs, and supports a substantial 256K token context window. It is part of a family that includes both Dense and Mixture-of-Experts (MoE) architectures, with this specific model being a Dense variant.
Key Capabilities
- Multimodal Understanding: Processes text and image inputs, with variable aspect ratio and resolution support. Video understanding is also supported by processing sequences of frames.
- Reasoning: Designed with configurable thinking modes, allowing the model to reason step-by-step.
- Extended Context: Features a 256K token context window, enabling deep awareness for complex, long-context tasks.
- Enhanced Coding & Agentic Capabilities: Shows significant improvements in coding benchmarks and includes native function-calling support for autonomous agents.
- Native System Prompt Support: Integrates native support for the
systemrole, facilitating more structured and controllable conversations. - Multilingual: Pre-trained on over 140 languages with out-of-the-box support for 35+ languages.
Performance Highlights
The 31B model demonstrates strong performance across various benchmarks:
- MMLU Pro: 85.2%
- AIME 2026 no tools: 89.2%
- LiveCodeBench v6: 80.0%
- GPQA Diamond: 84.3%
- MMMU Pro (Vision): 76.9%
Good For
- Complex Reasoning Tasks: Its built-in reasoning mode and strong benchmark performance make it suitable for tasks requiring logical deduction.
- Advanced Coding: Ideal for code generation, completion, and correction, especially with its improved coding benchmarks.
- Multimodal Applications: Excellent for applications requiring image understanding, such as object detection, document parsing, and chart comprehension, combined with text generation.
- Agentic Workflows: Native function-calling support makes it well-suited for developing highly capable autonomous agents.
- Long-Context Applications: The 256K token context window is beneficial for tasks requiring extensive contextual understanding, such as summarizing long documents or handling multi-turn conversations.