ApocalypseParty/G4-31B-SFT-v6-1
ApocalypseParty/G4-31B-SFT-v6-1 is a 31 billion parameter instruction-tuned multimodal model from the Gemma 4 family developed by Google DeepMind. This model handles text and image inputs, generating text outputs, and features a 256K token context window. It is optimized for reasoning, coding, and agentic workflows, offering strong performance across various benchmarks including MMLU Pro and LiveCodeBench.
Loading preview...
Overview
ApocalypseParty/G4-31B-SFT-v6-1 is a 31 billion parameter instruction-tuned model from Google DeepMind's Gemma 4 family. It is a multimodal model capable of processing text and image inputs to generate text outputs, featuring a substantial 256K token context window. The model employs a hybrid attention mechanism combining local sliding window attention with global attention for efficient long-context processing.
Key Capabilities
- Multimodality: Processes text and image inputs, with native support for interleaved multimodal prompts. Smaller Gemma 4 models (E2B, E4B) also support audio and video.
- Reasoning: Designed with highly capable reasoning abilities, including configurable thinking modes.
- Coding & Agentic Workflows: Shows significant improvements in coding benchmarks and supports native function calling for autonomous agents.
- Long Context: Features a 256K token context window, enabling complex, long-context tasks.
- Multilingual Support: Pre-trained on over 140 languages with out-of-the-box support for 35+ languages.
- Native System Prompt Support: Introduces native support for the
systemrole for more structured conversations.
Benchmark Highlights
- Achieves 85.2% on MMLU Pro and 89.2% on AIME 2026 no tools.
- Scores 80.0% on LiveCodeBench v6 and a Codeforces ELO of 2150.
- Demonstrates strong vision capabilities with 76.9% on MMMU Pro and 85.6% on MATH-Vision.
Good for
- Content Creation: Generating creative text, code, and marketing copy.
- Conversational AI: Powering chatbots, virtual assistants, and interactive applications.
- Research & Education: Serving as a foundation for VLM and NLP research, and language learning tools.
- Image Understanding: Object detection, document parsing, OCR, and general visual data extraction.
- Agentic Workflows: Utilizing function calling for structured tool use.