Pranavz/gemma-4-26B-A4B-it-arli-v2

Vision | Concurrency Cost: 2 | Model Size: 26B | Quant: FP8 | Ctx Length: 32k | Tool Calling: Supported | Published: May 19, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

Pranavz/gemma-4-26B-A4B-it-arli-v2 is a 26 billion parameter instruction-tuned multimodal language model derived from Google DeepMind's Gemma 4 26B A4B-it. This version has been decensored using Arli-style norm-preserving biprojected abliteration, which significantly reduces refusals compared to the original. The model supports a 256K token context window (this listing specifies a 32k context length) and targets reasoning, coding, and multimodal understanding across text, image, and video inputs.


Overview

Pranavz/gemma-4-26B-A4B-it-arli-v2 is a 26 billion parameter instruction-tuned variant of Google DeepMind's Gemma 4 26B A4B-it. It has been modified using Arli-style norm-preserving biprojected abliteration to produce a decensored version. Its key differentiator is a much lower refusal count: 3 of 100 prompts refused, versus 100 of 100 for the original model, at a reported KL divergence of 0.2535.
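The two figures above can be read as simple quantities: a refusal rate (refused prompts divided by prompts tested) and a KL divergence between the next-token distributions of the original and modified models. The listing does not say which prompts, log base, or direction of the divergence were used, so the sketch below is only an illustration under assumptions: natural logarithms, KL(original || modified), and made-up toy distributions. The function names are hypothetical.

```python
import math


def refusal_rate(refused: int, total: int) -> float:
    """Fraction of test prompts that the model refused."""
    if total <= 0:
        raise ValueError("total must be positive")
    return refused / total


def kl_divergence(p: list[float], q: list[float]) -> float:
    """KL(P || Q) in nats for two discrete distributions over the same tokens."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi <= 0.0:
                # Q assigns zero mass where P does not: divergence is infinite.
                return math.inf
            total += pi * math.log(pi / qi)
    return total


# Refusal counts reported in the listing.
modified_rate = refusal_rate(3, 100)
original_rate = refusal_rate(100, 100)

# Toy next-token distributions (assumed values, not measured data).
original_probs = [0.70, 0.20, 0.10]
modified_probs = [0.60, 0.25, 0.15]
toy_kl = kl_divergence(original_probs, modified_probs)

print(f"refusal rate: {original_rate:.2f} -> {modified_rate:.2f}")
print(f"toy KL divergence: {toy_kl:.4f} nats")
```

A divergence of zero would mean the modified model's output distribution is identical to the original's; smaller values indicate that the modification changed general behavior less.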

Key Capabilities

  • Multimodal Understanding: Processes text, image, and video inputs. Smaller variants in the base Gemma 4 family also support audio.
  • Extended Context Window: Features a 256K token context length, enabling processing of long and complex inputs.
  • Reasoning: Designed with configurable thinking modes to enhance reasoning capabilities.
  • Coding & Agentic Capabilities: Shows improvements in coding benchmarks and supports native function-calling for autonomous agents.
  • Multilingual Support: Pre-trained on over 140 languages with out-of-the-box support for 35+ languages.
  • Efficient Architecture: Uses a Mixture-of-Experts (MoE) architecture with 25.2B total parameters, of which only 3.8B are active per token, giving inference speed comparable to a 4B model.
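As a rough illustration of the Efficient Architecture point, the sketch below works out the share of parameters active per token and a back-of-the-envelope weight footprint at the listed FP8 quantization. It assumes one byte per parameter for FP8 and ignores the KV cache, activations, and runtime overhead, so the memory figure is a lower bound rather than a sizing recommendation. The helper names are hypothetical.

```python
TOTAL_PARAMS_B = 25.2   # total parameters, in billions (from the listing)
ACTIVE_PARAMS_B = 3.8   # parameters active per token, in billions (from the listing)
FP8_BYTES_PER_PARAM = 1  # assumption: FP8 stores one byte per weight


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of the model's parameters used for each token."""
    return active_b / total_b


def weight_footprint_gb(total_b: float, bytes_per_param: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    return total_b * 1e9 * bytes_per_param / 1e9


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
footprint = weight_footprint_gb(TOTAL_PARAMS_B, FP8_BYTES_PER_PARAM)

print(f"active per token: {fraction:.1%} of parameters")
print(f"approximate FP8 weight footprint: {footprint:.1f} GB")
```

The two numbers pull in different directions: per-token compute scales with the 3.8B active parameters, while memory scales with all 25.2B, since every expert has to be resident for the router to select from.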

Good For

  • Applications requiring a less restrictive, decensored large language model.
  • Tasks involving complex reasoning, code generation, and agentic workflows.
  • Multimodal applications that integrate text, image, and video inputs.
  • Scenarios where a balance between high performance and efficient inference is crucial, thanks to its MoE design.
  • Content creation, chatbots, text summarization, and research on NLP and vision-language models (VLMs).