coder3101/gemma-4-31B-it-qat-q4_0-unquantized-heretic

Vision
Concurrency Cost: 2
Model Size: 31B
Quant: FP8
Ctx Length: 32k
Tool Calling: Supported
Published: Jun 6, 2026
License: apache-2.0
Architecture: Transformer
Open Weights
Cold

coder3101/gemma-4-31B-it-qat-q4_0-unquantized-heretic is a 31-billion-parameter, instruction-tuned multimodal language model: a decensored version of Google DeepMind's Gemma 4. The base model uses Quantization-Aware Training (QAT) for efficiency, and this variant has been modified with the Heretic tool using Arbitrary-Rank Ablation (ARA) to reduce refusals. It is aimed at reasoning, coding, and multimodal understanding, and supports text and image inputs with a 256K token context length.

Model Overview

This model, coder3101/gemma-4-31B-it-qat-q4_0-unquantized-heretic, is a 31-billion-parameter, instruction-tuned variant of Google DeepMind's Gemma 4 family. It has been modified using the Heretic tool with the Arbitrary-Rank Ablation (ARA) method to reduce refusals: it shows 11/100 refusals, compared with 99/100 for the original model. The base Gemma 4 models are multimodal, handling text and image input, and are optimized with Quantization-Aware Training (QAT) to reduce memory requirements while preserving quality.
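To put the two refusal counts side by side, here is a minimal sketch of the arithmetic. It assumes both counts were measured on the same set of 100 prompts, which the card does not state explicitly; the function and variable names are illustrative only.

```python
def refusal_rate(refusals: int, prompts: int) -> float:
    """Return the fraction of prompts that were refused."""
    if prompts <= 0:
        raise ValueError("prompts must be positive")
    return refusals / prompts


# Counts quoted in the overview; a shared 100-prompt set is an assumption.
original_rate = refusal_rate(99, 100)
heretic_rate = refusal_rate(11, 100)

# Absolute drop (in rate) and drop relative to the original model's rate.
absolute_drop = original_rate - heretic_rate
relative_drop = absolute_drop / original_rate

print(f"original: {original_rate:.0%}, heretic: {heretic_rate:.0%}")
print(f"absolute drop: {absolute_drop:.0%}, relative drop: {relative_drop:.1%}")
```

Under that assumption, the modification removes 88 of the original model's 99 refusals.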

Key Capabilities

  • Decensored Behavior: Modified to exhibit fewer refusals compared to the original Gemma 4 model.
  • Multimodality: Processes text and image inputs, with the base Gemma 4 family also supporting audio on smaller variants.
  • Extended Context Window: Features a 256K token context length, enabling complex, long-context tasks.
  • Reasoning & Coding: Designed with strong reasoning capabilities and enhanced coding performance, including native function-calling support.
  • Efficient Architecture: Leverages QAT to reduce memory requirements while preserving quality, which helps when the weights are served at lower precision.
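As a rough illustration of why precision matters for deployment, the sketch below estimates weight-only memory at a few precisions. It assumes exactly 31 billion parameters (the card gives only "31B"), counts weights only (no KV cache, activations, or vision-encoder overhead), and uses 4.5 bits per weight for q4_0, which follows from the GGUF q4_0 block layout of 32 four-bit weights plus one 16-bit scale. Which precision a given deployment actually uses is not something this sketch can tell you.

```python
# Assumed parameter count: the card says "31 billion"; the exact figure may differ.
PARAMS = 31_000_000_000

# Effective bits stored per weight at each precision.
BITS_PER_WEIGHT = {
    "bf16": 16.0,
    "fp8": 8.0,
    "q4_0": 4.5,  # 32 x 4-bit weights + one 16-bit scale per block = 144 bits / 32
}


def weight_gb(params: int, bits_per_weight: float) -> float:
    """Estimate weight storage in gigabytes (1 GB = 1e9 bytes), weights only."""
    return params * bits_per_weight / 8 / 1e9


estimates = {name: weight_gb(PARAMS, bits) for name, bits in BITS_PER_WEIGHT.items()}

for name, gb in estimates.items():
    print(f"{name}: ~{gb:.1f} GB")
```

These figures are lower bounds on real memory use, since long-context inference adds a KV cache that grows with the number of tokens.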

Good For

  • Applications requiring a less restrictive, decensored large language model.
  • Tasks involving complex reasoning, code generation, and agentic workflows.
  • Multimodal applications that integrate text and image understanding.
  • Developers seeking a powerful, instruction-tuned model with a large context window for diverse tasks.