Name: coder3101/gemma-4-31B-it-qat-q4_0-unquantized-heretic API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: coder3101

Model Overview

This model, coder3101/gemma-4-31B-it-qat-q4_0-unquantized-heretic, is a 31 billion parameter instruction-tuned variant of Google DeepMind's Gemma 4 family. It has been specifically modified using the Heretic tool with the Arbitrary-Rank Ablation (ARA) method to significantly reduce refusals, demonstrating 11/100 refusals compared to the original model's 99/100. The base Gemma 4 models are multimodal, handling text and image input, and are optimized with Quantization-Aware Training (QAT) for reduced memory requirements while preserving quality.

Key Capabilities

Decensored Behavior: Modified to exhibit fewer refusals compared to the original Gemma 4 model.
Multimodality: Processes text and image inputs, with the base Gemma 4 family also supporting audio on smaller variants.
Extended Context Window: Features a 256K token context length, enabling complex, long-context tasks.
Reasoning & Coding: Designed with strong reasoning capabilities and enhanced coding performance, including native function-calling support.
Efficient Architecture: Leverages QAT for optimized performance and memory usage, making it suitable for various deployment scenarios.

Good For

Applications requiring a less restrictive, decensored large language model.
Tasks involving complex reasoning, code generation, and agentic workflows.
Multimodal applications that integrate text and image understanding.
Developers seeking a powerful, instruction-tuned model with a large context window for diverse tasks.

Overview

Model Overview

Key Capabilities

Good For

Full Model Card (README)