Name: coder3101/gemma-4-26B-A4B-it-heretic API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: coder3101

Model Overview

This model, coder3101/gemma-4-26B-A4B-it-heretic, is a decensored version of Google DeepMind's Gemma 4 26B A4B instruction-tuned model. It was created using the Heretic v1.2.0 tool with the Arbitrary-Rank Ablation (ARA) method, specifically designed to reduce refusals. While the original model had 100/100 refusals, this 'heretic' variant significantly lowers them to 11/100, with a KL divergence of 0.0499 compared to the original.

Key Capabilities

Multimodal Processing: Handles both text and image inputs, generating text outputs. The base Gemma 4 models also support video processing.
Efficient Architecture: Utilizes a Mixture-of-Experts (MoE) design with 25.2 billion total parameters but only 3.8 billion active parameters, allowing for faster inference comparable to a 4B model.
Extended Context Window: Supports a substantial 256K token context length, enabling complex, long-context tasks.
Enhanced Reasoning & Coding: Designed for strong reasoning capabilities, agentic workflows, and improved performance in coding benchmarks like LiveCodeBench v6 (77.1%) and Codeforces ELO (1718).
Native System Prompt Support: Includes native support for the system role, facilitating more structured and controllable conversations.

Good For

Applications requiring a less restrictive, decensored large language model.
Tasks involving complex reasoning, code generation, and agentic workflows.
Multimodal applications that need to process both text and images.
Scenarios where efficient inference is crucial, leveraging the MoE architecture's active parameter count.

Overview

Model Overview

Key Capabilities

Good For

Full Model Card (README)