grayarea/Magistral-Small-2509-Heretic-v1.2
Vision · Concurrency Cost: 2 · Model Size: 24B · Quant: FP8 · Ctx Length: 32k · Published: Mar 13, 2026 · Architecture: Transformer

grayarea/Magistral-Small-2509-Heretic-v1.2 is a 24 billion parameter language model derived from Magistral-Small-2509, specifically engineered for zero refusals. This model utilizes Heretic v1.2.0 with Magnitude-Preserving Orthogonal Ablation (MPOA) to achieve a low KL divergence of 0.0182 while eliminating content refusals. It maintains comparable perplexity and general benchmark performance to its original counterpart, making it suitable for applications requiring unrestricted content generation.


Model Overview

grayarea/Magistral-Small-2509-Heretic-v1.2 is a 24 billion parameter language model based on Magistral-Small-2509, distinguished by its "decensored" nature. This version was created using Heretic v1.2.0, focusing on achieving zero refusals with a minimal impact on the model's original characteristics, as indicated by a low KL divergence.

Key Characteristics

  • Zero Refusals: The model refuses 0 of 108 test prompts, down from 97/108 refusals in the original model.
  • Low KL Divergence: Achieves a KL divergence of 0.0182 compared to the original model, suggesting a high degree of fidelity to the base model's knowledge and capabilities.
  • Abliteration Parameters: Uses a custom Heretic training dataset and configuration, employing Magnitude-Preserving Orthogonal Ablation (MPOA), full row renormalization, and a winsorization quantile of 0.997.
  • Performance: Maintains perplexity close to the original model: the Q8_0 quantized version scores 4.7682 +/- 0.02613 versus the original's 4.7603 +/- 0.02610. General benchmarks such as HellaSwag, Winogrande, and ARC-Challenge also show comparable results.
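The combination of orthogonal ablation with full row renormalization can be pictured as projecting a learned "refusal direction" out of each weight row, then rescaling the row back to its original norm. This is not the actual Heretic implementation; it is a minimal sketch under the assumption that MPOA amounts to directional projection followed by per-row magnitude restoration, with a made-up toy direction and row.

```python
import math

def ablate_row(row, direction):
    """Project a unit 'refusal direction' out of one weight row, then
    rescale the result back to the row's original L2 norm so the row's
    magnitude is preserved (the 'full row renormalization' step)."""
    orig_norm = math.sqrt(sum(x * x for x in row))
    dot = sum(r * d for r, d in zip(row, direction))
    # Orthogonal projection: remove the component along the direction.
    projected = [r - dot * d for r, d in zip(row, direction)]
    new_norm = math.sqrt(sum(x * x for x in projected))
    if new_norm == 0.0:
        return projected
    # Magnitude preservation: restore the original row norm.
    scale = orig_norm / new_norm
    return [x * scale for x in projected]

# Toy example: a unit direction and one 3-dimensional weight row.
direction = [1.0, 0.0, 0.0]
row = [3.0, 4.0, 0.0]
ablated = ablate_row(row, direction)
# The ablated row is orthogonal to the direction yet keeps its original norm.
```

The intent of preserving row magnitudes is to disturb the rest of the model's behavior as little as possible, which is consistent with the low KL divergence reported above.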

Use Cases

This model is particularly well-suited for applications where:

  • Unrestricted Content Generation: The primary requirement is to generate responses without content-based refusals.
  • Maintaining Original Model Performance: Users need the core capabilities of Magistral-Small-2509 but with the added flexibility of a decensored output.
  • Research into Model Alignment: It can be a valuable tool for studying the effects of alignment techniques and their impact on model behavior and refusal rates.
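For alignment research, the headline fidelity metric is the KL divergence between the original and ablated models' next-token distributions. The real figure (0.0182) is computed over model logits on an evaluation set, which this does not reproduce; the following is only a self-contained sketch of the metric itself on made-up toy distributions.

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) in nats for two discrete probability distributions,
    given as equal-length lists of probabilities."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

# Toy next-token distributions: q is a slightly perturbed copy of p,
# standing in for the original vs. ablated model at one position.
p = [0.70, 0.20, 0.10]
q = [0.68, 0.21, 0.11]
divergence = kl_divergence(p, q)
# A value near zero means the second model closely tracks the first.
```

A divergence of zero means identical distributions; values on the order of 0.02, as reported for this model, indicate the ablated model's outputs remain very close to the base model's.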