grayarea/Mistral-Small-3.2-24B-Instruct-2506-Text-Only-Heretic-v1.2
grayarea/Mistral-Small-3.2-24B-Instruct-2506-Text-Only-Heretic-v1.2 is a 24 billion parameter instruction-tuned text-only model, derived from Mistral-Small-3.2-24B-Instruct-2506. Developed by grayarea, this version has been decensored using Heretic v1.2.0, specifically optimized for zero refusals with a low KL divergence of 0.0138. It is designed for applications requiring unrestricted text generation without vision capabilities.
Loading preview...
Model Overview
This model, grayarea/Mistral-Small-3.2-24B-Instruct-2506-Text-Only-Heretic-v1.2, is a 24 billion parameter instruction-tuned language model based on the Mistral-Small-3.2-24B-Instruct-2506 architecture. It has been specifically modified by grayarea using Heretic v1.2.0 to achieve zero refusals in its responses, distinguishing it from its base model.
Key Differentiators
- Decensored Output: Engineered for zero refusals, providing unrestricted text generation capabilities.
- Low KL Divergence: Maintains a low KL divergence of 0.0138 compared to the original model, indicating minimal deviation in overall distribution while removing refusal behaviors.
- Text-Only: This version explicitly removes the vision functionality present in the original Mistral 3.2 Small Heretic, focusing solely on text-based interactions.
Abliteration Parameters
The model's unique characteristics are a result of specific abliteration parameters, including:
- Custom Heretic training dataset.
- Targeted Heretic configuration.
- Abliteration with MPOA (Magnitude-Preserving Orthogonal Ablation) enabled.
- Full row renormalization and Winsorization Quantile 0.997.
Performance Metrics
| Metric | This Model | Original Model |
|---|---|---|
| KL divergence | 0.0138 | 0 |
| Refusals | 0/108 | 96/108 |
Use Cases
This model is suitable for applications where unrestricted and uncensored text generation is a primary requirement, particularly in scenarios where the base model's refusal rate is prohibitive. Its text-only nature makes it efficient for purely linguistic tasks.