grayarea/Mistral-Small-3.2-24B-Instruct-2506-Heretic-v1.2-2 is a 24-billion-parameter instruction-tuned model derived from Mistral-Small-3.2-24B-Instruct-2506, with a 32768-token context length. It is engineered for zero-refusal responses via Heretic v1.2.0 ablation, achieved with a low KL divergence of 0.0189 from the original model. It is intended for applications that require direct, uncensored responses where content filtering is not desired.
Model Overview
The grayarea/Mistral-Small-3.2-24B-Instruct-2506-Heretic-v1.2-2 is a 24 billion parameter instruction-tuned language model, built upon the Mistral-Small-3.2-24B-Instruct-2506 base. Its primary distinguishing feature is its "decensored" nature, achieved through the application of Heretic v1.2.0 ablation techniques. This process specifically targets the reduction of refusal responses, aiming for zero refusals while maintaining a low KL divergence of 0.0189 compared to the original model.
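The KL divergence figure quantifies how far the ablated model's next-token distribution drifts from the original model's. As an illustration only (this is not Heretic's actual measurement code, and the function name is hypothetical), the KL divergence between two logit vectors over the same vocabulary can be computed like this:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(logits_p, logits_q):
    """KL(P || Q) in nats for two logit vectors over the same vocabulary.

    P is the original model's next-token distribution, Q the ablated
    model's; a value near 0 means the distributions are nearly identical.
    """
    p = softmax(logits_p)
    q = softmax(logits_q)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

In practice a figure like 0.0189 would be an average of such per-token divergences over an evaluation corpus, but the averaging details here are an assumption.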
Key Characteristics
- Zero Refusal Design: Engineered to provide direct answers without content-based refusals, demonstrated by 0/108 refusals compared to 96/108 in the original model.
- Low KL Divergence: Achieves its decensored state with a KL divergence of 0.0189, indicating a close statistical similarity to the original model's output distribution.
- Ablation Parameters: Utilizes a custom Heretic training dataset, a targeted Heretic configuration, and Magnitude-Preserving Orthogonal Ablation (MPOA) with full row renormalization and a Winsorization quantile of 0.997.
- Context Length: Supports a substantial context window of 32768 tokens.
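Heretic's actual MPOA implementation is not reproduced here, but the core idea behind the name — projecting a "refusal direction" out of each weight row, then rescaling the row back to its original magnitude (the "full row renormalization" above) — can be sketched as follows. The helper names `project_out` and `mpoa_row` are hypothetical, chosen for illustration:

```python
import math

def project_out(row, direction):
    """Remove the component of `row` along the unit vector `direction`."""
    dot = sum(r * d for r, d in zip(row, direction))
    return [r - dot * d for r, d in zip(row, direction)]

def mpoa_row(row, direction):
    """Magnitude-preserving ablation of one weight row (sketch).

    Projects the refusal direction out of the row, then rescales the
    result to the row's original L2 norm so weight magnitudes are
    preserved even though the direction is removed.
    """
    orig_norm = math.sqrt(sum(r * r for r in row))
    ablated = project_out(row, direction)
    new_norm = math.sqrt(sum(a * a for a in ablated))
    if new_norm == 0.0:
        return ablated  # row was entirely along the ablated direction
    scale = orig_norm / new_norm
    return [a * scale for a in ablated]
```

The Winsorization quantile (0.997) would presumably clip extreme outlier values during the procedure; that step is not shown here.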
Performance Considerations
While optimized for zero refusals, the Heretic version shows small regressions relative to the original model on standard benchmarks:
- Perplexity (Wikitext-2): The Heretic Q4_K_M quantized version shows a perplexity of 4.7332, slightly higher than the original Q8_0 (4.6351).
- General Benchmarks: Scores on HellaSwag (82.50%), Winogrande (77.90%), ARC-Challenge (55.18%), and MMLU (43.93%) are comparable to or slightly below the original model's performance, with MMLU excluding certain moral and legal subjects.
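The Wikitext-2 figures above are standard perplexity, i.e. the exponential of the mean negative log-likelihood per token. A minimal sketch, assuming you already have per-token log-probabilities (in nats) from an evaluation harness:

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood per token).

    `token_log_probs` holds the model's natural-log probability
    assigned to each ground-truth token in the evaluation text.
    """
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)
```

For example, a model that assigns probability 1/4 to every token has a perplexity of exactly 4, so the 4.73 vs. 4.63 gap above reflects a modest increase in average per-token uncertainty (note the comparison is across different quantizations, Q4_K_M vs. Q8_0).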
Ideal Use Cases
This model is particularly suited for applications where:
- Unfiltered Responses are Required: Scenarios demanding direct, uncensored output without built-in refusal mechanisms.
- Specific Content Generation: Tasks where the model's inherent refusal to generate certain content is a hindrance.
- Research into Model Alignment: Exploring the effects of ablation and decensoring techniques on LLM behavior.