llmfan46/Magistral-Small-2509-ultra-uncensored-heretic-v1

Vision | Concurrency Cost: 2 | Model Size: 24B | Quant: FP8 | Ctx Length: 32k | Published: Mar 17, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Warm

llmfan46/Magistral-Small-2509-ultra-uncensored-heretic-v1 is a decensored version of Mistral AI's Magistral-Small-2509, a 24 billion parameter model built on Mistral Small 3.2 with added reasoning capabilities and multimodality. This version, created using Heretic v1.2.0 with the Arbitrary-Rank Ablation (ARA) method, reduces refusals from 92/100 for the original model to 5/100 while staying close to the original model's behavior (0.0274 KL divergence). It supports reasoning, multilingual use, and vision-based tasks, and offers a 128k context window.


Magistral-Small-2509-ultra-uncensored-heretic-v1 Overview

This model is a decensored variant of mistralai/Magistral-Small-2509, developed by llmfan46 using the Heretic v1.2.0 tool with the Arbitrary-Rank Ablation (ARA) method. The primary goal of this modification is to drastically reduce content refusals while maintaining the original model's performance and quality.

Key Differentiators & Performance

  • Decensored Output: Refuses 5/100 prompts, compared with 92/100 for the original model, making it suitable for use cases requiring less restrictive content generation.
  • Quality Preservation: Shows a KL divergence of 0.0274 from the original model, a low value indicating that the base model's coherence, reasoning ability, and overall quality are largely preserved despite decensoring.
  • Reasoning Capabilities: Inherits the reasoning capabilities of the base model, Magistral Small 1.2, which is built upon Mistral Small 3.2 (2506) and trained with SFT on Magistral Medium traces followed by RL.
  • Multimodality: Supports vision inputs, allowing it to analyze images and reason based on visual content in addition to text.
  • Multilingual Support: Capable of handling dozens of languages, including English, French, German, Japanese, Chinese, and many others.
  • Context Window: Features a 128k context window, with good performance expected up to 40k tokens.
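The refusal and KL divergence figures above can be read as a simple comparison between two output distributions. The following is a minimal sketch of that idea, not Heretic's actual evaluation code: the logits are made-up illustrative values, and the function names `softmax` and `kl_divergence` are hypothetical helpers introduced here.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) in nats; eps guards against log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

# Illustrative next-token logits only (assumed values, not measured data).
base_logits = [2.0, 1.0, 0.1]      # original model
modified_logits = [1.9, 1.1, 0.1]  # decensored model

p = softmax(base_logits)
q = softmax(modified_logits)
kl = kl_divergence(p, q)

# Refusal rates as reported in the model description.
original_refusal_rate = 92 / 100
decensored_refusal_rate = 5 / 100

print(f"KL divergence: {kl:.4f}")
print(f"Refusal rate: {original_refusal_rate:.0%} -> {decensored_refusal_rate:.0%}")
```

A value near zero means the modified model assigns almost the same probabilities as the original, which is why a small KL divergence is taken as a sign that quality is preserved.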

Usage & Features

  • Thinking and Instruct Mode: Users can enable a thinking process (inside [THINK]...[/THINK] tags) by using the provided SYSTEM_PROMPT.txt, or disable it for direct responses with other system prompts.
  • Deployment: The base Magistral Small is a 24B parameter model designed to be deployable locally, fitting within a single RTX 4090 or a 32GB RAM MacBook once quantized. GGUF quantizations are also available.
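When thinking mode is enabled, the reasoning trace and the final answer arrive in one response, so a client usually needs to separate them. Below is a minimal sketch of one way to do that, assuming the [THINK] and [/THINK] tags appear as literal text in the decoded output (some serving stacks may strip them or return reasoning in a separate field). The helper `split_reasoning` and the sample string are hypothetical.

```python
import re

# Matches one [THINK]...[/THINK] span, including newlines inside it.
THINK_PATTERN = re.compile(r"\[THINK\](.*?)\[/THINK\]", re.DOTALL)

def split_reasoning(text):
    """Return (reasoning, answer) from a raw model response.

    reasoning: all [THINK] spans joined by newlines ("" if none found)
    answer:    the response with the [THINK] spans removed
    """
    traces = [span.strip() for span in THINK_PATTERN.findall(text)]
    answer = THINK_PATTERN.sub("", text).strip()
    return "\n".join(traces), answer

# Hypothetical response text for illustration.
sample = "[THINK]2 + 2 is 4.[/THINK]The answer is 4."
reasoning, answer = split_reasoning(sample)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

A response produced without the thinking system prompt contains no tags, in which case the function returns an empty reasoning string and the full text as the answer.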

When to Use This Model

  • Uncensored Applications: Ideal for applications where content restrictions are undesirable or need to be minimized.
  • Complex Reasoning Tasks: Suitable for tasks requiring long chains of reasoning, especially when combined with the thinking mode.
  • Multimodal AI: Effective for scenarios involving both text and image analysis.
  • Multilingual Applications: Can be used for tasks across a wide range of languages.