Model Overview
This model, DavidAU/Qwen3-4B-2507-Thinking-heretic-abliterated-uncensored, is a 4 billion parameter Qwen3-based language model with a 256k context length. Its primary distinction is the application of the Heretic v1.0.1 method to significantly reduce content refusals. The process achieved a refusal rate of 4/100, down from the original model's 97/100, while maintaining a low KL divergence of 0.05 from the base model, indicating minimal damage to its core capabilities.
Key Capabilities
- De-censored Output: Designed to generate content without refusals, offering greater freedom in text generation.
- High Context Length: Supports a 256k token context, allowing for processing and generating longer sequences of text.
- Preserved Quality: A low KL divergence of 0.05 ensures the model's default state and performance are largely intact post-de-censoring.
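To make the quality claim above concrete, here is a minimal sketch of what a KL divergence between next-token distributions measures: how far the modified model's output probabilities drift from the original's, with values near 0 meaning near-identical behavior. The tiny 4-token vocabulary and the probability values are illustrative, not taken from the actual model.

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) = sum_i p_i * log(p_i / q_i), in nats.

    Quantifies how much distribution Q has drifted from P;
    0 means the two distributions are identical.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical next-token distributions over a toy 4-token vocabulary:
original = [0.70, 0.20, 0.05, 0.05]   # base model
modified = [0.66, 0.23, 0.06, 0.05]   # after the de-censoring pass

drift = kl_divergence(original, modified)
print(f"KL divergence: {drift:.4f} nats")  # a small value: behavior largely preserved
```

A reported KL divergence of 0.05, averaged over a probe set, sits in this "small drift" regime, which is why the card can claim the model's default behavior is largely intact.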
Usage Considerations
While the model will not refuse requests, it may need explicit direction, specific slang, or graphic terminology to reach the desired level of explicitness, especially for X-rated or highly descriptive content. This differs from models fine-tuned on uncensored data, which tend to produce such content more readily by default.
Recommended Settings
For optimal performance, particularly in chat or roleplay scenarios, set "Smoothing_factor" to 1.5 in interfaces such as KoboldCpp, oobabooga/text-generation-webui, or SillyTavern. Advanced settings and parameter guides for maximizing performance across various use cases are available in the linked documentation.
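The recommended setting above can also be applied when driving the model through KoboldCpp's local HTTP API rather than its UI. The sketch below builds a generation payload with the smoothing factor set to 1.5; the parameter name "smoothing_factor", the endpoint URL, and the temperature value are assumptions based on common KoboldCpp API conventions, so verify them against your KoboldCpp version before relying on them.

```python
import json

def build_payload(prompt, smoothing_factor=1.5, max_length=512):
    """Assemble a JSON payload for a KoboldCpp-style generate endpoint.

    "smoothing_factor" is assumed to be the API-side name of the
    "Smoothing_factor" setting mentioned above; 1.5 is the value
    recommended in this card.
    """
    return {
        "prompt": prompt,
        "max_length": max_length,
        "smoothing_factor": smoothing_factor,  # recommended: 1.5
        "temperature": 0.8,                    # illustrative default, not from this card
    }

payload = build_payload("Write the opening scene of a heist story.")
print(json.dumps(payload, indent=2))
# Typically sent with something like:
#   requests.post("http://localhost:5001/api/v1/generate", json=payload)
```

If you use the KoboldCpp UI, oobabooga/text-generation-webui, or SillyTavern instead, the equivalent slider or field lives in each interface's sampler settings panel.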