Qwen2.5-1.5B-VibeThinker-heretic-uncensored-abliterated Overview
This model is a 1.5 billion parameter variant of the Qwen2.5 architecture, specifically modified by DavidAU using the "Heretic" v1.0.1 method to remove censorship. The abliteration process successfully reduced the refusal rate from an original 61/100 to just 5/100, while maintaining a very low KL divergence of 0.01, ensuring the model's core functionality remains intact.
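KL divergence here measures how far the modified model's next-token distribution drifts from the original's; a value near 0 means the two models behave almost identically on ordinary prompts. A minimal sketch of the computation on toy distributions (the probability values below are illustrative only, not taken from the actual evaluation):

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) = sum_i p_i * log(p_i / q_i), measured in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy next-token probability distributions for an original vs. a
# modified model (illustrative values only):
original = [0.70, 0.20, 0.10]
modified = [0.68, 0.21, 0.11]

print(kl_divergence(original, modified))  # a small value close to 0
```

Identical distributions give a divergence of exactly 0; the reported 0.01 indicates the abliterated model stays very close to the base model's behavior.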
Key Characteristics
- Uncensored Output: Designed to generate content without refusals, including potentially graphic, explicit, or otherwise restricted material.
- High Fidelity: A KL divergence of 0.01 indicates that the model's original performance and "root state" are preserved, preventing "brain damage" often associated with de-censoring methods.
- Context Length: Supports a context length of 128k tokens, allowing for extensive conversational or generative tasks.
- Directed Content Generation: While uncensored, the model may require explicit directives (e.g., using specific slang or descriptive terms) to produce content at the desired level of graphic detail or explicitness, especially when compared to models trained on uncensored data.
Usage Recommendations
- Adjust Smoothing Factor: For smoother operation in chat or roleplay, set Smoothing_factor to 1.5 in interfaces such as KoboldCpp, oobabooga/text-generation-webui, or SillyTavern.
- Repetition Penalty: If not using the smoothing factor, consider raising the repetition penalty to 1.1-1.15.
- Optimal Settings: Refer to the provided documentation by DavidAU for advanced settings, samplers, and parameters to maximize performance across various use cases, including chat and roleplay.
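For API-based use, these samplers can be passed directly in the request body. A minimal sketch of a KoboldCpp `/api/v1/generate` payload applying the settings above; the `smoothing_factor` and `rep_pen` field names are assumptions based on recent KoboldCpp versions, so verify them against your build's documentation:

```python
import json

# Hypothetical request body for KoboldCpp's /api/v1/generate endpoint.
payload = {
    "prompt": "User: Hello!\nAssistant:",
    "max_length": 256,
    "temperature": 0.8,       # example value, not from the model card
    "smoothing_factor": 1.5,  # recommended above for chat/roleplay
    "rep_pen": 1.0,           # keep neutral while smoothing is active
    # If you disable smoothing_factor, raise rep_pen to 1.1-1.15 instead.
}

print(json.dumps(payload, indent=2))
# To send: requests.post("http://localhost:5001/api/v1/generate", json=payload)
```

The same two knobs exist under different names in oobabooga/text-generation-webui and SillyTavern; consult each interface's sampler settings panel for the equivalent fields.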