## Model Overview

This model, megabytes/Qwen2.5-0.5B-Instruct-heretic, is a decensored version of Qwen/Qwen2.5-0.5B-Instruct, created with the Heretic v1.2.0 tool. It is a 0.49B-parameter instruction-tuned causal language model built on the Qwen2.5 architecture, with a 32,768-token context length.
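The model can be loaded and queried like any other Qwen2.5 instruct checkpoint via the Hugging Face transformers library. A minimal sketch (the prompt and generation settings below are illustrative, not taken from this card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "megabytes/Qwen2.5-0.5B-Instruct-heretic"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Qwen2.5 models use a chat template; build the prompt from messages.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Write a haiku about entropy."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
text = tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
print(text)
```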
## Key Differentiators & Capabilities
- Decensored Output: Refuses only 3/100 test prompts, compared to the original model's 91/100, i.e., substantially less content filtering.
- Enhanced Core Qwen2.5 Features: Inherits improvements from the Qwen2.5 series, including:
  - Increased knowledge and improved capabilities in coding and mathematics.
  - Significant advancements in instruction following and in generating long texts (up to 8K tokens).
  - Better understanding of structured data (e.g., tables) and generation of structured outputs such as JSON.
  - Greater resilience to diverse system prompts, improving role-play and condition-setting for chatbots.
- Multilingual Support: Supports over 29 languages, including Chinese, English, French, Spanish, and more.
## Abliteration Parameters

The decensoring process used specific abliteration parameters, including a direction_index of 15.86 and per-component ablation weights applied to the attn.o_proj and mlp.down_proj matrices; together these account for the model's altered refusal behavior.
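Abliteration of this kind can be illustrated with a toy matrix: the component of each weight matrix's output along a "refusal direction" is subtracted, scaled by an ablation weight. A minimal NumPy sketch, where the direction, matrix, and weight are random stand-ins rather than the actual extracted values:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8

# Hypothetical unit-norm refusal direction (in practice derived from
# activation differences between refused and answered prompts).
v = rng.standard_normal(d)
v /= np.linalg.norm(v)

W = rng.standard_normal((d, d))  # stand-in for an o_proj / down_proj weight
alpha = 1.0                      # ablation weight, tuned per component

# Subtract the projection of W's output onto v: W' = W - alpha * v (v^T W)
W_abl = W - alpha * np.outer(v, v @ W)

# The ablated matrix can no longer write to the v direction:
print(np.abs(v @ W_abl).max())  # ≈ 0
```

With alpha = 1 the removal is exact; smaller weights attenuate rather than eliminate the direction, which is why per-layer tuning matters.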
## Use Cases

This model is suited to applications where less restrictive, more direct responses are desired, while retaining the Qwen2.5 base model's strengths in coding, mathematics, and instruction following.