grisun0/Qwen2.5-0.5B-Instruct-heretic
The grisun0/Qwen2.5-0.5B-Instruct-heretic is a 0.5 billion parameter instruction-tuned causal language model, based on the Qwen2.5 architecture, with a 32,768 token context length. This model is a 'decensored' version of the original Qwen/Qwen2.5-0.5B-Instruct, created using the Heretic v1.1.0 tool. It significantly reduces refusals compared to its base model, making it suitable for applications requiring less restrictive content generation. The model retains Qwen2.5's enhanced capabilities in coding, mathematics, instruction following, and structured data understanding.
Loading preview...
Model Overview
The grisun0/Qwen2.5-0.5B-Instruct-heretic is a 0.5 billion parameter instruction-tuned causal language model, derived from the Qwen2.5 architecture. Its primary distinction is being a "decensored" version of the original Qwen/Qwen2.5-0.5B-Instruct, achieved through the Heretic v1.1.0 tool. This modification significantly reduces the model's refusal rate, from 93/100 in the original to 16/100 in this version, as measured by KL divergence.
Key Capabilities
- Reduced Refusals: Engineered to provide less restrictive content generation compared to its base model.
- Enhanced Knowledge & Reasoning: Benefits from the Qwen2.5 improvements in coding and mathematics, leveraging specialized expert models.
- Improved Instruction Following: Demonstrates better adherence to instructions and understanding of diverse system prompts.
- Structured Data & Output: Excels at understanding structured data like tables and generating structured outputs, particularly JSON.
- Long Context Support: Supports a full context length of 32,768 tokens and can generate up to 8,192 tokens.
- Multilingual Support: Capable of processing and generating text in over 29 languages, including major global languages.
Use Cases
This model is particularly well-suited for applications where a less constrained response generation is desired, while still benefiting from the robust capabilities of the Qwen2.5 architecture in areas like coding, mathematics, and structured data processing. Its instruction-following and long-context abilities make it versatile for various conversational and text generation tasks.