p-e-w/Qwen3-0.6B-heretic
p-e-w/Qwen3-0.6B-heretic is a 0.8 billion parameter causal language model, derived from Qwen/Qwen3-0.6B and decensored using Heretic v1.4.0. This model retains the Qwen3 architecture, featuring a 32,768 token context length and unique capabilities for switching between 'thinking' and 'non-thinking' modes to optimize performance for complex reasoning or efficient general dialogue. It is specifically designed to reduce refusals compared to its original counterpart, making it suitable for applications requiring less restrictive content generation.
Loading preview...
Model Overview
p-e-w/Qwen3-0.6B-heretic is a 0.8 billion parameter causal language model, built upon the Qwen3-0.6B architecture and decensored using Heretic v1.4.0. This modification significantly reduces content refusals, with the Heretic version showing 6 refusals out of 100 compared to 54/100 in the original Qwen3-0.6B, while maintaining a low KL divergence of 0.0031.
Key Capabilities
- Decensored Output: Offers less restrictive content generation compared to the base Qwen3 model.
- Dual-Mode Operation: Supports seamless switching between a 'thinking mode' for complex logical reasoning, math, and coding, and a 'non-thinking mode' for efficient, general-purpose dialogue. This is controlled via
enable_thinkingparameter or/thinkand/no_thinktags in prompts. - Enhanced Reasoning: In its thinking mode, it provides strong capabilities in mathematics, code generation, and commonsense logical reasoning.
- Agent Capabilities: Excels in tool calling and integration with external tools, particularly when used with Qwen-Agent.
- Multilingual Support: Capable of handling over 100 languages and dialects with strong multilingual instruction following and translation.
- Extended Context Length: Features a substantial 32,768 token context window.
When to Use This Model
This model is ideal for developers seeking a small, efficient language model (0.8B parameters) that offers greater flexibility in content generation due to its decensored nature. Its dual-mode functionality makes it versatile for applications requiring both deep reasoning and quick, general responses. It is particularly well-suited for agentic tasks and multilingual applications where a less restrictive output is desired.